Get count of non-zero values per row in a Pandas DataFrame

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/46757603/

Date: 2020-09-14 04:38:27  Source: igfitidea

python, pandas

Asked by Coding hierarchy

I know this is a simple question, but I'm very new to Pandas. For each row, I want to check whether any of the cells in the columns is greater than or less than 0.00.

              GOOG    AAPL     XOM     IBM       Value
2011-01-10     0.0     0.0     0.0     0.0       0.00
2011-01-13     0.0 -1500.0     0.0  4000.0  -61900.00

I know that pandas has a built-in iterrows method. However, the following piece of code raises an error:

for index, row in dataFrame.iterrows():
    for i in range(0, len(of_columns)):
        print dataFrame[index][i]

Error

return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\index.pyx", line 132, in pandas.index.IndexEngine.get_loc (pandas\index.c:4433)
File "pandas\index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas\index.c:4279)
File "pandas\src\hashtable_class_helper.pxi", line 732, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:13742)
File "pandas\src\hashtable_class_helper.pxi", line 740, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:13696)
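
For context, here is a minimal sketch of why this lookup probably fails. It assumes the frame from the question with string row labels, since the original index type is not shown. dataFrame[index] looks up index among the column labels, so a row label is not found there. The traceback is truncated, so the KeyError below is an assumption about its missing final line. The names df, lookup_failed and nonzero_counts are made up for this example.

```python
import pandas as pd

# Sample data copied from the question; string row labels are an assumption.
df = pd.DataFrame(
    {
        "GOOG": [0.0, 0.0],
        "AAPL": [0.0, -1500.0],
        "XOM": [0.0, 0.0],
        "IBM": [0.0, 4000.0],
        "Value": [0.0, -61900.0],
    },
    index=["2011-01-10", "2011-01-13"],
)

# df[label] selects a column, so a row label is not found among the columns.
try:
    df["2011-01-10"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# iterrows() already yields each row as a Series, so use `row` directly
# instead of indexing back into the frame.
nonzero_counts = {}
for index, row in df.iterrows():
    nonzero_counts[index] = int((row != 0).sum())

print(lookup_failed)
print(nonzero_counts)
```

Looping with iterrows works for small frames, but it handles one row at a time in Python, so a vectorized comparison is usually preferable.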

Intended action: if a cell contains 0, do nothing (continue). If a cell contains a non-zero value, count it, so that the result is the number of non-zero values per row.

Answered by jezrael

First compare with gt (>), lt (<), or le, ge, ne, eq, and then sum the Trues, which are processed like 1:

Bad -> each comparison also checks the count columns added by the previous lines:

df['> zero'] = df.gt(0).sum(axis=1)
df['< zero'] = df.lt(0).sum(axis=1)
df['== zero'] = df.eq(0).sum(axis=1)
print (df)
            GOOG    AAPL  XOM     IBM    Value  > zero  < zero  == zero
2011-01-10   0.0     0.0  0.0     0.0      0.0       0       0        7
2011-01-13   0.0 -1500.0  0.0  4000.0 -61900.0       1       2        2

Correct - select the columns to check before adding the count columns:

cols = df.columns
df['> zero'] = df[cols].gt(0).sum(axis=1)
df['< zero'] = df[cols].lt(0).sum(axis=1)
df['== zero'] = df[cols].eq(0).sum(axis=1)
print (df)
            GOOG    AAPL  XOM     IBM    Value  > zero  < zero  == zero
2011-01-10   0.0     0.0  0.0     0.0      0.0       0       0        5
2011-01-13   0.0 -1500.0  0.0  4000.0 -61900.0       1       2        2
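
Another way to avoid the pitfall, sketched here as an alternative rather than as part of the original answer, is to compute every count from the untouched frame first and attach them all afterwards. The sample frame, the string row labels, and the names counts and result are assumptions for this example.

```python
import pandas as pd

# Sample data copied from the question; string row labels are an assumption.
df = pd.DataFrame(
    {
        "GOOG": [0.0, 0.0],
        "AAPL": [0.0, -1500.0],
        "XOM": [0.0, 0.0],
        "IBM": [0.0, 4000.0],
        "Value": [0.0, -61900.0],
    },
    index=["2011-01-10", "2011-01-13"],
)

# All three counts are computed before df gains any new column,
# so none of them can count another count column by accident.
counts = pd.DataFrame(
    {
        "> zero": df.gt(0).sum(axis=1),
        "< zero": df.lt(0).sum(axis=1),
        "== zero": df.eq(0).sum(axis=1),
    }
)

# Attach the counts to a new frame, aligned on the row index.
result = df.join(counts)
print(result)
```

This leaves df itself unchanged, which may or may not be what you want. Assign result back to df if the columns should be added to the original frame.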

Detail:

print (df.gt(0))
             GOOG   AAPL    XOM    IBM  Value
2011-01-10  False  False  False  False  False
2011-01-13  False  False  False   True  False
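
The same pattern gives the count asked for in the question title: compare with ne(0) to get a boolean frame, then sum across each row, where every True counts as 1. This is a minimal sketch on the sample data; the string row labels and the names mask and nonzero_per_row are assumptions for this example.

```python
import pandas as pd

# Sample data copied from the question; string row labels are an assumption.
df = pd.DataFrame(
    {
        "GOOG": [0.0, 0.0],
        "AAPL": [0.0, -1500.0],
        "XOM": [0.0, 0.0],
        "IBM": [0.0, 4000.0],
        "Value": [0.0, -61900.0],
    },
    index=["2011-01-10", "2011-01-13"],
)

# True wherever a cell is not equal to zero.
mask = df.ne(0)

# Summing booleans along axis=1 counts the True cells in each row.
nonzero_per_row = mask.sum(axis=1)

print(mask)
print(nonzero_per_row)
```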

EDIT:

To exclude some columns from cols, use difference:

cols = df.columns.difference(['Value'])
print (cols)
Index(['AAPL', 'GOOG', 'IBM', 'XOM'], dtype='object')

df['> zero'] = df[cols].gt(0).sum(axis=1)
df['< zero'] = df[cols].lt(0).sum(axis=1)
df['== zero'] = df[cols].eq(0).sum(axis=1)
print (df)
            GOOG    AAPL  XOM     IBM    Value  > zero  < zero  == zero
2011-01-10   0.0     0.0  0.0     0.0      0.0       0       0        4
2011-01-13   0.0 -1500.0  0.0  4000.0 -61900.0       1       1        2
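
Note that difference returns the remaining labels in sorted order, as the printed Index shows. If the original column order matters, dropping the unwanted column is one alternative. This is a sketch rather than part of the original answer; the string row labels and the names cols_sorted, kept and nonzero are assumptions for this example.

```python
import pandas as pd

# Sample data copied from the question; string row labels are an assumption.
df = pd.DataFrame(
    {
        "GOOG": [0.0, 0.0],
        "AAPL": [0.0, -1500.0],
        "XOM": [0.0, 0.0],
        "IBM": [0.0, 4000.0],
        "Value": [0.0, -61900.0],
    },
    index=["2011-01-10", "2011-01-13"],
)

# Index.difference sorts the remaining labels by default.
cols_sorted = df.columns.difference(["Value"])

# drop(columns=...) returns a new frame and keeps the original column order.
kept = df.drop(columns=["Value"])

# Non-zero count per row over the ticker columns only.
nonzero = kept.ne(0).sum(axis=1)

print(list(cols_sorted))
print(list(kept.columns))
print(nonzero)
```

The counts are the same either way, because the row-wise sum does not depend on column order. The difference only shows if the selected columns are displayed or reused later.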