Original source: http://stackoverflow.com/questions/46757603/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Get count of non zero values per row in Pandas DataFrame
Asked by Coding hierarchy
I know this is a simple question, but I'm very new to Pandas. For each row, I want to check whether any of the cells in the columns are greater than or less than 0.00.
            GOOG    AAPL  XOM     IBM     Value
2011-01-10   0.0     0.0  0.0     0.0      0.00
2011-01-13   0.0 -1500.0  0.0  4000.0 -61900.00
I know that pandas has a built-in iterrows method. However, the following piece of code gives me an error:
for index, row in dataFrame.iterrows():
    for i in range(0, len(of_columns)):
        # dataFrame[index] looks up a column labelled `index`, not a row
        print(dataFrame[index][i])
Error
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas\index.pyx", line 132, in pandas.index.IndexEngine.get_loc (pandas\index.c:4433)
  File "pandas\index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas\index.c:4279)
  File "pandas\src\hashtable_class_helper.pxi", line 732, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:13742)
  File "pandas\src\hashtable_class_helper.pxi", line 740, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:13696)
Intended action: If a cell contains 0, do nothing (continue). If a cell contains a value other than zero, count it, so that each row ends up with its number of non-zero values.
Answered by jezrael
Compare with gt (>), lt (<), or le, ge, ne, eq first, and then sum the True values, which are treated like 1:
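Since the question asks for the number of non-zero values per row, here is a minimal sketch using ne. The frame is rebuilt by hand from the sample in the question, and the variable name non_zero is only chosen for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question, with the dates as the index
df = pd.DataFrame(
    {
        "GOOG": [0.0, 0.0],
        "AAPL": [0.0, -1500.0],
        "XOM": [0.0, 0.0],
        "IBM": [0.0, 4000.0],
        "Value": [0.0, -61900.0],
    },
    index=pd.to_datetime(["2011-01-10", "2011-01-13"]),
)

# ne(0) marks every non-zero cell as True; summing along axis=1
# counts the True values in each row, because True is treated as 1
non_zero = df.ne(0).sum(axis=1)
print(non_zero)
```

For this sample the counts are 0 for the first row and 3 for the second (AAPL, IBM and Value). Note that a NaN cell compares as not equal to 0, so it would be counted as non-zero unless handled separately.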
Bad: each comparison also checks the count columns added by the previous lines:
df['> zero'] = df.gt(0).sum(axis=1)
df['< zero'] = df.lt(0).sum(axis=1)
df['== zero'] = df.eq(0).sum(axis=1)
print (df)
            GOOG    AAPL  XOM     IBM    Value  > zero  < zero  == zero
2011-01-10   0.0     0.0  0.0     0.0      0.0       0       0        7
2011-01-13   0.0 -1500.0  0.0  4000.0 -61900.0       1       2        2
Correct: select the columns to check before adding new ones:
cols = df.columns
df['> zero'] = df[cols].gt(0).sum(axis=1)
df['< zero'] = df[cols].lt(0).sum(axis=1)
df['== zero'] = df[cols].eq(0).sum(axis=1)
print (df)
            GOOG    AAPL  XOM     IBM    Value  > zero  < zero  == zero
2011-01-10   0.0     0.0  0.0     0.0      0.0       0       0        5
2011-01-13   0.0 -1500.0  0.0  4000.0 -61900.0       1       2        2
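The difference between the two variants can be checked side by side. The sketch below rebuilds the sample frame by hand and uses two copies, named bad and good purely for illustration. It relies on the column labels captured in cols staying unchanged when new columns are assigned to the frame later.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame(
    {
        "GOOG": [0.0, 0.0],
        "AAPL": [0.0, -1500.0],
        "XOM": [0.0, 0.0],
        "IBM": [0.0, 4000.0],
        "Value": [0.0, -61900.0],
    },
    index=pd.to_datetime(["2011-01-10", "2011-01-13"]),
)

# Capture the original column labels before any count column exists
cols = df.columns

# Variant 1: compare the whole frame each time, so every new
# count column is included in the next comparison
bad = df.copy()
bad["> zero"] = bad.gt(0).sum(axis=1)
bad["< zero"] = bad.lt(0).sum(axis=1)
bad["== zero"] = bad.eq(0).sum(axis=1)

# Variant 2: restrict each comparison to the original columns
good = df.copy()
good["> zero"] = good[cols].gt(0).sum(axis=1)
good["< zero"] = good[cols].lt(0).sum(axis=1)
good["== zero"] = good[cols].eq(0).sum(axis=1)

print(bad["== zero"].tolist())   # [7, 2]
print(good["== zero"].tolist())  # [5, 2]
```

In the first row the unrestricted version reports 7 zeros because the two freshly added count columns are themselves 0 and get counted.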
Detail:
print (df.gt(0))
             GOOG   AAPL    XOM    IBM  Value
2011-01-10  False  False  False  False  False
2011-01-13  False  False  False   True  False
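To connect the boolean frame above with the loop attempted in the question, here is a sketch of a row-by-row count that does run, compared against the summed mask. It is not part of the original answer. The name loop_counts is hypothetical, and the loop iterates over each row Series that iterrows yields rather than indexing the frame with the row label.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame(
    {
        "GOOG": [0.0, 0.0],
        "AAPL": [0.0, -1500.0],
        "XOM": [0.0, 0.0],
        "IBM": [0.0, 4000.0],
        "Value": [0.0, -61900.0],
    },
    index=pd.to_datetime(["2011-01-10", "2011-01-13"]),
)

# Explicit loop: skip zeros, count everything else
loop_counts = []
for index, row in df.iterrows():
    count = 0
    for value in row:  # iterate over the cell values of this row
        if value == 0:
            continue
        count += 1
    loop_counts.append(count)

# Same result from the boolean mask, summed per row
mask_counts = df.ne(0).sum(axis=1).tolist()

print(loop_counts)
print(mask_counts)
```

Both approaches give the same counts here. The mask version avoids Python-level loops, which usually matters on larger frames.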
EDIT:
To exclude some columns from cols, use difference:
cols = df.columns.difference(['Value'])
print (cols)
Index(['AAPL', 'GOOG', 'IBM', 'XOM'], dtype='object')
df['> zero'] = df[cols].gt(0).sum(axis=1)
df['< zero'] = df[cols].lt(0).sum(axis=1)
df['== zero'] = df[cols].eq(0).sum(axis=1)
print (df)
            GOOG    AAPL  XOM     IBM    Value  > zero  < zero  == zero
2011-01-10   0.0     0.0  0.0     0.0      0.0       0       0        4
2011-01-13   0.0 -1500.0  0.0  4000.0 -61900.0       1       1        2
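As the printed Index shows, difference returns the remaining labels in sorted order. That does not affect row-wise sums, but if the original column order matters elsewhere, one alternative (not from the original answer) is to take the labels from drop(columns=...). A small sketch, with cols_sorted and cols_ordered as illustrative names:

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame(
    {
        "GOOG": [0.0, 0.0],
        "AAPL": [0.0, -1500.0],
        "XOM": [0.0, 0.0],
        "IBM": [0.0, 4000.0],
        "Value": [0.0, -61900.0],
    },
    index=pd.to_datetime(["2011-01-10", "2011-01-13"]),
)

# difference: remaining labels come back sorted
cols_sorted = df.columns.difference(["Value"])

# drop(columns=...): remaining labels keep their original order
cols_ordered = df.drop(columns=["Value"]).columns

# Row-wise counts do not depend on column order
counts_sorted = df[cols_sorted].ne(0).sum(axis=1)
counts_ordered = df[cols_ordered].ne(0).sum(axis=1)

print(list(cols_sorted))
print(list(cols_ordered))
print(counts_sorted.tolist(), counts_ordered.tolist())
```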