Get a count of non-zero values per row in a Pandas DataFrame

Question

Asked by Coding hierarchy

I know this is a simple question, but I'm very new to Pandas. For each row, I want to compare the cells to see whether any of the values in the columns are greater than or less than 0.00.

              GOOG    AAPL     XOM     IBM       Value
2011-01-10     0.0     0.0     0.0     0.0       0.00
2011-01-13     0.0 -1500.0     0.0  4000.0  -61900.00

I know that pandas has iterrows built in. However, the following piece of code gives me an error:

for index, row in dataFrame.iterrows():
    for i in range(0, len(of_columns)):
        print dataFrame[index][i]

Error

    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas\index.pyx", line 132, in pandas.index.IndexEngine.get_loc (pandas\index.c:4433)
  File "pandas\index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas\index.c:4279)
  File "pandas\src\hashtable_class_helper.pxi", line 732, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:13742)
  File "pandas\src\hashtable_class_helper.pxi", line 740, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:13696)

Intended action: if a cell contains 0, do nothing (continue). If a cell contains anything other than zero, count it, so that I get the number of non-zero values per row.

Answer 1

Answered by jezrael

Compare first with gt (>), lt (<), or le, ge, ne, eq, and then sum the True values, which are processed like 1:

Bad -> this checks all previous columns, including the count columns that were just added:

df['> zero'] = df.gt(0).sum(axis=1)
df['< zero'] = df.lt(0).sum(axis=1)
df['== zero'] = df.eq(0).sum(axis=1)
print (df)
            GOOG    AAPL  XOM     IBM    Value  > zero  < zero  == zero
2011-01-10   0.0     0.0  0.0     0.0      0.0       0       0        7
2011-01-13   0.0 -1500.0  0.0  4000.0 -61900.0       1       2        2

Correct -> select the columns to check first:

cols = df.columns
df['> zero'] = df[cols].gt(0).sum(axis=1)
df['< zero'] = df[cols].lt(0).sum(axis=1)
df['== zero'] = df[cols].eq(0).sum(axis=1)
print (df)
            GOOG    AAPL  XOM     IBM    Value  > zero  < zero  == zero
2011-01-10   0.0     0.0  0.0     0.0      0.0       0       0        5
2011-01-13   0.0 -1500.0  0.0  4000.0 -61900.0       1       2        2
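
A minimal self-contained sketch of the column-selection approach above, assuming the sample frame from the question is rebuilt by hand (plain string dates are used as the index here for simplicity; that is an assumption, not the asker's actual setup). The point it illustrates is that cols is captured before any count column is added, so the later comparisons ignore the new columns.

```python
import pandas as pd

# Sample data copied from the question; the string index is an assumption.
df = pd.DataFrame(
    {
        "GOOG": [0.0, 0.0],
        "AAPL": [0.0, -1500.0],
        "XOM": [0.0, 0.0],
        "IBM": [0.0, 4000.0],
        "Value": [0.0, -61900.0],
    },
    index=["2011-01-10", "2011-01-13"],
)

# Capture the original columns before adding any count columns.
cols = df.columns

# Each comparison returns a boolean frame; summing along axis=1
# counts the True values in every row.
df["> zero"] = df[cols].gt(0).sum(axis=1)
df["< zero"] = df[cols].lt(0).sum(axis=1)
df["== zero"] = df[cols].eq(0).sum(axis=1)

print(df)
```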

Detail:

print (df.gt(0))
             GOOG   AAPL    XOM    IBM  Value
2011-01-10  False  False  False  False  False
2011-01-13  False  False  False   True  False
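
Because True behaves like 1 when summed, the same pattern can count non-zero cells directly, which is what the question asks for. A small sketch, assuming the same sample data rebuilt by hand; the names mask and nonzero_per_row are illustrative only:

```python
import pandas as pd

# Sample data copied from the question; the string index is an assumption.
df = pd.DataFrame(
    {
        "GOOG": [0.0, 0.0],
        "AAPL": [0.0, -1500.0],
        "XOM": [0.0, 0.0],
        "IBM": [0.0, 4000.0],
        "Value": [0.0, -61900.0],
    },
    index=["2011-01-10", "2011-01-13"],
)

# ne(0) is True wherever a cell differs from zero.
mask = df.ne(0)

# Summing the boolean mask row by row gives the non-zero count per row.
nonzero_per_row = mask.sum(axis=1)

print(nonzero_per_row)
```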

EDIT:

To exclude some columns from cols, use difference:

cols = df.columns.difference(['Value'])
print (cols)
Index(['AAPL', 'GOOG', 'IBM', 'XOM'], dtype='object')

df['> zero'] = df[cols].gt(0).sum(axis=1)
df['< zero'] = df[cols].lt(0).sum(axis=1)
df['== zero'] = df[cols].eq(0).sum(axis=1)
print (df)
            GOOG    AAPL  XOM     IBM    Value  > zero  < zero  == zero
2011-01-10   0.0     0.0  0.0     0.0      0.0       0       0        4
2011-01-13   0.0 -1500.0  0.0  4000.0 -61900.0       1       1        2
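
For reference, the loop in the question most likely fails because dataFrame[index] looks up a column label rather than a row label, so a date taken from the index is not found. A hedged sketch of that loop rewritten with positional access on each row, assuming the same sample data rebuilt by hand (the dictionary counts is an illustrative name). The vectorized comparisons above are generally the better choice; this only shows what the explicit loop would look like.

```python
import pandas as pd

# Sample data copied from the question; the string index is an assumption.
df = pd.DataFrame(
    {
        "GOOG": [0.0, 0.0],
        "AAPL": [0.0, -1500.0],
        "XOM": [0.0, 0.0],
        "IBM": [0.0, 4000.0],
        "Value": [0.0, -61900.0],
    },
    index=["2011-01-10", "2011-01-13"],
)

counts = {}
for index, row in df.iterrows():
    nonzero = 0
    for i in range(len(df.columns)):
        # row is a Series for the current row; iloc selects by position.
        if row.iloc[i] == 0:
            continue  # zero: do nothing
        nonzero += 1
    counts[index] = nonzero

print(counts)
```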
