Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/44717137/

Time: 2020-09-14 03:52:02  Source: igfitidea

Count number of columns with some values for each row in pandas

Tags: python, pandas, dataframe

Asked by jovicbg

I have a DataFrame like this, named data:

Site code    Col1  Col2  Col3
A5252        24    53     NaN
A5636        36    NaN    NaN
A4366        NaN   NaN    NaN
A7578        42    785    24

For each row, I want to count the number of columns that hold a value, that is, are not NaN. Desired output:

 Site code   Col1  Col2  Col3  Count
    A5252     24    53     NaN    2
    A5636     36    NaN    NaN    1
    A4366     NaN   NaN    NaN    0
    A7578     42    785    24     3

Something opposite to this: df = data.isnull().sum(axis=1)

Answered by jezrael

You need to change isnull to notnull:

# if the first column is not already the index, set it
data = data.set_index('Site code')
data['Count'] = data.notnull().sum(axis=1)
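
To make the snippet above easy to try, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the table in the question (the construction is an assumption; the original data presumably came from a file).

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question's table (an assumption)
data = pd.DataFrame({
    "Site code": ["A5252", "A5636", "A4366", "A7578"],
    "Col1": [24, 36, np.nan, 42],
    "Col2": [53, np.nan, np.nan, 785],
    "Col3": [np.nan, np.nan, np.nan, 24],
}).set_index("Site code")

# notnull() yields a boolean frame; summing along axis=1 counts the
# True cells (the non-NaN values) in each row
data["Count"] = data.notnull().sum(axis=1)
print(data)
```

Moving Site code into the index first matters here, because notnull() looks at every remaining column.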

Or use the DataFrame.count function:

data = data.set_index('Site code')
data['Count'] = data.count(axis=1)
print (data)
           Col1   Col2  Col3  Count
Site code                          
A5252      24.0   53.0   NaN      2
A5636      36.0    NaN   NaN      1
A4366       NaN    NaN   NaN      0
A7578      42.0  785.0  24.0      3
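
As a quick sanity check, the sketch below compares the approaches on the sample data: count(axis=1), notnull().sum(axis=1), and the column total minus the isnull() sum from the question. The variable names are illustrative, not from the original answer.

```python
import numpy as np
import pandas as pd

# Value columns only, with "Site code" as the index (sample data from the question)
values = pd.DataFrame(
    {
        "Col1": [24, 36, np.nan, 42],
        "Col2": [53, np.nan, np.nan, 785],
        "Col3": [np.nan, np.nan, np.nan, 24],
    },
    index=pd.Index(["A5252", "A5636", "A4366", "A7578"], name="Site code"),
)

via_count = values.count(axis=1)                 # non-NaN cells per row
via_notnull = values.notnull().sum(axis=1)       # same idea, spelled out
via_isnull = values.shape[1] - values.isnull().sum(axis=1)  # total minus NaN count

# All three should agree row by row
print((via_count == via_notnull).all(), (via_count == via_isnull).all())
```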

Another solution selects the value columns with loc (here Site code is a column, not the index):

print (data.loc[:, 'Col1':])
   Col1   Col2  Col3
0  24.0   53.0   NaN
1  36.0    NaN   NaN
2   NaN    NaN   NaN
3  42.0  785.0  24.0

data['Count'] = data.loc[:, 'Col1':].count(axis=1)
print (data)
  Site code  Col1   Col2  Col3  Count
0     A5252  24.0   53.0   NaN      2
1     A5636  36.0    NaN   NaN      1
2     A4366   NaN    NaN   NaN      0
3     A7578  42.0  785.0  24.0      3
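
One thing to watch with the open-ended slice 'Col1': is that it takes every column from Col1 to the end, so it would also pick up Count once that column exists. The sketch below, using the sample data from the question, shows a bounded slice and an explicit column list as possible alternatives; value_cols and the other variable names are illustrative.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    "Site code": ["A5252", "A5636", "A4366", "A7578"],
    "Col1": [24, 36, np.nan, 42],
    "Col2": [53, np.nan, np.nan, 785],
    "Col3": [np.nan, np.nan, np.nan, 24],
})

# Label slices with loc include both endpoints
by_slice = data.loc[:, "Col1":].count(axis=1)          # Col1 through the last column
by_bounded = data.loc[:, "Col1":"Col3"].count(axis=1)  # Col1 through Col3 only

# An explicit list does not depend on column order
value_cols = ["Col1", "Col2", "Col3"]
by_list = data[value_cols].count(axis=1)

data["Count"] = by_bounded

# Re-running the open-ended slice now also counts the Count column itself
rerun_open = data.loc[:, "Col1":].count(axis=1)
print(data)
print(rerun_open)
```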

Another nice idea, from Jon Clements, is to use filter:

data['Count'] = data.filter(regex="^Col").count(axis=1)
print (data)

  Site code  Col1   Col2  Col3  Count
0     A5252  24.0   53.0   NaN      2
1     A5636  36.0    NaN   NaN      1
2     A4366   NaN    NaN   NaN      0
3     A7578  42.0  785.0  24.0      3
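
The pattern "^Col" matches any column whose name starts with Col. As a hedged illustration, the sketch below adds a hypothetical Color column (not in the original data) to show how a tighter pattern can avoid picking up unrelated columns.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    "Site code": ["A5252", "A5636", "A4366", "A7578"],
    "Col1": [24, 36, np.nan, 42],
    "Col2": [53, np.nan, np.nan, 785],
    "Col3": [np.nan, np.nan, np.nan, 24],
    # Hypothetical extra column, added only to illustrate the regex
    "Color": ["red", "blue", "green", "black"],
})

# "^Col" also matches "Color", so every row gets one extra count
loose = data.filter(regex="^Col").count(axis=1)

# Requiring digits after "Col" restricts the match to Col1, Col2, Col3
strict = data.filter(regex=r"^Col\d+$").count(axis=1)

data["Count"] = strict
print(data)
```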

Answered by void

Simply use notnull():

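import pandas as pd
# Read "Site code" as the index so it is not counted as a value column
df = pd.read_csv("your_csv.csv", index_col="Site code")

# Count the non-NaN cells in each row
df['count'] = df.notnull().sum(axis=1)

print(df)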

Also, to add a column to a DataFrame, just use:

df['new_column_name'] = newcolumn

Output:

Site code   Col1  Col2   Col3  count
    A5252     24    53     NaN    2
    A5636     36    NaN    NaN    1
    A4366     NaN   NaN    NaN    0
    A7578     42    785    24     3
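
One caveat worth checking: notnull() looks at every column, so if Site code is an ordinary column rather than the index, it is counted too. The sketch below reads the sample data from an in-memory string (an assumption standing in for your_csv.csv) and compares the two cases.

```python
import io

import pandas as pd

# In-memory stand-in for the CSV file; empty fields are parsed as NaN
csv_text = (
    "Site code,Col1,Col2,Col3\n"
    "A5252,24,53,\n"
    "A5636,36,,\n"
    "A4366,,,\n"
    "A7578,42,785,24\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# Counting over all columns includes the always-present "Site code"
all_cols = df.notnull().sum(axis=1)

# Dropping the identifier column first counts only the value columns
value_only = df.drop(columns="Site code").notnull().sum(axis=1)

df["count"] = value_only
print(df)
```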