pandas: Level NaN must be same as name
Original question: http://stackoverflow.com/questions/49818031/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by Ian Dzindo
I am trying to count how many times NaN appears in a column of a dataframe using this code:
count = enron_df.loc['salary'].count('NaN')
But every time I run this, I get the following error:
KeyError: 'Level NaN must be same as name (None)'
I searched the web extensively for a solution, but to no avail.
Answered by jezrael
If the NaNs are missing values:
import numpy as np
import pandas as pd
enron_df = pd.DataFrame({'salary':[np.nan, np.nan, 1, 5, 7]})
print (enron_df)
salary
0 NaN
1 NaN
2 1.0
3 5.0
4 7.0
count = enron_df['salary'].isna().sum()
#alternative
#count = enron_df['salary'].isnull().sum()
print (count)
2
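As a minimal sketch of why this works: isna() yields a boolean Series, and summing it counts the True entries. The same call on the whole DataFrame gives one count per column. The 'bonus' column below is a hypothetical addition for illustration and is not part of the original question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the 'bonus' column is an assumption added for illustration.
enron_df = pd.DataFrame({
    'salary': [np.nan, np.nan, 1, 5, 7],
    'bonus': [np.nan, 2, 3, np.nan, np.nan],
})

# isna() returns a boolean Series; True counts as 1 when summed.
salary_missing = int(enron_df['salary'].isna().sum())

# Applied to the whole frame, the same idea gives one count per column.
missing_per_column = enron_df.isna().sum()

print(salary_missing)
print(missing_per_column.to_dict())
```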
If the NaNs are strings:
enron_df = pd.DataFrame({'salary':['NaN', 'NaN', 1, 5, 'NaN']})
print (enron_df)
salary
0 NaN
1 NaN
2 1
3 5
4 NaN
count = enron_df['salary'].eq('NaN').sum()
#alternative
#count = (enron_df['salary'] == 'NaN').sum()
print (count)
3
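If the string values are really placeholders for missing data, one possible clean-up (an assumption about the intent, not something the answer prescribes) is to coerce the column to numbers first, so the text 'NaN' becomes a real missing value that isna() can detect. A rough sketch:

```python
import pandas as pd

enron_df = pd.DataFrame({'salary': ['NaN', 'NaN', 1, 5, 'NaN']})

# Count the literal string 'NaN' directly, as in the answer above.
string_count = int(enron_df['salary'].eq('NaN').sum())

# Assumed clean-up step: coerce the column to numbers so that anything
# non-numeric (including the text 'NaN') becomes a real missing value.
salary_numeric = pd.to_numeric(enron_df['salary'], errors='coerce')
missing_count = int(salary_numeric.isna().sum())

print(string_count, missing_count)
```

Note that errors='coerce' also turns any other non-numeric text into a missing value, so the two counts only agree when 'NaN' is the only non-numeric entry.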
Answered by rafaelc
By definition, count omits NaNs and size does not.
Thus, a simple difference gives the number of NaNs:
count = enron_df['salary'].size - enron_df['salary'].count()
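A small sketch of the difference, under the assumption that the NaNs are real missing values. Series.count() does not take a value to search for; it only counts non-missing entries, which is likely why passing 'NaN' to it in the question was read as an index level name. The second Series shows that the trick does not apply to the string 'NaN'.

```python
import numpy as np
import pandas as pd

salary = pd.Series([np.nan, np.nan, 1, 5, 7], name='salary')

total = salary.size              # every entry, missing or not
non_missing = salary.count()     # only the non-missing entries
missing = int(total - non_missing)

# Strings are ordinary values, so count() does not skip them.
salary_as_text = pd.Series(['NaN', 'NaN', 1, 5, 'NaN'], name='salary')
missing_in_text = int(salary_as_text.size - salary_as_text.count())

print(missing, missing_in_text)
```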
Answered by zipa
Try it like this:
count = df.loc[df['salary']=='NaN'].shape[0]
Or maybe better:
count = df.loc[df['salary']=='NaN', 'salary'].size
And, going down your path, you'd need something like this:
count = df.loc[:, 'salary'].str.count('NaN').sum()
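One caveat worth illustrating: str.count counts pattern occurrences inside each string, while == matches whole values, so the two can disagree. The sketch below uses a hypothetical column with a value that contains 'NaN' twice.

```python
import pandas as pd

# Hypothetical data chosen to show where the two approaches differ.
df = pd.DataFrame({'salary': ['NaN', 'NaN NaN', 'x']})

# Whole-value comparison: only the first row is exactly 'NaN'.
exact_matches = int((df['salary'] == 'NaN').sum())

# Pattern counting: 1 + 2 + 0 occurrences across the three rows.
occurrences = int(df.loc[:, 'salary'].str.count('NaN').sum())

print(exact_matches, occurrences)
```

For counting rows whose value is exactly 'NaN', the comparison-based versions are the safer choice.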
Answered by ALollz
There's also value_counts with the dropna argument:
import numpy as np
import pandas as pd
enron_df = pd.DataFrame({'salary':[np.nan, np.nan, 1, 5, 7]})
enron_df.salary.value_counts(dropna=False)
#NaN 2
# 7.0 1
# 5.0 1
# 1.0 1
#Name: salary, dtype: int64
And if you just want the number, select np.nan from the value counts. (If they are the strings 'NaN', replace np.nan with 'NaN'.)
enron_df.salary.value_counts(dropna=False)[np.nan]
#2
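Looking up the missing-value label directly raises a KeyError when the column has no missing values at all. A hedged variant, which is a sketch and not part of the original answer, filters the value counts by their index instead, so it returns 0 in that case. The helper name count_missing is hypothetical.

```python
import numpy as np
import pandas as pd

def count_missing(series):
    """Return how many missing values value_counts reports for the series."""
    counts = series.value_counts(dropna=False)
    # Keep only the rows whose label is missing; the sum of nothing is 0.
    return int(counts[counts.index.isna()].sum())

with_missing = pd.Series([np.nan, np.nan, 1, 5, 7], name='salary')
without_missing = pd.Series([1.0, 5.0, 7.0], name='salary')

print(count_missing(with_missing))
print(count_missing(without_missing))
```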