pandas: Level NaN must be same as name

Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original URL: http://stackoverflow.com/questions/49818031/

Date: 2020-09-14 05:27:58  Source: igfitidea

Level NaN must be same as name

python pandas dataframe count nan

Asked by Ian Dzindo

I am trying to count how many times NaN appears in a column of a dataframe using this code:

count = enron_df.loc['salary'].count('NaN')

But every time I run this, I get the following error:

KeyError: 'Level NaN must be same as name (None)'

I searched around the web a lot trying to find a solution, but to no avail.

Answered by jezrael

If NaNs are missing values:

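import numpy as np
import pandas as pd

enron_df = pd.DataFrame({'salary': [np.nan, np.nan, 1, 5, 7]})
print(enron_df)
#    salary
# 0     NaN
# 1     NaN
# 2     1.0
# 3     5.0
# 4     7.0

# isna() marks missing values as True; summing the booleans counts them
count = enron_df['salary'].isna().sum()
# alternative
# count = enron_df['salary'].isnull().sum()
print(count)
# 2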
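
The same idea extends to a whole frame. A minimal sketch, assuming a hypothetical second column `bonus` that is not part of the original question: calling `isna()` on the DataFrame and summing gives one missing-value count per column.

```python
import numpy as np
import pandas as pd

# 'bonus' is a made-up column, added for illustration only
df = pd.DataFrame({
    'salary': [np.nan, np.nan, 1, 5, 7],
    'bonus': [10, np.nan, 30, 40, 50],
})

# DataFrame.isna() returns a boolean frame; sum() adds up the True values per column
nan_per_column = df.isna().sum()
print(nan_per_column)
```

Here `nan_per_column` is a Series indexed by column name, so `nan_per_column['salary']` gives the count for a single column.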

If NaNs are strings:

import pandas as pd

enron_df = pd.DataFrame({'salary': ['NaN', 'NaN', 1, 5, 'NaN']})
print(enron_df)
#   salary
# 0    NaN
# 1    NaN
# 2      1
# 3      5
# 4    NaN

# eq('NaN') compares each element with the string 'NaN'
count = enron_df['salary'].eq('NaN').sum()
# alternative
# count = (enron_df['salary'] == 'NaN').sum()
print(count)
# 3
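
If the column mixes numbers with the string 'NaN', one possible alternative (not part of the original answer) is to convert it to a numeric column first, so that the strings become real missing values and `isna()` applies. A sketch under that assumption, using `pd.to_numeric`:

```python
import pandas as pd

enron_df = pd.DataFrame({'salary': ['NaN', 'NaN', 1, 5, 'NaN']})

# errors='coerce' maps anything that is not a valid number to NaN,
# so the 'NaN' strings end up as real missing values
salary_numeric = pd.to_numeric(enron_df['salary'], errors='coerce')

count = salary_numeric.isna().sum()
print(count)
```

Note that this also counts any other non-numeric entries as missing, which may or may not be what you want.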

Answered by rafaelc

By definition, count omits NaNs and size does not.

Thus, a simple difference should do:

count = enron_df['salary'].size - enron_df['salary'].count()
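
To make the difference concrete, here is a small sketch on the sample frame from the earlier answer (the variable names `total` and `non_missing` are only illustrative):

```python
import numpy as np
import pandas as pd

enron_df = pd.DataFrame({'salary': [np.nan, np.nan, 1, 5, 7]})

total = enron_df['salary'].size           # all entries, NaN included
non_missing = enron_df['salary'].count()  # non-missing entries only

count = total - non_missing
print(total, non_missing, count)
```

This only works for real missing values. A string such as 'NaN' is an ordinary value as far as `count()` is concerned, so it would not show up in the difference.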

Answered by zipa

Try it like this:

count = df.loc[df['salary']=='NaN'].shape[0]

Or maybe better:

count = df.loc[df['salary']=='NaN', 'salary'].size

And, going down your path, you'd need something like this:

count = df.loc[:, 'salary'].str.count('NaN').sum()
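
A quick way to compare the three variants is to run them on a frame where 'NaN' is stored as a string. The sample data below is borrowed from the earlier answer, and the names `by_rows`, `by_size` and `by_str` are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({'salary': ['NaN', 'NaN', 1, 5, 'NaN']})

# Number of rows where salary equals the string 'NaN'
by_rows = df.loc[df['salary'] == 'NaN'].shape[0]

# Same filter, but selecting only the salary column
by_size = df.loc[df['salary'] == 'NaN', 'salary'].size

# str.count counts pattern matches inside each string;
# non-string entries yield NaN, which sum() skips
by_str = df.loc[:, 'salary'].str.count('NaN').sum()

print(by_rows, by_size, by_str)
```

All three agree on this data, but `str.count` counts occurrences of the pattern within each string rather than testing equality, so a value like 'NaNNaN' would contribute 2. It also requires string data: on a purely numeric column the `.str` accessor is not available.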

Answered by ALollz

There's also value_counts with the dropna argument:

import numpy as np
import pandas as pd

enron_df = pd.DataFrame({'salary':[np.nan, np.nan, 1, 5, 7]})

enron_df.salary.value_counts(dropna=False)
#NaN     2
# 7.0    1
# 5.0    1
# 1.0    1
#Name: salary, dtype: int64

And if you just want the number, select np.nan from the value counts. (If they are strings 'NaN', then replace np.nan with 'NaN'.)

# np.nan is used here because the np.NaN alias was removed in NumPy 2.0
enron_df.salary.value_counts(dropna=False)[np.nan]
#2
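
The parenthetical about string values can be sketched as follows. This assumes the string-valued sample frame from the earlier answer, and uses `Series.get` with a default of 0 as one way to avoid a KeyError when no 'NaN' entry is present:

```python
import pandas as pd

enron_df = pd.DataFrame({'salary': ['NaN', 'NaN', 1, 5, 'NaN']})

# The string 'NaN' is an ordinary value, so dropna=False is not needed here
counts = enron_df['salary'].value_counts()

# .get returns the default (0 here) when the label is absent
count = counts.get('NaN', 0)
print(count)
```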