Note: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/42921854/

Time: 2020-08-19 22:16:27  Source: igfitidea

How to check if a particular cell in a pandas DataFrame is null?

Tags: python, pandas, dataframe

Asked by Newskooler

I have the following df in pandas:

0       A     B     C
1       2   NaN     8

How can I check if df.iloc[1]['B'] is NaN?

I tried using df.isnan() and I get a table like this:

0       A     B      C
1   false  true  false

but I am not sure how to index that table, or whether this is an efficient way of doing the job at all.

Answered by jezrael

Use pd.isnull; to select the cell, use loc or iloc:

print (df)
   0  A   B  C
0  1  2 NaN  8

print (df.loc[0, 'B'])
nan

a = pd.isnull(df.loc[0, 'B'])
print (a)
True

print (df['B'].iloc[0])
nan

a = pd.isnull(df['B'].iloc[0])
print (a)
True
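
As a self-contained variant of the check above, here is a minimal sketch. The one-row frame is a hypothetical stand-in for the question's data, and `pd.isna` is used as the alias of `pd.isnull`. The scalar accessors `at` and `iat` are swapped in for `loc` and `iloc`, since only a single cell is read.

```python
import numpy as np
import pandas as pd

# Hypothetical one-row frame standing in for the data in the question.
df = pd.DataFrame({"A": [2], "B": [np.nan], "C": [8]})

# Label-based scalar access (row label 0, column "B"), then a scalar null check.
cell_is_null = pd.isna(df.at[0, "B"])

# Position-based equivalent: row 0, column 1.
cell_is_null_by_position = pd.isna(df.iat[0, 1])

print(cell_is_null, cell_is_null_by_position)
```

Both checks inspect one value only, so no boolean table for the whole frame is built.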

Answered by ankur09011

jezrael's response is spot on. If you are only concerned with whether the frame contains any NaN values, I explored whether there is a faster option, since in my experience summing flat arrays is (strangely) faster than counting. This code seems faster:

df.isnull().values.any()

For example:

In [2]: df = pd.DataFrame(np.random.randn(1000,1000))

In [3]: df[df > 0.9] = np.nan

In [4]: %timeit df.isnull().any().any()
100 loops, best of 3: 14.7 ms per loop

In [5]: %timeit df.isnull().values.sum()
100 loops, best of 3: 2.15 ms per loop

In [6]: %timeit df.isnull().sum().sum()
100 loops, best of 3: 18 ms per loop

In [7]: %timeit df.isnull().values.any()
1000 loops, best of 3: 948 μs per loop
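
Outside IPython, the same whole-frame checks can be sketched as a plain script. The smaller frame size and the fixed seed are assumptions made only so the example runs quickly and repeatably. `isna()` and `to_numpy()` are the current spellings of `isnull()` and `.values`. No timings are claimed here.

```python
import numpy as np
import pandas as pd

# Smaller frame and fixed seed: assumptions for a quick, repeatable run.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((100, 100)))
df[df > 0.9] = np.nan  # blank out roughly the top fifth of the values

# Flatten the boolean mask to one NumPy array, then reduce once.
mask = df.isna().to_numpy()
has_nan = bool(mask.any())
nan_count = int(mask.sum())

print(has_nan, nan_count)
```

Reducing the NumPy array directly avoids the intermediate per-column Series that `df.isnull().any().any()` creates.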

Answered by Loochie

If you are looking for the indexes of NaN values in a specific column, you can use:

list(df['B'].index[df['B'].apply(np.isnan)])
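
An alternative sketch for the single-column case uses boolean indexing with `Series.isna` instead of `apply(np.isnan)`. The small frame below is hypothetical sample data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with NaN in rows 0 and 2 of column "B".
df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [np.nan, 5.0, np.nan]})

# Build a boolean mask for the column and use it to filter the index.
nan_index_labels = df.index[df["B"].isna()].tolist()

print(nan_index_labels)  # [0, 2]
```

`isna` is vectorized and also handles object columns, where `np.isnan` raises a TypeError.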

If you want to get the indexes of all NaN values in the dataframe, you can do the following:

row_col_indexes = list(map(list, np.where(np.isnan(np.array(df)))))
indexes = []
for i in zip(row_col_indexes[0], row_col_indexes[1]):
    indexes.append(list(i))

And if you are looking for a one-liner, you can use:

list(zip(*[x for x in list(map(list, np.where(np.isnan(np.array(df)))))]))
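
A possible variant of the whole-frame lookup, sketched with `DataFrame.isna` and `np.argwhere` in place of `np.isnan` and `np.where`. The sample frame is hypothetical and deliberately contains a string, because `np.isnan` does not accept object arrays.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; column "B" has object dtype because it holds a string.
df = pd.DataFrame({"A": [1.0, np.nan], "B": [np.nan, "x"]})

# Boolean mask of missing cells as a plain NumPy array.
mask = df.isna().to_numpy()

# argwhere returns one (row, column) pair per True cell, in row-major order.
positions = [tuple(int(i) for i in pair) for pair in np.argwhere(mask)]

# Translate integer positions into index and column labels.
labels = [(df.index[r], df.columns[c]) for r, c in positions]

print(positions)  # [(0, 1), (1, 0)]
print(labels)
```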