Original source: http://stackoverflow.com/questions/42921854/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How to check if a particular cell in a pandas DataFrame is null?
Asked by Newskooler
I have the following df in pandas:
0 A B C
1 2 NaN 8
How can I check if df.iloc[1]['B'] is NaN?
I tried using df.isnan() and I get a table like this:
0 A B C
1 false true false
But I am not sure how to index this table, or whether this is an efficient way of doing the job at all.
Answered by jezrael
Answered by ankur09011
jezrael's response is spot on. If you are only concerned with whether any NaN value exists, I explored whether there is a faster option, since in my experience summing flat arrays is (strangely) faster than counting. This code seems faster:
df.isnull().values.any()
For example:
In [1]: import numpy as np; import pandas as pd
In [2]: df = pd.DataFrame(np.random.randn(1000,1000))
In [3]: df[df > 0.9] = np.nan
In [4]: %timeit df.isnull().any().any()
100 loops, best of 3: 14.7 ms per loop
In [5]: %timeit df.isnull().values.sum()
100 loops, best of 3: 2.15 ms per loop
In [6]: %timeit df.isnull().sum().sum()
100 loops, best of 3: 18 ms per loop
In [7]: %timeit df.isnull().values.any()
1000 loops, best of 3: 948 μs per loop
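For the single cell the question asks about, a scalar check is usually enough. The sketch below rebuilds the small frame from the question (the row label 1 and the column names come from the question; treating it as a one-row frame is an assumption) and uses pd.isna, which pd.isnull is an alias for:

```python
import numpy as np
import pandas as pd

# Frame from the question, assumed to be one row labelled 1 with columns A, B, C.
df = pd.DataFrame({"A": [2], "B": [np.nan], "C": [8]}, index=[1])

# Label-based lookup of a single cell, then a scalar NaN test.
cell_is_nan = pd.isna(df.loc[1, "B"])

# Position-based equivalent: first row, second column.
cell_is_nan_by_position = pd.isna(df.iloc[0, 1])

# Whole-frame test from this answer: is there a NaN anywhere?
any_nan = df.isnull().values.any()

print(cell_is_nan, cell_is_nan_by_position, any_nan)
```

Note that .iloc is positional while .loc uses labels, so if the frame really has only the one row labelled 1, df.loc[1, 'B'] (or df.iloc[0, 1]) is the lookup to use rather than df.iloc[1]['B'].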
Answered by Loochie
If you are looking for the indexes of NaN values in a specific column, you can use:
list(df['B'].index[df['B'].apply(np.isnan)])
In case you want to get the indexes of all NaN values in the DataFrame, you can do the following:
row_col_indexes = list(map(list, np.where(np.isnan(np.array(df)))))
indexes = []
for i in zip(row_col_indexes[0], row_col_indexes[1]):
    indexes.append(list(i))
And if you are looking for a one-liner, you can use:
list(zip(*[x for x in list(map(list, np.where(np.isnan(np.array(df)))))]))
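If the row and column labels are needed rather than integer positions, something along these lines may be easier to read. It is only a sketch on hypothetical example data, and it swaps in DataFrame.isna with np.argwhere as one possible alternative to the np.isnan approach above:

```python
import numpy as np
import pandas as pd

# Hypothetical example data with two missing cells.
df = pd.DataFrame(
    {"A": [2.0, np.nan], "B": [np.nan, 5.0], "C": [8.0, 9.0]},
    index=[1, 2],
)

# Boolean array marking the missing cells.
mask = df.isna().to_numpy()

# Integer (row, column) positions of every NaN, in row-major order.
positions = [tuple(int(v) for v in pair) for pair in np.argwhere(mask)]

# The same cells expressed with the frame's own index and column labels.
labels = [(df.index[r], df.columns[c]) for r, c in positions]

# Index labels of the rows where column B is NaN.
rows_with_nan_in_b = df.index[df["B"].isna()].tolist()

print(positions)
print(labels)
print(rows_with_nan_in_b)
```

One reason to prefer isna here is that it also handles object-dtype columns, where np.isnan raises a TypeError.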