Select rows from a DataFrame based on presence of null value in specific column or columns
Original question: http://stackoverflow.com/questions/36820549/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors on Stack Overflow (not me).
Asked by mapping dom
I have an imported xls file as a pandas dataframe. Two columns contain coordinates, which I will use to merge the dataframe with others that have geolocation data. df.info() shows 8859 records, and the coordinate columns have '8835 non-null float64' records.
I want to eyeball the 24 rows (that I assume are null) with all of their columns, to see whether one of the other columns (street address, town) can be used to manually add back the coordinates for those 24 records. I.e., return the rows of the dataframe where df['Easting'] is null or NaN.
I have adapted the method given here, as below:
df.loc[df['Easting'] == NaN]
But I get back an empty dataframe (0 rows × 24 columns), which makes no sense (to me). Attempting to use Null or Non null doesn't work, as these values aren't defined. What am I missing?
Answered by jezrael
I think you need isnull for checking NaN values with boolean indexing:
df[df['Easting'].isnull()]
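As a minimal sketch of the line above, the following builds a small made-up frame and selects the rows with missing coordinates. Only the 'Easting' column name comes from the question; the 'Street' and 'Northing' columns and all of the values are assumptions for illustration. It also shows one way the same mask might be extended to several columns with any(axis=1) or all(axis=1).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; every column name except 'Easting' is an assumption.
df = pd.DataFrame({
    "Street": ["1 High St", "2 Low Rd", "3 Mill Ln", "4 Park Ave"],
    "Easting": [531000.0, np.nan, 529500.0, np.nan],
    "Northing": [181000.0, 180500.0, np.nan, np.nan],
})

# Rows where 'Easting' is missing, with every column kept for inspection.
missing_easting = df[df["Easting"].isnull()]

# Rows where at least one of the two coordinate columns is missing.
missing_any = df[df[["Easting", "Northing"]].isnull().any(axis=1)]

# Rows where both coordinate columns are missing.
missing_all = df[df[["Easting", "Northing"]].isnull().all(axis=1)]

print(missing_easting)
print(len(missing_any), len(missing_all))
```

The boolean Series returned by isnull() has the same index as the frame, so indexing with it keeps exactly the rows marked True.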
Docs:
Warning
One has to be mindful that in python (and numpy), the nan's don't compare equal, but None's do. Note that Pandas/numpy uses the fact that np.nan != np.nan, and treats None like np.nan.
In [11]: None == None
Out[11]: True
In [12]: np.nan == np.nan
Out[12]: False
So as compared to above, a scalar equality comparison versus a None/np.nan doesn't provide useful information.
In [13]: df2['one'] == np.nan
Out[13]:
a False
b False
c False
d False
e False
f False
g False
h False
Name: one, dtype: bool
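To tie the quoted documentation back to the question, here is a small sketch comparing the two masks side by side. The Series and its values are invented for illustration and merely stand in for the asker's 'Easting' column.

```python
import numpy as np
import pandas as pd

# Hypothetical series standing in for the 'Easting' column.
# The None entry is converted to NaN because the data is otherwise float.
easting = pd.Series([531000.0, np.nan, 529500.0, None], name="Easting")

# NaN never compares equal to anything, so this mask is False everywhere.
eq_mask = easting == np.nan

# isnull() marks the entries that are actually missing.
na_mask = easting.isnull()

print(eq_mask.sum())  # rows matched by the equality test
print(na_mask.sum())  # rows that are really missing
```

Because the equality mask is False in every row, indexing with it returns an empty frame, which would explain the "0 rows × 24 columns" result in the question.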