Original question: http://stackoverflow.com/questions/33512372/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Proper way to use "opposite boolean" in Pandas data frame boolean indexing
Asked by Mike Williamson
I wanted to use boolean indexing to check for rows of my data frame where a particular column does not have NaN values. So, I did the following:
import pandas as pd
my_df.loc[pd.isnull(my_df['col_of_interest']) == False].head()
to see a snippet of that data frame, including only the values that are not NaN (most values are NaN).
It worked, but seems less-than-elegant. I'd want to type:
my_df.loc[!pd.isnull(my_df['col_of_interest'])].head()
However, that generated an error. I also spend a lot of time in R, so maybe I'm confusing things. In Python, I usually use "not" where I can, for instance if x is not None:, but I couldn't really do it here. Is there a more elegant way? I don't like having to put in a senseless comparison.
Answered by DSM
In general with pandas (and numpy), we use the bitwise NOT ~ instead of ! or not (whose behaviour can't be overridden by types).
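To make the difference concrete, here is a minimal sketch of what happens when `not` and `~` are applied to a boolean Series. The Series name `flags` and its values are made up for illustration; the behaviour shown (a ValueError for `not`, element-wise negation for `~`) is what current pandas versions do with a multi-element Series.

```python
import pandas as pd

# Hypothetical example data, not taken from the question.
flags = pd.Series([True, False, True])

# `not` asks Python for one truth value for the whole Series.
# pandas refuses to guess and raises ValueError.
try:
    not flags
    not_error = None
except ValueError as exc:
    not_error = exc

# `~` dispatches to the Series' own __invert__ and flips each element.
inverted = ~flags

print(type(not_error).__name__)  # ValueError
print(inverted.tolist())         # [False, True, False]
```

This is also why `!` fails: it is simply not a Python operator outside of `!=`.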
While in this case we have notnull, ~ can come in handy in situations where there's no special opposite method.
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({"a": [1, 2, np.nan, 3]})
>>> df.a.isnull()
0    False
1    False
2     True
3    False
Name: a, dtype: bool
>>> ~df.a.isnull()
0     True
1     True
2    False
3     True
Name: a, dtype: bool
>>> df.a.notnull()
0     True
1     True
2    False
3     True
Name: a, dtype: bool
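As one possible illustration of a case with no dedicated opposite method, the sketch below negates `Series.isin` to keep the rows whose value is not in a given list. The data frame, its column names, and the values are all made up for the example.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({
    "city": ["Oslo", "Lima", "Pune", "Lima"],
    "n": [1, 2, 3, 4],
})

# isin() has no "not in" counterpart, so the mask is inverted with ~.
in_set = df["city"].isin(["Lima", "Oslo"])
not_in_set = df.loc[~in_set]

print(not_in_set["city"].tolist())  # ['Pune']
```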
(For completeness I'll note that -, the unary negative operator, will also work on a boolean Series but ~ is the canonical choice, and - has been deprecated for numpy boolean arrays.)
Answered by Anand S Kumar
Instead of using pandas.isnull(), you should use pandas.notnull() to find the rows where the column has non-null values. Example:
import pandas as pd
my_df.loc[pd.notnull(my_df['col_of_interest'])].head()
pandas.notnull() is the boolean inverse of pandas.isnull(), as given in the documentation:
See also
pandas.notnull : boolean inverse of pandas.isnull
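Putting the two answers side by side, the following sketch checks that the three spellings select the same rows. The names `my_df` and `col_of_interest` come from the question; the data itself and the extra `id` column are invented so the example can run on its own.

```python
import numpy as np
import pandas as pd

# Made-up data; only the names my_df and col_of_interest come from the question.
my_df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "col_of_interest": [np.nan, 2.5, np.nan, 4.0],
})

col = my_df["col_of_interest"]

via_notnull = my_df.loc[pd.notnull(col)]          # dedicated opposite function
via_invert = my_df.loc[~pd.isnull(col)]           # bitwise NOT on the mask
via_compare = my_df.loc[pd.isnull(col) == False]  # the question's original form

print(via_notnull)
```

All three return the rows with id 2 and 4 here; the first two simply avoid the explicit comparison against False.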