Filter all rows with NaT in a column of a DataFrame (Python)

Question

Asked by Jase Villam

I have a df like this:

    a b           c
    1 NaT         w
    2 2014-02-01  g
    3 NaT         x   

    df=df[df.b=='2014-02-01']

will give me

    a  b          c
    2 2014-02-01  g

How can I get a DataFrame containing all rows with NaT in column b?

    df = df[df.b == None]  # Doesn't work

I want this:

    a b           c
    1 NaT         w
    3 NaT         x

Answer 1

Accepted answer by Karl D.

isnull and notnull work with NaT, so you can handle them much the same way you handle NaNs:

    >>> df
       a          b  c
    0  1        NaT  w
    1  2 2014-02-01  g
    2  3        NaT  x

    >>> df.dtypes
    a             int64
    b    datetime64[ns]
    c            object

Just use isnull to select:

    >>> df[df.b.isnull()]
       a   b  c
    0  1 NaT  w
    2  3 NaT  x
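The selection above can be reproduced end to end with the following minimal sketch. It rebuilds the frame from the question and applies both isnull and notnull; the variable names nat_rows and dated_rows are only illustrative.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "a": [1, 2, 3],
    "b": [pd.NaT, pd.Timestamp("2014-02-01"), pd.NaT],
    "c": ["w", "g", "x"],
})

# Boolean mask that is True wherever column b holds NaT.
nat_rows = df[df["b"].isnull()]

# The complementary mask keeps only rows with a real timestamp.
dated_rows = df[df["b"].notnull()]

print(nat_rows)
print(dated_rows)
```

Both calls return a new DataFrame and leave df unchanged, so the result has to be assigned if you want to keep it.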

Answer 2

Answered by Radu

Using your example dataframe:

    import pandas as pd

    df = pd.DataFrame({"a": [1, 2, 3],
                       "b": [pd.NaT, pd.to_datetime("2014-02-01"), pd.NaT],
                       "c": ["w", "g", "x"]})

Before v0.17, this did not work:

    df.query('b != b')

and you had to do:

    df.query('b == "NaT"')  # yes, surprisingly, this works!

Since v0.17 though, both methods work, although I would only recommend the first one.
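As a rough sketch of the first form on a recent pandas version: the expression relies on NaT not being equal to itself, so the comparison is True only on the NaT rows. Passing engine="python" is an assumption made here so the example does not depend on numexpr being installed; it is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, 2, 3],
    "b": [pd.NaT, pd.to_datetime("2014-02-01"), pd.NaT],
    "c": ["w", "g", "x"],
})

# "b != b" is True only where b is NaT, because NaT never equals itself.
# engine="python" is an assumption for portability, not a requirement.
nat_rows = df.query("b != b", engine="python")

print(nat_rows)
```

The result should match what df[df.b.isnull()] selects, which is arguably the more readable spelling.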

Answer 3

Answered by Eelco van Vliet

For those interested: in my case, I wanted to drop the NaT values contained in the DatetimeIndex of a dataframe. I could not directly use the notnull construction suggested by Karl D. Instead, I first had to create a temporary column from the index, apply the mask, and then delete the temporary column again.

    df["TMP"] = df.index.values                # index is a DatetimeIndex
    df = df[df.TMP.notnull()]                  # remove all NaT values
    df.drop(["TMP"], axis=1, inplace=True)     # delete TMP again
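As an alternative sketch, assuming a pandas version in which the index itself exposes notnull(), the mask can be built from the index without a temporary column. This is a different technique from the three steps above, and the frame and its values below are hypothetical.

```python
import pandas as pd

# Hypothetical frame whose DatetimeIndex contains one NaT entry.
idx = pd.to_datetime(["2014-02-01", None, "2014-02-03"])
df = pd.DataFrame({"c": ["w", "g", "x"]}, index=idx)

# Index.notnull() returns a boolean array, which can mask the rows directly.
cleaned = df[df.index.notnull()]

print(cleaned)
```

If your version does not provide notnull() on the index, the temporary-column approach remains a workable fallback.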

Answer 4

Answered by Michael Dorner

I feel that the comment by @DSM is worth an answer of its own, because it answers the fundamental question.

The misunderstanding comes from the assumption that pd.NaT acts like None. However, while None == None returns True, pd.NaT == pd.NaT returns False. Pandas NaT behaves like a floating-point NaN, which is not equal to itself.

As the previous answers explain, you should use:

    df[df.b.isnull()]  # or notnull(), respectively
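The comparison behaviour described in this answer can be checked directly with a few scalar comparisons. This is only a small illustration; the variable names are made up for the example.

```python
import pandas as pd

# None compares equal to itself.
none_equal = (None == None)

# NaT, like a floating-point NaN, does not compare equal to itself.
nat_equal = (pd.NaT == pd.NaT)
nan_equal = (float("nan") == float("nan"))

# pd.isnull is the reliable way to detect NaT.
nat_is_null = pd.isnull(pd.NaT)

print(none_equal, nat_equal, nan_equal, nat_is_null)
```

This is why an equality test against None or NaT cannot select the missing rows, while isnull can.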
