Pandas - Dropping DataFrame rows based on Datetime column value

Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/36841377/

Time: 2020-09-14 01:07:19  Source: igfitidea

Tags: python, datetime, pandas

Asked by DiamondDogs95

I am currently writing a script in which I want to drop some rows of my pandas dataframe according to Datetime values spanning several years (I want to drop rows where the datetime is between February and May). So, I first tried the following code:

game_df['Date'] = game_df[(game_df['Date'].dt.month < 2) & (game_df['Date'].dt.month > 5)]

It gave me the same dataframe, but with NaN values in the 'Date' column for that period of time. So I tried the following code in order to drop the corresponding rows:

game_df['Date'] = game_df[(game_df['Date'].dt.month < 2) & (game_df['Date'].dt.month > 5)].drop(game_df.columns)

But it raised an error like: labels [u'Date' u'other_column1' u'other_column2' u'other_column3' u'other_column4'] not contained in axis
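
For context, here is a minimal sketch of why that call fails, using a small made-up frame (the 'score' column and all values are hypothetical, not from the original post). DataFrame.drop looks for labels in the row index by default (axis=0), so passing column names makes it search the index for 'Date' and the other column names. Dropping rows needs index labels instead.

```python
import pandas as pd

# Hypothetical stand-in for game_df; only the 'Date' column matters here.
game_df = pd.DataFrame({
    "Date": pd.to_datetime(["2015-01-15", "2015-03-10", "2015-04-20", "2015-07-04"]),
    "score": [1, 2, 3, 4],
})

# drop() searches the row index by default, so column names are not found.
try:
    game_df.drop(game_df.columns)
    drop_failed = False
except (KeyError, ValueError):  # the exception type differs across pandas versions
    drop_failed = True

# Dropping rows takes index labels: here, the rows dated February through May.
in_feb_to_may = game_df["Date"].dt.month.between(2, 5)
kept = game_df.drop(game_df[in_feb_to_may].index)
print(kept)
```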

Can anyone solve this problem?

Answered by Jarad

I think you could try something like this using a list of Timestamps:

If you want to exclude rows with specific dates:

game_df[~game_df['Date'].isin([pd.Timestamp('20150210'), pd.Timestamp('20150301')])]

The ~ at the beginning is a not operator, in case you're not familiar with it. So it's saying to return the dataframe where the timestamps are not the two dates mentioned.
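
As a rough illustration of the two steps involved, the sketch below builds the boolean mask first and then inverts it. The sample dates are made up for the example.

```python
import pandas as pd

# Hypothetical data; two of the four rows fall on the excluded dates.
game_df = pd.DataFrame({
    "Date": pd.to_datetime(["2015-02-09", "2015-02-10", "2015-03-01", "2015-03-02"]),
})

excluded = [pd.Timestamp("20150210"), pd.Timestamp("20150301")]
is_excluded = game_df["Date"].isin(excluded)  # True where the date is in the list
remaining = game_df[~is_excluded]             # ~ flips True/False element-wise
print(remaining)
```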

Edit: If you want to exclude a range of rows between specific dates:

game_df[~game_df['Date'].isin(pd.date_range(start='20150210', end='20150301'))]
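
One caveat worth keeping in mind: pd.date_range with these arguments produces one midnight timestamp per day, so isin only matches rows whose value is exactly midnight. If the column may carry a time of day, comparing against the bounds is probably safer. The sketch below uses made-up data to show the difference.

```python
import pandas as pd

# Hypothetical data: the second row carries a time of day.
game_df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2015-02-09 00:00", "2015-02-15 18:30", "2015-02-20 00:00", "2015-03-02 00:00",
    ]),
})

start, end = pd.Timestamp("2015-02-10"), pd.Timestamp("2015-03-01")

# isin() against a daily range only matches exact midnight values,
# so the 18:30 row survives even though it lies inside the range.
via_isin = game_df[~game_df["Date"].isin(pd.date_range(start=start, end=end))]

# Comparing against the bounds also catches rows with a time component.
# Note that `end` is midnight, so later times on 2015-03-01 would be kept.
in_range = (game_df["Date"] >= start) & (game_df["Date"] <= end)
via_bounds = game_df[~in_range]
```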

Answered by DiamondDogs95

Actually, I've found what I was looking for with the following code:

game_df = game_df[(game_df['Date'].dt.month != 2) & (game_df['Date'].dt.month != 3) & (game_df['Date'].dt.month != 4)\
                      & (game_df['Date'].dt.month != 5)]

It is pretty ugly, and I truly think it can be done in a more efficient way, but it works when it comes to excluding rows whose datetime values fall within a span of time.
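
One possible shorter form, sketched here with made-up dates, collapses the four != comparisons into a single isin call on the month numbers and inverts the result.

```python
import pandas as pd

# Hypothetical data spanning several years.
game_df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2014-01-31", "2014-02-01", "2015-04-15", "2015-05-31", "2016-06-01",
    ]),
})

# True for rows dated February, March, April, or May of any year.
feb_to_may = game_df["Date"].dt.month.isin([2, 3, 4, 5])
game_df = game_df[~feb_to_may]
print(game_df)
```

Note that the condition from the question, (month < 2) & (month > 5), can never be true for a single row, since no month is both below 2 and above 5. A keep-condition written with comparisons needs | (or) rather than & (and).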

Answered by Kolom

Instead of dropping, I find query much more helpful. Of course, you need to write the condition so that it selects the part of the data you want to keep.

df.query("Date.dt.month < 2 or Date.dt.month > 5", inplace=True)

If you want to use exact dates:

df.query("Date <= '2017-01-31' or Date >= '2017-05-01'", inplace=True)
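
A small sketch of the query approach on made-up data follows. The 'value' column and the exact bounds are assumptions chosen for the example. The two halves of each condition are joined with or, because a row can satisfy only one of them at a time; joining them with and would select nothing.

```python
import pandas as pd

# Hypothetical data within a single year.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2017-01-15", "2017-03-10", "2017-05-20", "2017-08-01"]),
    "value": [10, 20, 30, 40],
})

# Keep rows before February or after May, whatever the year.
by_month = df.query("Date.dt.month < 2 or Date.dt.month > 5")

# Keep rows outside 2017-02-01 .. 2017-05-31 using exact dates.
by_date = df.query("Date < '2017-02-01' or Date > '2017-05-31'")

print(by_month)
print(by_date)
```

Without inplace=True, query returns a new frame and leaves df untouched, which makes it easier to compare the filtered result with the original.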