Original source: http://stackoverflow.com/questions/36841377/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Pandas - Dropping DataFrame rows based on Datetime column value
Asked by DiamondDogs95
I am currently writing a script in which I want to drop some rows of my pandas dataframe according to Datetime values spanning several years (I want to drop the rows where the datetime falls between February and May). So I first tried the following code:
game_df['Date'] = game_df[(game_df['Date'].dt.month < 2) & (game_df['Date'].dt.month > 5)]
It gave me the same dataframe, with NaN values in the 'Date' column for this period of time. So I tried the following code in order to drop the corresponding rows:
game_df['Date'] = game_df[(game_df['Date'].dt.month < 2) & (game_df['Date'].dt.month > 5)].drop(game_df.columns)
But it raised an error like: labels [u'Date' u'other_column1' u'other_column2' u'other_column3' u'other_column4'] not contained in axis
Can anyone help me solve this problem?
Answered by Jarad
I think you could try something like this, using a list of Timestamps.
If you want to exclude rows with specific dates:
game_df[~game_df['Date'].isin([pd.Timestamp('20150210'), pd.Timestamp('20150301')])]
In case you're not familiar with it, the ~ in front of game_df['Date'].isin(...) is the element-wise NOT operator: it inverts the boolean mask. So the expression returns the rows of the dataframe whose timestamps are not one of the two dates listed.
Edit: If you want to exclude a range of rows between specific dates:
game_df[~game_df['Date'].isin(pd.date_range(start='20150210', end='20150301'))]
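A minimal, self-contained sketch of the range exclusion above, run on a made-up game_df (the sample dates and the Score column are assumptions for illustration, since the real data is not shown). One thing to keep in mind: pd.date_range produces midnight timestamps, so isin only matches values that fall exactly on midnight. If the Date column also carries times of day, comparing on .dt.normalize() is one way around that.

```python
import pandas as pd

# Hypothetical sample data; the real game_df is not shown in the question.
game_df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2015-02-09 00:00",
        "2015-02-10 18:30",  # inside the range, but not at midnight
        "2015-02-20 00:00",
        "2015-03-01 00:00",
        "2015-03-02 00:00",
    ]),
    "Score": [1, 2, 3, 4, 5],
})

# One midnight timestamp per day, both endpoints included.
excluded_days = pd.date_range(start="20150210", end="20150301")

# .dt.normalize() resets the time of day to midnight, so a row stamped
# 2015-02-10 18:30 is still recognised as falling on an excluded day.
mask = game_df["Date"].dt.normalize().isin(excluded_days)
kept = game_df[~mask]
print(kept)
```

Only the rows dated before 2015-02-10 or after 2015-03-01 remain in kept.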
Answered by DiamondDogs95
Actually, I found what I was looking for with the following code:
game_df = game_df[(game_df['Date'].dt.month != 2) & (game_df['Date'].dt.month != 3) & (game_df['Date'].dt.month != 4)\
& (game_df['Date'].dt.month != 5)]
It is pretty ugly, and I really think it could be done in a more efficient way, but it works when it comes to excluding rows whose datetime values fall within a given span of time.
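One possibly tidier way to write the same filter is to test the month against a list with isin and negate the result. This is only a sketch on made-up data (the sample dates and the months_to_drop name are assumptions), not the approach the answer itself used:

```python
import pandas as pd

# Hypothetical sample data spanning two years.
game_df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2015-01-15", "2015-02-01", "2015-04-10", "2015-05-31",
        "2015-06-01", "2016-03-03", "2016-12-25",
    ]),
})

# Months to exclude: February (2) through May (5), in every year.
months_to_drop = [2, 3, 4, 5]

# Keep only the rows whose month is NOT in the list.
game_df = game_df[~game_df["Date"].dt.month.isin(months_to_drop)]
print(game_df)
```

Because the test is on the month number only, it applies to every year in the data. For a contiguous run of months, game_df['Date'].dt.month.between(2, 5) should give the same mask.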
Answered by Kolom
Instead of dropping rows, I find query much more helpful. Of course, you then need to write the condition so that it describes the part of the data you want to keep.
df.query("Date.dt.month < 2 or Date.dt.month > 5", inplace=True)
If you want to use exact dates:
df.query("Date <= '2017-01-31' or Date >= '2017-05-01'", inplace=True)
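A small sketch checking the query approach on made-up data (the sample dates and the Value column are assumptions). Note that the two conditions are joined with `or`: no date is both before February and after May, so joining them with `&` would select nothing.

```python
import pandas as pd

# Hypothetical sample data around the February-May boundaries.
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2017-01-31", "2017-02-01", "2017-05-31", "2017-06-01"]
    ),
    "Value": [10, 20, 30, 40],
})

# Keep rows whose month is before February or after May.
by_month = df.query("Date.dt.month < 2 or Date.dt.month > 5")

# Exact-date variant; 2017-06-01 is chosen here so that May is dropped too.
by_date = df.query("Date <= '2017-01-31' or Date >= '2017-06-01'")

print(by_month)
print(by_date)
```

Both calls return new dataframes here; passing inplace=True instead would modify df directly, as in the answer.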