Python 熊猫过滤和比较日期
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/36104500/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
pandas filtering and comparing dates
提问by jake wong
I have a sql file which consists of the data below which I read into pandas.
我有一个 sql 文件,其中包含我读入熊猫的数据。
df = pandas.read_sql('Database count details', con=engine,
index_col='id', parse_dates='newest_available_date')
Output
输出
id code newest_date_available
9793708 3514 2015-12-24
9792282 2399 2015-12-25
9797602 7452 2015-12-25
9804367 9736 2016-01-20
9804438 9870 2016-01-20
The next line of code is to get last week's date
下一行代码是获取上周的日期
date_before = datetime.date.today() - datetime.timedelta(days=7) # Which is 2016-01-20
What I am trying to do is, to compare date_before
with df
and print out all rows that is less than date_before
我所试图做的是,比较date_before
有df
并打印出所有行小于date_before
if (df['newest_available_date'] < date_before):
print(#all rows)
if (df['newest_available_date'] < date_before):
print(#all rows)
Obviously this returns me an error The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
显然这会返回一个错误 The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
How should I do this?
我该怎么做?
回答by Fabio Lamanna
I would do a mask like:
我会做一个面具,如:
a = df[df['newest_date_available'] < date_before]
If date_before = datetime.date(2016, 1, 19)
, this returns:
如果date_before = datetime.date(2016, 1, 19)
,则返回:
id code newest_date_available
0 9793708 3514 2015-12-24
1 9792282 2399 2015-12-25
2 9797602 7452 2015-12-25
回答by onlyphantom
Using datetime.date(2019, 1, 10)
works because pandas
coerce the date to a date time under the hood.
This however, will no longer be the case in future versions of pandas
.
使用datetime.date(2019, 1, 10)
作品是因为pandas
将日期强制为引擎盖下的日期时间。但是,在未来版本的pandas
.
From version 0.24 and up, it now issue a warning:
从 0.24 及更高版本开始,它现在发出警告:
FutureWarning: Comparing Series of datetimes with 'datetime.date'. Currently, the 'datetime.date' is coerced to a datetime. In the future pandas will not coerce, and a TypeError will be raised.
FutureWarning:将日期时间系列与“datetime.date”进行比较。目前,“datetime.date”被强制为日期时间。将来熊猫不会强制,并且会引发 TypeError。
The better solution is the one proposed on its official documentationas Pandas
replacement for python datetime.datetime object.
更好的解决方案是提出了一个官方的文档作为Pandas
替代蟒蛇datetime.datetime对象。
To provide an example referencing OP's initial dataset, this is how you would use it:
为了提供一个引用 OP 的初始数据集的示例,您将如何使用它:
import pandas
cond1 = df.newest_date_available < pd.Timestamp(2016,1,10)
df.loc[cond1, ]