pandas TypeError: Cannot compare type 'Timestamp' with type 'date'
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/51474263/
Asked by Pherdindy
The problem is in line 22:
if start_date <= data_entries.iloc[j, 1] <= end_date:
where I want to compare start_date and end_date to data_entries.iloc[j, 1], which accesses a column of the pandas dataframe. I converted the column to datetime using:
data_entries['VOUCHER DATE'] = pd.to_datetime(data_entries['VOUCHER DATE'], format="%m/%d/%Y")
But I am unsure how to convert it to date.
import datetime
import pandas as pd

# Raw string so the backslashes in the Windows path are not treated as escapes
entries_csv = r"C:\Users\Pops\Desktop\Entries.csv"
data_entries = pd.read_csv(entries_csv)
data_entries['VOUCHER DATE'] = pd.to_datetime(data_entries['VOUCHER DATE'], format="%m/%d/%Y")
start_date = datetime.date(2018, 4, 1)
end_date = datetime.date(2018, 10, 30)
for j in range(len(data_entries)):
    # This comparison raises the TypeError: a pandas Timestamp is compared with a datetime.date
    if start_date <= data_entries.iloc[j, 1] <= end_date:
        print('Hello')
Accepted answer by moshevi
This converts it to date:
data_entries['VOUCHER DATE'] = pd.to_datetime(data_entries['VOUCHER DATE'], format="%m/%d/%Y").dt.date
However, I would not recommend filtering row by row like this. This is much faster:
data_entries[data_entries['VOUCHER DATE'].between(start_date, end_date)]
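A minimal sketch of the two steps in this answer, run on a small in-memory frame instead of Entries.csv. The sample rows and the 'VOUCHER NO' column are made up for illustration; only the 'VOUCHER DATE' column name and the date bounds come from the question.

```python
import datetime
import pandas as pd

# Hypothetical sample data standing in for Entries.csv
data_entries = pd.DataFrame({
    "VOUCHER NO": [1, 2, 3],
    "VOUCHER DATE": ["03/15/2018", "04/01/2018", "11/02/2018"],
})

start_date = datetime.date(2018, 4, 1)
end_date = datetime.date(2018, 10, 30)

# .dt.date turns each Timestamp into a datetime.date,
# so the column ends up with object dtype
data_entries["VOUCHER DATE"] = pd.to_datetime(
    data_entries["VOUCHER DATE"], format="%m/%d/%Y"
).dt.date

# between() is inclusive on both ends by default and replaces the row loop
in_range = data_entries[data_entries["VOUCHER DATE"].between(start_date, end_date)]
print(in_range)
```

Because the column now holds plain Python date objects, it can be compared with datetime.date bounds, at the cost of losing the datetime64 dtype.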
read this article
Answer by jpp
Just use pd.Timestamp objects without any conversion:
start_date = pd.Timestamp('2018-04-01')
end_date = pd.Timestamp('2018-10-30')
res = data_entries[data_entries['VOUCHER DATE'].between(start_date, end_date)]
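As a rough, self-contained illustration of the snippet above, the following sketch builds a small frame in memory (the sample dates are hypothetical, not from the question's CSV) and keeps the column as datetime64 while filtering with pd.Timestamp bounds.

```python
import pandas as pd

# Hypothetical sample data; the column stays datetime64 after to_datetime
data_entries = pd.DataFrame({
    "VOUCHER DATE": pd.to_datetime(
        ["03/15/2018", "04/01/2018", "10/30/2018", "11/02/2018"],
        format="%m/%d/%Y",
    ),
})

start_date = pd.Timestamp("2018-04-01")
end_date = pd.Timestamp("2018-10-30")

# Vectorised comparison over the whole column, inclusive on both ends
mask = data_entries["VOUCHER DATE"].between(start_date, end_date)
res = data_entries[mask]
print(res)
```

The boolean mask is computed for the whole column at once, so no Python-level loop over rows is needed.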
Explanation
Don't use datetime.datetime or datetime.date objects in a Pandas series. This is inefficient because you lose vectorised functionality. The benefit of pd.Timestamp objects is that you can use vectorised functionality for calculations. As described here:
numpy.datetime64 is essentially a thin wrapper around an int64. It has almost no date/time-specific functionality.
pd.Timestamp is a wrapper around a numpy.datetime64. It is backed by the same int64 value, but supports the entire datetime.datetime interface, along with useful pandas-specific functionality.
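The relationship described in the quote can be checked with a short sketch. The example date is arbitrary, and the attributes shown are only a small sample of what pd.Timestamp offers.

```python
import datetime
import numpy as np
import pandas as pd

ts = pd.Timestamp("2018-04-01")

# pd.Timestamp subclasses datetime.datetime, so it supports that interface
print(isinstance(ts, datetime.datetime))

# The underlying integer: nanoseconds since the Unix epoch
print(ts.value)

# Conversion to the numpy.datetime64 it wraps
dt64 = ts.to_datetime64()
print(dt64 == np.datetime64("2018-04-01"))

# An example of pandas-specific functionality
print(ts.day_name())
```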