pandas TypeError: cannot compare type 'Timestamp' with type 'date'

Notice: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/51474263/

Time: 2020-09-14 05:49:42  Source: igfitidea

TypeError: Cannot compare type 'Timestamp' with type 'date'

python, pandas, datetime

Asked by Pherdindy

The problem is in line 22:

if start_date <= data_entries.iloc[j, 1] <= end_date:

where I want to compare start_date and end_date to data_entries.iloc[j, 1], which accesses a column of the pandas dataframe. I converted the column to datetime using,

data_entries['VOUCHER DATE'] = pd.to_datetime(data_entries['VOUCHER DATE'], format="%m/%d/%Y")

But I am unsure how to convert it to date.

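import pandas as pd
import datetime

# Raw string so the backslashes in the Windows path are not read as escape sequences
entries_csv = r"C:\Users\Pops\Desktop\Entries.csv"

data_entries = pd.read_csv(entries_csv)
data_entries['VOUCHER DATE'] = pd.to_datetime(data_entries['VOUCHER DATE'], format="%m/%d/%Y")

start_date = datetime.date(2018, 4, 1)
end_date = datetime.date(2018, 10, 30)

for j in range(0, len(data_entries)):
    # This is the comparison that raises the TypeError:
    # iloc[j, 1] is a pd.Timestamp, while start_date and end_date are datetime.date
    if start_date <= data_entries.iloc[j, 1] <= end_date:
        print('Hello')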

Accepted answer by moshevi

This converts it to date:

data_entries['VOUCHER DATE'] = pd.to_datetime(data_entries['VOUCHER DATE'], format="%m/%d/%Y").dt.date

However, I would not recommend filtering row by row like this. This is much faster:

data_entries[data_entries['VOUCHER DATE'].between(start_date, end_date)]

read this article
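
As a rough illustration of the two steps above, here is a minimal sketch. The sample rows and the "VOUCHER NO" column are made up for the example (the real Entries.csv is not shown in the question), so treat the data as an assumption.

```python
import datetime
import pandas as pd

# Hypothetical sample data standing in for Entries.csv
data_entries = pd.DataFrame({
    "VOUCHER NO": [1, 2, 3],
    "VOUCHER DATE": ["03/15/2018", "04/01/2018", "11/05/2018"],
})

# Parse the strings, then keep only the date part.
# The column now holds plain datetime.date objects.
data_entries["VOUCHER DATE"] = pd.to_datetime(
    data_entries["VOUCHER DATE"], format="%m/%d/%Y"
).dt.date

start_date = datetime.date(2018, 4, 1)
end_date = datetime.date(2018, 10, 30)

# A single cell is now a datetime.date, so it compares with start_date/end_date
first_value = data_entries.iloc[0, 1]

# Filter the whole column at once instead of looping over rows
in_range = data_entries[data_entries["VOUCHER DATE"].between(start_date, end_date)]
print(in_range)
```

Note that between includes both endpoints by default, so a voucher dated exactly on start_date or end_date is kept.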

Answer by jpp

Just use pd.Timestamp objects without any conversion:

start_date = pd.Timestamp('2018-04-01')
end_date = pd.Timestamp('2018-10-30')

res = data_entries[data_entries['VOUCHER DATE'].between(start_date, end_date)]
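
To make the snippet above runnable end to end, here is a small sketch with invented sample rows (the "VOUCHER NO" column and the dates are assumptions, not data from the question). It also shows the explicit boolean mask that gives the same result as between.

```python
import pandas as pd

# Hypothetical sample data standing in for Entries.csv
data_entries = pd.DataFrame({
    "VOUCHER NO": [1, 2, 3, 4],
    "VOUCHER DATE": ["03/15/2018", "04/01/2018", "10/30/2018", "11/05/2018"],
})

# Keep the column as pandas datetimes; no conversion to datetime.date
data_entries["VOUCHER DATE"] = pd.to_datetime(
    data_entries["VOUCHER DATE"], format="%m/%d/%Y"
)

start_date = pd.Timestamp("2018-04-01")
end_date = pd.Timestamp("2018-10-30")

# Vectorised filter, both endpoints included
res = data_entries[data_entries["VOUCHER DATE"].between(start_date, end_date)]

# The same filter written as an explicit boolean mask
mask = (data_entries["VOUCHER DATE"] >= start_date) & (data_entries["VOUCHER DATE"] <= end_date)
res_mask = data_entries[mask]

print(res)
```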

Explanation

Don't use datetime.datetime or datetime.date objects in Pandas series. This is inefficient because you lose vectorised functionality. The benefit of pd.Timestamp objects is that you can utilize vectorised functionality for calculations. As described here:

numpy.datetime64 is essentially a thin wrapper for an int64. It has almost no date/time specific functionality.

pd.Timestamp is a wrapper around a numpy.datetime64. It is backed by the same int64 value, but supports the entire datetime.datetime interface, along with useful pandas-specific functionality.
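
The relationship described in the quote can be checked interactively. The following is a small sketch, using an arbitrary example date, of how a pd.Timestamp exposes its integer value, the datetime.datetime interface, and a pandas-specific method.

```python
import datetime
import numpy as np
import pandas as pd

ts = pd.Timestamp("2018-04-01")

# The underlying integer: nanoseconds since 1970-01-01
ns_since_epoch = ts.value

# The NumPy view of the same instant
as_np = ts.to_datetime64()

# pd.Timestamp is a subclass of datetime.datetime, so the usual attributes work
is_datetime = isinstance(ts, datetime.datetime)
year = ts.year

# A pandas-specific convenience that datetime.datetime does not offer
weekday_name = ts.day_name()

print(ns_since_epoch, as_np, is_datetime, year, weekday_name)
```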
