Notice: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/44600752/

Time: 2020-09-14 03:48:35  Source: igfitidea

Datetime in pandas dataframe will not subtract from each other

Tags: python, pandas, datetime, subtraction

Asked by Graham Streich

I am trying to find the time difference between two columns in a pandas dataframe, both of which are in datetime format.

Below is some of the data in my dataframe and the code I have been using. I have triple-checked that the dtype of both columns is datetime64.

My data:

date_updated                  date_scored 
2016-03-30 08:00:00.000       2016-03-30 08:00:57.416  
2016-04-07 23:50:00.000       2016-04-07 23:50:12.036 

My code:

data['date_updated'] = pd.to_datetime(data['date_updated'],
                                      format='%Y-%m-%d %H:%M:%S')
data['date_scored'] = pd.to_datetime(data['date_scored'],
                                     format='%Y-%m-%d %H:%M:%S')
data['Diff'] = data['date_updated'] - data['date_scored']

The error message I receive:

TypeError: data type "datetime" not understood

Any help would be appreciated, thanks!

My workaround:

# Here `data` is a plain list that collects one row per record
for i in raw_data[:10]:
    scored = pd.to_datetime(i.date_scored, format='%Y-%m-%d %H:%M:%S')
    # Skip records with no score date (to_datetime yields None or NaT for them)
    if pd.isna(scored):
        continue
    if scored.year >= 2016:
        extracted = pd.to_datetime(i.date_extracted, format='%Y-%m-%d %H:%M:%S')
        bank = i.bank.name
        # Subtract the parsed timestamps, not the raw attribute values
        diff = scored - extracted
        datum = [str(bank), str(extracted), str(scored), str(diff)]
        data.append(datum)

Answered by ac2001

I encountered the same error using the above syntax (worked on another machine though):

data['Diff'] =  data['date_updated'] - data['date_scored']

It worked on my new machine with:

data['Diff'] =  data['date_updated'].subtract(data['date_scored'])
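
For reference, `Series.subtract` is the method form of the `-` operator (an alias of `Series.sub`), so on a pandas version where both run they should return the same result. A minimal sketch, assuming a frame rebuilt by hand from the two sample rows in the question; the names `diff_operator` and `diff_method` are only illustrative:

```python
import pandas as pd

# Sample rows from the question, rebuilt by hand for illustration
data = pd.DataFrame({
    'date_updated': pd.to_datetime(['2016-03-30 08:00:00.000',
                                    '2016-04-07 23:50:00.000']),
    'date_scored': pd.to_datetime(['2016-03-30 08:00:57.416',
                                   '2016-04-07 23:50:12.036']),
})

# Operator form and method form of the same element-wise subtraction
diff_operator = data['date_updated'] - data['date_scored']
diff_method = data['date_updated'].subtract(data['date_scored'])

# True when both Series hold identical timedelta values
print(diff_operator.equals(diff_method))
```

If the operator form fails while the method form works, that points to an environment or version problem rather than to the data itself.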

Answered by Roberto Valerio

You need to update pandas. I just ran into the same issue with old code that used to run without problems. After updating pandas from 0.18.1 (0.18.1-np111py35_0) to a newer version (0.20.2-np113py35_0), the issue was resolved.
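
To see whether this applies to you, it may help to check which pandas version is installed before upgrading. The sketch below is only an illustration: `version_tuple` is a hypothetical helper, and the `(0, 20)` threshold is taken from the versions quoted in this answer, not from any documented fix.

```python
import pandas as pd

# Show the installed version so it can be compared with the ones quoted above
print(pd.__version__)


def version_tuple(version):
    """Hypothetical helper: return (major, minor) as integers."""
    return tuple(int(part) for part in version.split('.')[:2])


# True if the installed pandas is at least as new as the version that worked
print(version_tuple(pd.__version__) >= (0, 20))
```

How you upgrade depends on your package manager, for example `conda update pandas` or `pip install --upgrade pandas`.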

Answered by Romain

It works like a charm for me. You can even simplify your code, since to_datetime is smart enough to guess the format for you.

import io
import pandas as pd
# Paste the sample text into a triple-quoted string so the literal can span multiple lines
zz = """date_updated,date_scored
2016-03-30 08:00:00.000,       2016-03-30 08:00:57.416
2016-04-07 23:50:00.000,       2016-04-07 23:50:12.036"""

# skipinitialspace drops the padding that follows each comma
data = pd.read_csv(io.StringIO(zz), skipinitialspace=True)

data['date_updated'] = pd.to_datetime(data['date_updated'])
data['date_scored'] = pd.to_datetime(data['date_scored'])
data['Diff'] =  data['date_updated'] - data['date_scored']

print(data)
#          date_updated             date_scored                     Diff
# 0 2016-03-30 08:00:00 2016-03-30 08:00:57.416 -1 days +23:59:02.584000
# 1 2016-04-07 23:50:00 2016-04-07 23:50:12.036 -1 days +23:59:47.964000
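
The `-1 days +23:59:02.584000` values can look odd at first. They are how pandas prints a negative Timedelta: `date_scored` is later than `date_updated`, so the first difference is about minus 57.4 seconds. A small sketch, assuming the same two sample rows rebuilt by hand, that converts the result to signed seconds or to a positive duration:

```python
import pandas as pd

# Sample rows from the question, rebuilt by hand for illustration
data = pd.DataFrame({
    'date_updated': pd.to_datetime(['2016-03-30 08:00:00.000',
                                    '2016-04-07 23:50:00.000']),
    'date_scored': pd.to_datetime(['2016-03-30 08:00:57.416',
                                   '2016-04-07 23:50:12.036']),
})
data['Diff'] = data['date_updated'] - data['date_scored']

# Signed difference in seconds (negative because date_scored is the later time)
seconds = data['Diff'].dt.total_seconds()
print(seconds)

# Absolute value gives the elapsed time regardless of operand order
elapsed = data['Diff'].abs()
print(elapsed)
```

Swapping the operands (`data['date_scored'] - data['date_updated']`) is another way to get positive durations when the scored time is always the later one.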