Notice: this content comes from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me). Original StackOverflow source: http://stackoverflow.com/questions/51381290/

Date: 2020-09-14 05:48:48  Source: igfitidea

How to calculate the time difference between two pandas columns

python pandas dataframe data-analysis

Asked by pyd

My df looks like this:

    start               stop
0   2015-11-04 10:12:00 2015-11-06 06:38:00
1   2015-11-04 10:23:00 2015-11-05 08:30:00
2   2015-11-04 14:01:00 2015-11-17 10:34:00
4   2015-11-19 01:43:00 2015-12-21 09:04:00

print(df_time.dtypes)

start       datetime64[ns]
stop        datetime64[ns]
dtype: object

I am trying to find the time difference between stop and start.

I tried pd.Timedelta(df_time['stop'] - df_time['start']), but it gives TypeError: data type "datetime" not understood.

df_time['stop'] - df_time['start'] also gives the same error.

My expected output:

 2D,?H
 1D,?H
 ...
 ...

Answered by jezrael

You need to omit pd.Timedelta, because subtracting one datetime column from another already returns timedeltas:

df_time['td'] = df_time['stop'] - df_time['start']
print(df_time)
                start                stop               td
0 2015-11-04 10:12:00 2015-11-06 06:38:00  1 days 20:26:00
1 2015-11-04 10:23:00 2015-11-05 08:30:00  0 days 22:07:00
2 2015-11-04 14:01:00 2015-11-17 10:34:00 12 days 20:33:00

EDIT: Another solution is to subtract the underlying NumPy arrays:

df_time['td'] = df_time['stop'].values - df_time['start'].values
print(df_time)
                start                stop               td
0 2015-11-04 10:12:00 2015-11-06 06:38:00  1 days 20:26:00
1 2015-11-04 10:23:00 2015-11-05 08:30:00  0 days 22:07:00
2 2015-11-04 14:01:00 2015-11-17 10:34:00 12 days 20:33:00
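
The question asks for output in a "2D,?H" style rather than the default timedelta display. One possible way to get there (a sketch that is not part of the original answer) is to split each timedelta with the `.dt.components` accessor and join the days and hours as strings. The column name `td_label` is made up for illustration, and the sample frame is rebuilt from the first three rows shown above.

```python
import pandas as pd

# Rebuild the first three sample rows from the question
df_time = pd.DataFrame({
    "start": pd.to_datetime(["2015-11-04 10:12:00",
                             "2015-11-04 10:23:00",
                             "2015-11-04 14:01:00"]),
    "stop": pd.to_datetime(["2015-11-06 06:38:00",
                            "2015-11-05 08:30:00",
                            "2015-11-17 10:34:00"]),
})
df_time["td"] = df_time["stop"] - df_time["start"]

# .dt.components returns a DataFrame with days, hours, minutes, ... columns
parts = df_time["td"].dt.components

# 'td_label' is a hypothetical column name holding the "<days>D,<hours>H" text
df_time["td_label"] = (
    parts["days"].astype(str) + "D," + parts["hours"].astype(str) + "H"
)
print(df_time["td_label"].tolist())
# ['1D,20H', '0D,22H', '12D,20H']
```

Minutes and seconds are simply dropped here, so 1 days 20:26:00 becomes 1D,20H; rounding to the nearest hour would need an extra step.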

Answered by Treizh

First, make sure that your columns actually hold datetimes:

# Assign the converted column back directly so that its dtype becomes datetime64
data['start'] = pd.to_datetime(data['start'])
data['stop'] = pd.to_datetime(data['stop'])

Then subtract:

data['delta'] = data['stop'] - data['start']
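
Putting the two steps together, a minimal end-to-end sketch might look like the following. It assumes the timestamps arrive as plain strings, which the question does not state, and it adds a hypothetical `delta_hours` column to show the whole duration as a single number.

```python
import pandas as pd

# Assumed input: timestamps stored as strings (object dtype)
data = pd.DataFrame({
    "start": ["2015-11-04 10:12:00", "2015-11-19 01:43:00"],
    "stop": ["2015-11-06 06:38:00", "2015-12-21 09:04:00"],
})

# Step 1: convert both columns to datetimes
for col in ("start", "stop"):
    data[col] = pd.to_datetime(data[col])

# Step 2: subtracting datetime columns yields a timedelta column
data["delta"] = data["stop"] - data["start"]

# Optional: total duration in hours ('delta_hours' is an illustrative name)
data["delta_hours"] = data["delta"].dt.total_seconds() / 3600

print(data.dtypes)
print(data)
```

If some strings might not parse, `pd.to_datetime(..., errors="coerce")` turns them into NaT instead of raising, and the resulting difference for those rows is NaT as well.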