pandas: how to add hours to a pandas DataFrame column

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/35032135/

Time: 2020-09-14 00:34:20  Source: igfitidea

How to add hours to a pandas DataFrame column

python pandas datetime dataframe

Asked by Neil

I have a pandas DataFrame time column like the following.

 segments_data['time']
 Out[1585]: 
 0      04:50:00
 1      04:50:00
 2      05:00:00
 3      05:12:00
 4      06:04:00
 5      06:44:00
 6      06:44:00
 7      06:47:00
 8      06:47:00
 9      06:47:00

I want to add 5 hours and 30 minutes to the time column above. I am doing the following in Python.

pd.DatetimeIndex(segments_data['time']) + pd.DateOffset(hours=5,minutes=30)

But it gives me an error:

TypeError: object of type 'datetime.time' has no len()

Please help.

Accepted answer by EdChum

This is a gnarly way of doing it. The main problem here is the lack of vectorised support for time objects, so you first need to convert the time to a datetime by using combine, then apply the offset and get the time component back:

In [28]:
import datetime as dt
# This example adds 3 hours 30 minutes; change the timedelta arguments for a different offset.
df['new_time'] = df['time'].apply(lambda x: (dt.datetime.combine(dt.datetime(1,1,1), x) + dt.timedelta(hours=3,minutes=30)).time())
df

Out[28]:
           time  new_time
index                    
0      04:50:00  08:20:00
1      04:50:00  08:20:00
2      05:00:00  08:30:00
3      05:12:00  08:42:00
4      06:04:00  09:34:00
5      06:44:00  10:14:00
6      06:44:00  10:14:00
7      06:47:00  10:17:00
8      06:47:00  10:17:00
9      06:47:00  10:17:00
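
For reference, the same combine-then-offset idea can be written as a small self-contained sketch using the 5 hours 30 minutes from the question. The sample data, the helper name shift_time, and the anchor date are assumptions made for illustration; they are not part of the original answer.

```python
import datetime as dt
import pandas as pd

# Hypothetical sample data: a column of datetime.time objects, as in the question.
df = pd.DataFrame({"time": [dt.time(4, 50), dt.time(5, 12), dt.time(23, 0)]})

def shift_time(t, delta):
    """Add a timedelta to a datetime.time by anchoring it to an arbitrary date."""
    anchored = dt.datetime.combine(dt.date(2000, 1, 1), t)  # the date itself is irrelevant
    return (anchored + delta).time()  # keep only the time of day

offset = dt.timedelta(hours=5, minutes=30)
df["new_time"] = df["time"].apply(lambda t: shift_time(t, offset))
print(df)
```

Because only the time component is kept, a value that crosses midnight wraps around: 23:00 becomes 04:30 and the day change is lost.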

Answer by Fabio Lamanna

You can try importing timedelta:

from datetime import datetime, timedelta

and then:

segments_data['time'] = pd.DatetimeIndex(segments_data['time']) + timedelta(hours=5,minutes=30)
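
If the column holds datetime.time objects, as in the question, passing it straight to pd.DatetimeIndex may raise the same TypeError. One possible variation, sketched here under the assumption that the times have whole seconds, is to go through strings first and convert back to time objects afterwards. The sample data is hypothetical.

```python
import datetime as dt
import pandas as pd

# Hypothetical sample data: datetime.time objects with whole seconds.
segments_data = pd.DataFrame({"time": [dt.time(4, 50), dt.time(5, 0), dt.time(6, 47)]})

# str(datetime.time) gives "HH:MM:SS" here, which to_datetime can parse with an explicit format.
as_datetime = pd.to_datetime(segments_data["time"].astype(str), format="%H:%M:%S")

# Adding a standard-library timedelta to a datetime64 Series is vectorised.
shifted = as_datetime + dt.timedelta(hours=5, minutes=30)

# Convert back to datetime.time objects, dropping the placeholder date.
segments_data["time"] = shifted.dt.time
print(segments_data)
```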

Answer by jpp

Pandas does not support vectorised operations with datetime.time objects. For efficient, vectorised operations, there is no requirement to use the datetime module from the standard library.

You have a couple of options to vectorise your calculation. Either use a Pandas timedelta series, if your times represent a duration, or use a Pandas datetime series, if your times represent specific points in time.

The choice depends entirely on what your data represents.

timedelta series

df['time'] = pd.to_timedelta(df['time'].astype(str)) + pd.to_timedelta('05:30:00')

print(df['time'].head())

0   10:20:00
1   10:20:00
2   10:30:00
3   10:42:00
4   11:34:00
Name: 1, dtype: timedelta64[ns]
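
One thing to keep in mind with the timedelta approach is that a duration can exceed 24 hours. The sketch below, using hypothetical sample data, shows this and one possible way to wrap the result back into a single day if that is what the data calls for.

```python
import datetime as dt
import pandas as pd

# Hypothetical sample data: datetime.time objects, one of them late in the evening.
df = pd.DataFrame({"time": [dt.time(4, 50), dt.time(20, 0)]})

# Convert to timedelta via strings, then add the offset in a vectorised way.
shifted = pd.to_timedelta(df["time"].astype(str)) + pd.Timedelta(hours=5, minutes=30)
print(shifted)  # the second value is longer than one day

# Optional: drop the whole-day part so every value stays below 24 hours.
wrapped = shifted - pd.to_timedelta(shifted.dt.days, unit="D")
print(wrapped)
```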

datetime series

df['time'] = pd.to_datetime(df['time'].astype(str)) + pd.DateOffset(hours=5, minutes=30)

print(df['time'].head())

0   2018-12-24 10:20:00
1   2018-12-24 10:20:00
2   2018-12-24 10:30:00
3   2018-12-24 10:42:00
4   2018-12-24 11:34:00
Name: 1, dtype: datetime64[ns]

Note that the current date is assumed by default.
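
If relying on the current date is undesirable, a possible alternative is to prepend an explicit date before parsing. In this sketch the anchor date, the sample data, and the new column names are assumptions for illustration.

```python
import datetime as dt
import pandas as pd

# Hypothetical sample data: datetime.time objects with whole seconds.
df = pd.DataFrame({"time": [dt.time(4, 50), dt.time(23, 0)]})

anchor_date = "2018-12-24"  # assumed date; use whatever date the times belong to

# Build "YYYY-MM-DD HH:MM:SS" strings and parse them with an explicit format.
stamps = pd.to_datetime(
    anchor_date + " " + df["time"].astype(str),
    format="%Y-%m-%d %H:%M:%S",
)

df["timestamp"] = stamps + pd.DateOffset(hours=5, minutes=30)
df["time_of_day"] = df["timestamp"].dt.time  # back to datetime.time objects if needed
print(df)
```

Keeping the full timestamp preserves the day change when an addition crosses midnight, which the time-only column cannot express.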

Answer by Tom Wattley

As of pandas '0.25.3', this is as simple as:

df[column] = df[column] + pd.Timedelta(hours=1)
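
This one-liner presumably assumes that the column already has a datetime64 or timedelta64 dtype; with plain datetime.time objects the addition would likely fail. A minimal sketch under that assumption, with hypothetical sample data:

```python
import pandas as pd

# Hypothetical sample data: a column that already has datetime64 dtype.
df = pd.DataFrame(
    {"time": pd.to_datetime(["2020-01-01 04:50:00", "2020-01-01 23:30:00"])}
)
column = "time"

# Vectorised addition of a fixed offset to every row.
df[column] = df[column] + pd.Timedelta(hours=1)
print(df)
```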