Python Pandas - convert strings to time without date

Note: this page is a Chinese-English parallel translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me), citing the original address: http://stackoverflow.com/questions/32375471/

Date: 2020-08-19 11:27:45  Source: igfitidea

Pandas - convert strings to time without date

python pandas

Asked by RDJ

I've read loads of SO answers but can't find a clear solution.

I have this data in a df called day1, which represents hours:

1    10:53
2    12:17
3    14:46
4    16:36
5    18:39
6    20:31
7    22:28
Name: time, dtype: object>

I want to convert it into a time format. But when I do this:

day1.time = pd.to_datetime(day1.time, format='H%:M%')

The result includes today's date:

1   2015-09-03 10:53:00
2   2015-09-03 12:17:00
3   2015-09-03 14:46:00
4   2015-09-03 16:36:00
5   2015-09-03 18:39:00
6   2015-09-03 20:31:00
7   2015-09-03 22:28:00
Name: time, dtype: datetime64[ns]>

It seems the format argument isn't working. How do I get the time as shown here, without the date?



Update

The following formats the time correctly, but somehow the column is still an object type. Why doesn't it convert to datetime64?

day1['time'] = pd.to_datetime(day1['time'], format='%H:%M').dt.time

1    10:53:00
2    12:17:00
3    14:46:00
4    16:36:00
5    18:39:00
6    20:31:00
7    22:28:00
Name: time, dtype: object>

Accepted answer by EdChum

After performing the conversion, you can use the datetime accessor dt to access just the hour or time component:

In [51]:

df['hour'] = pd.to_datetime(df['time'], format='%H:%M').dt.hour
df
Out[51]:
        time  hour
index             
1      10:53    10
2      12:17    12
3      14:46    14
4      16:36    16
5      18:39    18
6      20:31    20
7      22:28    22

Also, your format string H%:M% is malformed (the % must come before each directive letter, as in %H:%M); it's likely to raise a ValueError: ':' is a bad directive in format 'H%:M%'
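To see where that error message comes from, here is a minimal sketch using the standard library's datetime.strptime rather than pandas itself (how pandas reacts to a bad format has varied between versions, so this only illustrates the directive syntax). The helper name parse_clock is made up for this example.

```python
from datetime import datetime


def parse_clock(text, fmt):
    """Parse text with fmt; return a datetime.time, or the error text on failure."""
    try:
        return datetime.strptime(text, fmt).time()
    except ValueError as exc:
        return f"ValueError: {exc}"


# Percent signs placed after the letters: not a valid directive sequence.
bad = parse_clock("10:53", "H%:M%")
# Percent signs placed before the letters: hour and minute directives.
good = parse_clock("10:53", "%H:%M")

print(bad)
print(good)
```

The first call ends in a ValueError, while the second returns a time object for 10:53.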

Regarding your last comment: the elements are datetime.time objects, not datetime values. pandas has no native dtype for datetime.time, so the column is stored with dtype object:

In [53]:
df['time'].iloc[0]

Out[53]:
datetime.time(10, 53)
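The same check can be reproduced end to end with a small sketch. The frame below is a shortened stand-in for the asker's day1, not the original data.

```python
import datetime

import pandas as pd

# Shortened stand-in for the asker's frame (assumed sample values).
day1 = pd.DataFrame({"time": ["10:53", "12:17", "14:46"]})

parsed = pd.to_datetime(day1["time"], format="%H:%M")  # datetime64 column
times = parsed.dt.time                                 # Python time objects

print(parsed.dtype)          # a datetime64 dtype
print(times.dtype)           # object
print(type(times.iloc[0]))   # datetime.time
```

So the column keeps a datetime64 dtype only as long as the (placeholder) date is kept; once it is reduced to datetime.time values, pandas falls back to object.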

Answer by YOBEN_S

You can use to_timedelta

pd.to_timedelta(df+':00')
Out[353]: 
1   10:53:00
2   12:17:00
3   14:46:00
4   16:36:00
5   18:39:00
6   20:31:00
7   22:28:00
Name: Time, dtype: timedelta64[ns]
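A self-contained version of this approach might look like the sketch below. It assumes the answer's df is really a Series of "HH:MM" strings (the output above has a Name, which suggests so); the sample values and variable names are illustrative only.

```python
import pandas as pd

# Assumed input: a Series of "HH:MM" strings.
times = pd.Series(["10:53", "12:17", "14:46"], name="time")

# Append ":00" so every string has the "HH:MM:SS" shape, then convert.
deltas = pd.to_timedelta(times + ":00")

# timedelta64 values support comparison and arithmetic directly.
late = deltas[deltas > pd.Timedelta(hours=12)]
gaps = deltas.diff()

print(deltas)
print(late)
print(gaps)
```

Because the result is a timedelta64 column rather than object, filtering and differences work without any per-element Python code.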

Answer by Bowen Liu

I recently also struggled with this problem. My method is close to EdChum's method and the result is the same as YOBEN_S's answer.

Just as EdChum illustrated, using dt.time will give you datetime.time objects (dt.hour gives plain integers), which are probably only good for display. I can barely do any comparison or calculation on these objects. So if you need any further comparison or calculation on the result column, it's better to avoid this data format.
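The limitation can be shown with the standard library alone. In this small sketch (values chosen for illustration), ordering two time objects works, but subtracting them does not; one stdlib workaround is to attach an arbitrary date first.

```python
from datetime import date, datetime, time

start, end = time(10, 53), time(12, 17)

# Ordering comparisons between two time objects are supported.
print(start < end)

# Arithmetic is not: time objects do not define subtraction.
try:
    end - start
except TypeError as exc:
    print(f"TypeError: {exc}")

# Workaround: combine each time with the same arbitrary date, then subtract.
gap = datetime.combine(date.min, end) - datetime.combine(date.min, start)
print(gap)
```

This is essentially what the date-subtraction method does for a whole column at once.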

My method is simply to subtract the date from the to_datetime result:

c = pd.Series(['10:23', '12:17', '14:46'])
pd.to_datetime(c, format='%H:%M') - pd.to_datetime(c, format='%H:%M').dt.normalize()

The result is

0   10:23:00
1   12:17:00
2   14:46:00
dtype: timedelta64[ns]

dt.normalize() basically sets the time component of every value to 00:00:00, so only the date is left while the datetime64 data format is kept. Subtracting it from the original datetimes leaves just the time of day as a timedelta64 column, which makes it possible to do calculations with it.
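Broken into steps, the subtraction and a couple of follow-up calculations might look like this sketch. The intermediate names stamps, midnight and elapsed are introduced here for readability and are not part of the answer.

```python
import pandas as pd

c = pd.Series(["10:23", "12:17", "14:46"])

stamps = pd.to_datetime(c, format="%H:%M")  # datetimes with a placeholder date
midnight = stamps.dt.normalize()            # same dates, time reset to 00:00:00
elapsed = stamps - midnight                 # time of day as timedelta64

# The result supports ordinary calculations and comparisons.
total = elapsed.sum()
after_noon = elapsed > pd.Timedelta(hours=12)

print(elapsed)
print(total)
print(after_noon)
```

Parsing once and reusing stamps also avoids calling to_datetime twice, as the one-liner does.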

My answer is by no means better than the other two. I just want to provide a different approach and hope it helps.
