python pandas datetime.time - datetime.time

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/22093962/

Time: 2020-09-13 21:45:15  Source: igfitidea

Tags: python, datetime, pandas

Asked by James Bond

I have a DataFrame that contains two columns of datetime.time items, something like:

   col1                 col2
02:10:00.008209    02:08:38.053145
02:10:00.567054    02:08:38.053145
02:10:00.609842    02:08:38.053145
02:10:00.728153    02:08:38.053145
02:10:02.394408    02:08:38.053145

How can I generate a col3 that holds the difference between col1 and col2 (preferably in microseconds)?

I searched around but could not find a solution. Does anyone know?

Thanks!

Answered by HYRY

Don't use datetime.time; use timedelta:

import io
import pandas as pd

data = """col1                 col2
02:10:00.008209    02:08:38.053145
02:10:00.567054    02:08:38.053145
02:10:00.609842    02:08:38.053145
02:10:00.728153    02:08:38.053145
02:10:02.394408    02:08:38.053145"""
# Parse the whitespace-separated sample (io.StringIO wraps a str on Python 3)
df = pd.read_csv(io.StringIO(data), sep=r"\s+")
# Convert every column of time strings to timedelta64
df2 = df.apply(pd.to_timedelta)
diff = df2.col1 - df2.col2

# Difference in seconds, as floats
diff.dt.total_seconds()

The output is the difference in seconds:

0    81.955064
1    82.513909
2    82.556697
3    82.675008
4    84.341263
dtype: float64

To convert a DataFrame of datetime.time objects to a DataFrame of timedeltas:

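from datetime import time
# On pandas >= 2.1, DataFrame.map replaces the deprecated DataFrame.applymap
df.applymap(time.isoformat).apply(pd.to_timedelta)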
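
Putting the two steps together, here is a minimal end-to-end sketch that starts from actual datetime.time objects and ends with an integer col3 in microseconds, as the question asked. The two-row sample frame and the name td are made up for illustration, and Series.map is used instead of applymap only to sidestep the deprecation on recent pandas.

```python
from datetime import time

import pandas as pd

# Hypothetical sample: two columns of datetime.time objects (first two rows of the question)
df = pd.DataFrame({
    "col1": [time(2, 10, 0, 8209), time(2, 10, 0, 567054)],
    "col2": [time(2, 8, 38, 53145), time(2, 8, 38, 53145)],
})

# time -> ISO string ("02:10:00.008209") -> timedelta, one column at a time
td = df.apply(lambda col: pd.to_timedelta(col.map(time.isoformat)))

# Floor-dividing by a one-microsecond Timedelta yields whole microseconds as integers
df["col3"] = (td["col1"] - td["col2"]) // pd.Timedelta(microseconds=1)
print(df["col3"].tolist())
```

Dividing by a Timedelta keeps the unit explicit, so the result does not depend on the resolution pandas uses internally.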

Answered by unutbu

Are you sure you want a DataFrame of datetime.time objects? There is hardly any operation you can perform conveniently on them, especially when they are wrapped in a DataFrame.
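
To illustrate the inconvenience, the following sketch shows that plain datetime.time objects do not support subtraction, and one common standard-library workaround: attaching an arbitrary date so that ordinary datetime arithmetic applies. The variable names and the anchor date are arbitrary choices for this example.

```python
from datetime import date, datetime, time, timedelta

a = time(2, 10, 0, 8209)
b = time(2, 8, 38, 53145)

# datetime.time defines no subtraction, so this raises TypeError
try:
    a - b
    supported = True
except TypeError:
    supported = False

# Workaround: combine both times with the same (arbitrary) date, then subtract
anchor = date(2000, 1, 1)
delta = datetime.combine(anchor, a) - datetime.combine(anchor, b)

# timedelta // timedelta gives an int: here, whole microseconds
micros = delta // timedelta(microseconds=1)
print(supported, micros)
```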

It might be better to have each column store an int representing the total number of microseconds.

You can convert df to a DataFrame storing microseconds like this:

In [71]: df2 = df.applymap(lambda x: ((x.hour*60+x.minute)*60+x.second)*10**6+x.microsecond)

In [72]: df2
Out[72]: 
         col1        col2
0  7800008209  7718053145
1  7800567054  7718053145

And from there, it is easy to get the result you desire:

In [73]: df2['col1']-df2['col2']
Out[73]: 
0    81955064
1    82513909
dtype: int64
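
The same conversion can be written as a small named function, which is easier to test than an inline lambda. This is only a sketch: time_to_microseconds is a hypothetical helper name, the sample frame is made up, and any tzinfo on the time objects is ignored.

```python
from datetime import time

import pandas as pd


def time_to_microseconds(t: time) -> int:
    """Microseconds elapsed since midnight for a datetime.time value."""
    return ((t.hour * 60 + t.minute) * 60 + t.second) * 10**6 + t.microsecond


# Hypothetical sample matching the first two rows of the question
df = pd.DataFrame({
    "col1": [time(2, 10, 0, 8209), time(2, 10, 0, 567054)],
    "col2": [time(2, 8, 38, 53145), time(2, 8, 38, 53145)],
})

# Apply the helper element-wise, column by column
us = df.apply(lambda col: col.map(time_to_microseconds))
df["col3"] = us["col1"] - us["col2"]
print(df["col3"].tolist())
```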

Answered by Acorbe

pandas converts datetime objects to np.datetime64 objects, whose differences are np.timedelta64 objects.

Consider this:

In [30]: df
Out[30]: 
                       0                          1
0 2014-02-28 13:30:19.926778 2014-02-28 13:30:47.178474
1 2014-02-28 13:30:29.814575 2014-02-28 13:30:51.183349

I can compute the column-wise difference with

 df[0] - df[1]


 Out[31]: 
 0   -00:00:27.251696
 1   -00:00:21.368774
 dtype: timedelta64[ns]

and hence I can apply timedelta64 conversions. For microseconds:

(df[0] - df[1]).apply(lambda x : x.astype('timedelta64[us]')) #no actual difference when displayed

or as integers (note that the values printed below are in nanoseconds, not microseconds: 27.251696 s is 27251696000 ns):

(df[0] - df[1]).apply(lambda x : x.astype('timedelta64[us]').astype('int'))

 0   -27251696000
 1   -21368774000
 dtype: int64

EDIT: As suggested by @Jeff, the last expressions can be shortened to

(df[0] - df[1]).astype('timedelta64[us]')

and

(df[0] - df[1]).astype('timedelta64[us]').astype('int')

for pandas >= 0.13.
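
Because the result of astype on a timedelta Series has varied between pandas versions, one way to get unambiguous integer microseconds is to do the unit conversion at the NumPy level. The sketch below rebuilds the two-row frame shown above from strings (an assumption about how it was created) and is not the answer author's original code.

```python
import pandas as pd

# Rebuild the sample frame with integer column labels 0 and 1
df = pd.DataFrame({
    0: pd.to_datetime(["2014-02-28 13:30:19.926778", "2014-02-28 13:30:29.814575"]),
    1: pd.to_datetime(["2014-02-28 13:30:47.178474", "2014-02-28 13:30:51.183349"]),
})

diff = df[0] - df[1]  # timedelta64 Series

# Drop to a NumPy timedelta64 array, change its unit to microseconds, then cast to int64
micros = pd.Series(
    diff.to_numpy().astype("timedelta64[us]").astype("int64"),
    index=diff.index,
)
print(micros.tolist())
```

Here the unit is fixed by the explicit timedelta64[us] cast on the NumPy array, so the integers are microseconds regardless of the resolution of the pandas Series.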