Microsecond difference between two datetime.time columns in python pandas?
Original source: http://stackoverflow.com/questions/22513306/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Microsecond difference between two datetime.time columns in python pandas?
Asked by James Bond
I have a python pandas data frame, which contains 2 columns: time1 and time2:
     time1             time2
13:00:07.294234    13:00:07.294234 
14:00:07.294234    14:00:07.394234 
15:00:07.294234    15:00:07.494234 
16:00:07.294234    16:00:07.694234 
How can I generate a third column which contains the microsecond difference between time1 and time2, as an integer if possible?
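To experiment with the answers, the frame from the question could be rebuilt roughly as follows. This is only a sketch: it assumes the columns hold datetime.time objects, as the title suggests. Some answers treat the values as strings instead, which `astype(str)` would produce.

```python
import datetime

import pandas as pd

# Sample values copied from the question, stored as datetime.time objects
# (an assumption; the question does not show how the frame was created).
df = pd.DataFrame({
    "time1": [datetime.time(13, 0, 7, 294234), datetime.time(14, 0, 7, 294234),
              datetime.time(15, 0, 7, 294234), datetime.time(16, 0, 7, 294234)],
    "time2": [datetime.time(13, 0, 7, 294234), datetime.time(14, 0, 7, 394234),
              datetime.time(15, 0, 7, 494234), datetime.time(16, 0, 7, 694234)],
})

# String form of the same data, which is what several answers operate on.
df_str = df.astype(str)
print(df_str)
```

Two datetime.time objects cannot be subtracted from each other directly, which is why each answer first converts the columns to datetimes or timedeltas.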
Answered by Andy Hayden
If you prepend these with an actual date you can convert them to datetime64 columns:
In [11]: '2014-03-19 ' + df
Out[11]: 
                        time1                       time2
0  2014-03-19 13:00:07.294234  2014-03-19 13:00:07.294234
1  2014-03-19 14:00:07.294234  2014-03-19 14:00:07.394234
2  2014-03-19 15:00:07.294234  2014-03-19 15:00:07.494234
3  2014-03-19 16:00:07.294234  2014-03-19 16:00:07.694234
[4 rows x 2 columns]
In [12]: df = ('2014-03-19 ' + df).astype('datetime64[ns]')
Out[12]: 
                       time1                      time2
0 2014-03-19 20:00:07.294234 2014-03-19 20:00:07.294234
1 2014-03-19 21:00:07.294234 2014-03-19 21:00:07.394234
2 2014-03-19 22:00:07.294234 2014-03-19 22:00:07.494234
3 2014-03-19 23:00:07.294234 2014-03-19 23:00:07.694234
Now you can subtract these columns:
In [13]: delta = df['time2'] - df['time1']
In [14]: delta
Out[14]: 
0          00:00:00
1   00:00:00.100000
2   00:00:00.200000
3   00:00:00.400000
dtype: timedelta64[ns]
To get the number of microseconds, just divide the underlying nanoseconds by 1000:
In [15]: delta.astype(np.int64) // 10**3
Out[15]: 
0         0
1    100000
2    200000
3    400000
dtype: int64
As Jeff points out, on recent versions of numpy you can divide by one microsecond:
In [16]: delta / np.timedelta64(1,'us')
Out[16]: 
0         0
1    100000
2    200000
3    400000
dtype: float64
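Put together as a script, the steps in this answer might look like the sketch below. It assumes the columns hold time-of-day strings, and it swaps in `pd.to_datetime` and floor division by a `pd.Timedelta` in place of the `astype` calls above, so that the result comes out as integers. The date `2014-03-19` is arbitrary; any date works as long as both columns use the same one.

```python
import pandas as pd

# Assumption: the columns hold time-of-day strings, as in this answer.
df = pd.DataFrame({
    "time1": ["13:00:07.294234", "14:00:07.294234",
              "15:00:07.294234", "16:00:07.294234"],
    "time2": ["13:00:07.294234", "14:00:07.394234",
              "15:00:07.494234", "16:00:07.694234"],
})

# Prepend the same arbitrary date to both columns, then parse to datetimes.
base_date = "2014-03-19 "
t1 = pd.to_datetime(base_date + df["time1"])
t2 = pd.to_datetime(base_date + df["time2"])

# Subtracting two datetime Series gives a timedelta Series.
delta = t2 - t1

# Floor-dividing by a one-microsecond Timedelta gives whole microseconds.
df["microsecond_delta"] = delta // pd.Timedelta(microseconds=1)
print(df)
```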
Answered by acushner
The easiest way is just to do this:
(pd.to_datetime(df['time2']) - pd.to_datetime(df['time1'])) / np.timedelta64(1, 'us')
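When the columns are strings, one possible variation is to pass an explicit `format` so that pandas does not have to guess how to parse a time without a date. This is a sketch under the assumption that every value has a fractional-second part; the `%f` directive would not match values such as `13:00:07`.

```python
import pandas as pd

# Assumption: time-of-day strings that all include microseconds.
df = pd.DataFrame({
    "time1": ["13:00:07.294234", "14:00:07.294234",
              "15:00:07.294234", "16:00:07.294234"],
    "time2": ["13:00:07.294234", "14:00:07.394234",
              "15:00:07.494234", "16:00:07.694234"],
})

# With a time-only format both columns get the same placeholder date,
# so the date part cancels out in the subtraction.
fmt = "%H:%M:%S.%f"
t1 = pd.to_datetime(df["time1"], format=fmt)
t2 = pd.to_datetime(df["time2"], format=fmt)

# True division returns floats; see the note below for integers.
micros = (t2 - t1) / pd.Timedelta(microseconds=1)
print(micros)
```

True division yields a float64 Series. If integers are needed, floor division (`//`) by the same `pd.Timedelta` is one option.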
Answered by firelynx
At first I thought there were no correct answers here because none had a green tick. But as pointed out by Jeff in the comments, I was wrong.
Either way, here is my contribution.
First, the obvious step: converting the datetime.time into a timedelta:
df['delta'] = (pd.to_timedelta(df.time2.astype(str)) - pd.to_timedelta(df.time1.astype(str)))
             time1            time2           delta
0  13:00:07.294234  13:00:07.294234        00:00:00
1  14:00:07.294234  14:00:07.394234 00:00:00.100000
2  15:00:07.294234  15:00:07.494234 00:00:00.200000
3  16:00:07.294234  16:00:07.694234 00:00:00.400000
Now that we have the timedelta, we can simply divide it by one microsecond to get the number of microseconds.
df['microsecond_delta'] = df.delta / np.timedelta64(1, 'us')
             time1            time2           delta  microsecond_delta
0  13:00:07.294234  13:00:07.294234        00:00:00                  0
1  14:00:07.294234  14:00:07.394234 00:00:00.100000             100000
2  15:00:07.294234  15:00:07.494234 00:00:00.200000             200000
3  16:00:07.294234  16:00:07.694234 00:00:00.400000             400000
I have to add that this is very counterintuitive, but it seems to be the only way. There seems to be no way of accessing the microseconds directly. I tried applying lambda functions like:
df.delta.apply(lambda x: x.microseconds)
AttributeError: 'numpy.timedelta64' object has no attribute 'microseconds'
The same is true for seconds, nanoseconds, milliseconds and so on...
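In pandas versions newer than the one this answer was written against, a timedelta Series has a `.dt` accessor that exposes these fields. Note that `.dt.microseconds` is only the microsecond component of each value, not the total duration in microseconds. The sketch below uses made-up values, including one longer than a second, to show the difference.

```python
import pandas as pd

# Illustrative values only; the last one exceeds one second on purpose.
delta = pd.Series(pd.to_timedelta(
    ["00:00:00", "00:00:00.100000", "00:00:01.400000"]
))

# Component view: the microsecond field of each value (0 to 999999).
component = delta.dt.microseconds

# Total view: the whole duration expressed in microseconds.
total = delta // pd.Timedelta(microseconds=1)

print(component.tolist())
print(total.tolist())
```

For differences under one second, as in the question, the two views agree. For anything longer, only the total is the "microsecond difference" that was asked for.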
Answered by PeterPanda
Using dateutil you could transform your timestamp columns to 'real' timestamps:
import dateutil.parser
df.time1 = df.time1.apply(dateutil.parser.parse)
df.time2 = df.time2.apply(dateutil.parser.parse)
After that, you can define a new column like this:
df['delta'] = df.time2 - df.time1
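A fuller sketch of this approach, carried through to integer microseconds, might look like the following. It assumes string columns. The helper `parse_on_base` and the base date are hypothetical additions: when a string has no date, `dateutil.parser.parse` fills in today's date unless `default` is given, so fixing the date keeps both columns on the same day.

```python
import datetime

import dateutil.parser
import pandas as pd

# Assumption: the columns hold time-of-day strings.
df = pd.DataFrame({
    "time1": ["13:00:07.294234", "14:00:07.294234",
              "15:00:07.294234", "16:00:07.294234"],
    "time2": ["13:00:07.294234", "14:00:07.394234",
              "15:00:07.494234", "16:00:07.694234"],
})

# Hypothetical fixed base date shared by both columns.
base = datetime.datetime(2014, 3, 19)


def parse_on_base(text):
    """Parse a time-of-day string, taking the missing date from `base`."""
    return dateutil.parser.parse(text, default=base)


# pd.to_datetime makes sure the parsed values end up as a datetime Series.
t1 = pd.to_datetime(df["time1"].apply(parse_on_base))
t2 = pd.to_datetime(df["time2"].apply(parse_on_base))

df["delta"] = t2 - t1
df["microsecond_delta"] = df["delta"] // pd.Timedelta(microseconds=1)
print(df)
```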

