Disclaimer: this page is a Chinese-English translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/22513306/

Time: 2020-09-13 21:49:36  Source: igfitidea

Microsecond difference between two datetime.time columns in python pandas?

python, pandas

Asked by James Bond

I have a python pandas data frame which contains 2 columns, time1 and time2:

     time1             time2
13:00:07.294234    13:00:07.294234 
14:00:07.294234    14:00:07.394234 
15:00:07.294234    15:00:07.494234 
16:00:07.294234    16:00:07.694234 

How can I generate a third column which contains the microsecond difference between time1 and time2, as an integer if possible?

Answered by Andy Hayden

If you prepend these with an actual date, you can convert them to datetime64 columns:

In [11]: '2014-03-19 ' + df
Out[11]: 
                        time1                       time2
0  2014-03-19 13:00:07.294234  2014-03-19 13:00:07.294234
1  2014-03-19 14:00:07.294234  2014-03-19 14:00:07.394234
2  2014-03-19 15:00:07.294234  2014-03-19 15:00:07.494234
3  2014-03-19 16:00:07.294234  2014-03-19 16:00:07.694234

[4 rows x 2 columns]

In [12]: df = ('2014-03-19 ' + df).astype('datetime64[ns]')
Out[12]: 
                       time1                      time2
0 2014-03-19 13:00:07.294234 2014-03-19 13:00:07.294234
1 2014-03-19 14:00:07.294234 2014-03-19 14:00:07.394234
2 2014-03-19 15:00:07.294234 2014-03-19 15:00:07.494234
3 2014-03-19 16:00:07.294234 2014-03-19 16:00:07.694234

Now you can subtract these columns:

In [13]: delta = df['time2'] - df['time1']

In [14]: delta
Out[14]: 
0          00:00:00
1   00:00:00.100000
2   00:00:00.200000
3   00:00:00.400000
dtype: timedelta64[ns]

To get the number of microseconds, just divide the underlying nanoseconds by 1000:

In [15]: delta.astype(np.int64) // 10**3
Out[15]: 
0         0
1    100000
2    200000
3    400000
dtype: int64

As Jeff points out, on recent versions of numpy you can divide by 1 microsecond:

In [16]: delta / np.timedelta64(1, 'us')
Out[16]: 
0         0
1    100000
2    200000
3    400000
dtype: float64
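
The steps in this answer can be put together as one runnable sketch. It assumes the two columns hold strings (not datetime.time objects) and uses the question's sample data; the column name diff_us and the choice of pd.to_datetime for parsing are illustrative, not part of the original answer. Floor division by a one-microsecond Timedelta is used here so the result is an integer column rather than float64.

```python
import pandas as pd

# Sample data from the question, held as strings (an assumption).
df = pd.DataFrame({
    "time1": ["13:00:07.294234", "14:00:07.294234",
              "15:00:07.294234", "16:00:07.294234"],
    "time2": ["13:00:07.294234", "14:00:07.394234",
              "15:00:07.494234", "16:00:07.694234"],
})

# Prepend an arbitrary date so each string parses as a full timestamp.
stamped = ("2014-03-19 " + df).apply(pd.to_datetime)

# Subtracting two datetime64 columns yields a timedelta64 column.
delta = stamped["time2"] - stamped["time1"]

# Floor-divide by one microsecond to get whole microseconds as integers.
# "diff_us" is a hypothetical column name chosen for this sketch.
df["diff_us"] = delta // pd.Timedelta(microseconds=1)

print(df["diff_us"].tolist())
```

The date itself does not matter, since it cancels out in the subtraction; it only needs to be the same for both columns.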

Answered by acushner

The easiest way is just to do this:

(pd.to_datetime(df['time2']) - pd.to_datetime(df['time1'])) / np.timedelta64(1, 'us')
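
A self-contained version of this one-liner might look like the sketch below. It assumes the columns hold strings and passes an explicit format to pd.to_datetime, which is an addition of this sketch rather than part of the answer; with that format the parsed values all share the same placeholder date, so the date cancels out in the subtraction. The names fmt and diff_us are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question, held as strings (an assumption).
df = pd.DataFrame({
    "time1": ["13:00:07.294234", "14:00:07.294234",
              "15:00:07.294234", "16:00:07.294234"],
    "time2": ["13:00:07.294234", "14:00:07.394234",
              "15:00:07.494234", "16:00:07.694234"],
})

# Explicit format: hours, minutes, seconds, fractional seconds.
fmt = "%H:%M:%S.%f"
t1 = pd.to_datetime(df["time1"], format=fmt)
t2 = pd.to_datetime(df["time2"], format=fmt)

# True division by one microsecond gives a float64 column.
diff_us = (t2 - t1) / np.timedelta64(1, "us")

print(diff_us)
```

If the columns hold datetime.time objects instead of strings, they would need converting first, for example with astype(str).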

Answered by firelynx

At first I thought there were no correct answers here, because none had a green tick. But as Jeff pointed out in the comments, I was wrong.

Either way, here is my contribution.

First, the obvious step: turn the datetime.time values into timedelta values.

df['delta'] = (pd.to_timedelta(df.time2.astype(str)) - pd.to_timedelta(df.time1.astype(str)))

             time1            time2           delta
0  13:00:07.294234  13:00:07.294234        00:00:00
1  14:00:07.294234  14:00:07.394234 00:00:00.100000
2  15:00:07.294234  15:00:07.494234 00:00:00.200000
3  16:00:07.294234  16:00:07.694234 00:00:00.400000

Now that we have the timedelta, we can simply divide it by one microsecond to get the number of microseconds.

df['microsecond_delta'] = df.delta / np.timedelta64(1, 'us')

             time1            time2           delta  microsecond_delta
0  13:00:07.294234  13:00:07.294234        00:00:00                  0
1  14:00:07.294234  14:00:07.394234 00:00:00.100000             100000
2  15:00:07.294234  15:00:07.494234 00:00:00.200000             200000
3  16:00:07.294234  16:00:07.694234 00:00:00.400000             400000

I have to add that this is very counterintuitive, but it seems to be the only way. There seems to be no way of accessing the microseconds directly. I tried applying lambda functions like:

df.delta.apply(lambda x: x.microseconds)
AttributeError: 'numpy.timedelta64' object has no attribute 'microseconds'

The same is true for seconds, nanoseconds, milliseconds and so on...
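
For readers on a more recent pandas, the sketch below shows the same to_timedelta conversion starting from real datetime.time objects, followed by two ways of reading microseconds that, as far as I know, are available through the Series .dt accessor. This is an addition to the answer, not the author's method. The second row deliberately differs from the question's data (the gap is 1.1 seconds) to show that .dt.microseconds returns only the microsecond component, not the whole difference.

```python
import datetime as dt

import pandas as pd

# datetime.time objects, as in the question title. The second row uses a
# 1.1 second gap (not from the question) to show component vs. total.
df = pd.DataFrame({
    "time1": [dt.time(13, 0, 7, 294234), dt.time(14, 0, 7, 294234)],
    "time2": [dt.time(13, 0, 7, 294234), dt.time(14, 0, 8, 394234)],
})

# str(datetime.time) gives "HH:MM:SS.ffffff", which to_timedelta can parse.
delta = (pd.to_timedelta(df["time2"].astype(str))
         - pd.to_timedelta(df["time1"].astype(str)))

# Component only: the microseconds field of each timedelta (0..999999).
part_us = delta.dt.microseconds

# Whole difference expressed in microseconds, as integers.
total_us = delta // pd.Timedelta(microseconds=1)

print(part_us.tolist())
print(total_us.tolist())
```

For the integer column the question asks for, the floor division is the one to use; the component accessor silently drops whole seconds.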

Answered by PeterPanda

Using dateutil, you could transform your timestamp columns to 'real' timestamps:

import dateutil.parser

df.time1 = df.time1.apply(dateutil.parser.parse)
df.time2 = df.time2.apply(dateutil.parser.parse)

After that, define a new column like this:

df['delta'] = df.time2 - df.time1
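
A runnable sketch of the dateutil approach follows, carried one step further to the integer microseconds the question asks for. It assumes string columns. The default argument pins the missing date to a fixed value instead of today's date, and the helper parse_time, the anchor date, and the column name diff_us are all hypothetical names introduced for this sketch.

```python
from datetime import datetime

import dateutil.parser
import pandas as pd

# Sample data from the question, held as strings (an assumption).
df = pd.DataFrame({
    "time1": ["13:00:07.294234", "14:00:07.294234",
              "15:00:07.294234", "16:00:07.294234"],
    "time2": ["13:00:07.294234", "14:00:07.394234",
              "15:00:07.494234", "16:00:07.694234"],
})

# Arbitrary fixed date used to fill in the fields the strings lack.
anchor = datetime(2014, 3, 19)


def parse_time(text):
    """Parse a time-only string into a full datetime on the anchor date."""
    return dateutil.parser.parse(text, default=anchor)


# pd.to_datetime makes sure the parsed values end up as datetime64 columns.
t1 = pd.to_datetime(df["time1"].apply(parse_time))
t2 = pd.to_datetime(df["time2"].apply(parse_time))

df["delta"] = t2 - t1

# Whole microseconds as integers.
df["diff_us"] = df["delta"] // pd.Timedelta(microseconds=1)

print(df["diff_us"].tolist())
```

Because apply calls the parser once per row, this is likely to be slower on large frames than the vectorised pandas parsers used in the other answers.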