Python: Convert a pandas DatetimeIndex to Unix time?

Question

Asked by Christian Geier

What is the idiomatic way to convert a pandas DatetimeIndex to (an iterable of) Unix time? This is probably not the way to go:

[time.mktime(t.timetuple()) for t in my_data_frame.index.to_pydatetime()]

Answer 1

Accepted answer by root

Since a DatetimeIndex is an ndarray under the hood, you can do the conversion without a list comprehension, which is much faster.

In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: from datetime import datetime

In [4]: dates = [datetime(2012, 5, 1), datetime(2012, 5, 2), datetime(2012, 5, 3)]
   ...: index = pd.DatetimeIndex(dates)
   ...: 
In [5]: index.astype(np.int64)
Out[5]: array([1335830400000000000, 1335916800000000000, 1336003200000000000], 
        dtype=int64)

In [6]: index.astype(np.int64) // 10**9
Out[6]: array([1335830400, 1335916800, 1336003200], dtype=int64)

%timeit [t.value // 10 ** 9 for t in index]
10000 loops, best of 3: 119 us per loop

%timeit index.astype(np.int64) // 10**9
100000 loops, best of 3: 18.4 us per loop
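
For comparison, here is a minimal sketch of a variant that does not rely on the integer representation being in nanoseconds: subtract the epoch and floor-divide by a one-second Timedelta. This is not root's method but the epoch-subtraction pattern from the pandas time series documentation; the names `epoch` and `seconds` are illustrative only.

```python
import pandas as pd

# Same three dates as in the session above.
index = pd.DatetimeIndex(["2012-05-01", "2012-05-02", "2012-05-03"])

# Subtracting the epoch gives a TimedeltaIndex; floor-dividing by one
# second turns it into whole seconds since 1970-01-01.
epoch = pd.Timestamp("1970-01-01")
seconds = (index - epoch) // pd.Timedelta("1s")

print(list(seconds))
```

Replacing `"1s"` with `"1ms"` should give milliseconds in the same way.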

Answer 2

Answer by Andy Hayden

Note: a Timestamp's value is just Unix time in nanoseconds, so floor-divide it by 10**9 to get seconds:

[t.value // 10 ** 9 for t in tsframe.index]

For example:

In [1]: t = pd.Timestamp('2000-02-11 00:00:00')

In [2]: t
Out[2]: <Timestamp: 2000-02-11 00:00:00>

In [3]: t.value
Out[3]: 950227200000000000L

In [4]: time.mktime(t.timetuple())
Out[4]: 950227200.0
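
One caveat about the comparison above: `time.mktime` interprets the time tuple as local time, so it only matches `t.value // 10**9` on a machine whose local timezone is UTC. A small sketch of a timezone-independent check, using the standard library's `calendar.timegm` (a substitution for illustration, not part of the original answer):

```python
import calendar

import pandas as pd

t = pd.Timestamp("2000-02-11 00:00:00")

# Nanoseconds since the epoch, floor-divided down to whole seconds.
from_value = t.value // 10**9

# timegm treats the time tuple as UTC, so the result does not depend
# on the local timezone of the machine running the code.
from_timegm = calendar.timegm(t.timetuple())

print(from_value, from_timegm)
```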

As @root points out it's faster to extract the array of values directly:

tsframe.index.astype(np.int64) // 10 ** 9

Answer 3

Answer by Rani

A summary of other answers:

df['<time_col>'].astype(np.int64) // 10**9

If you want to keep the milliseconds, divide by 10**6 instead.
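
A minimal sketch of both divisors on a DataFrame column. The column name `time_col` and the sample values are made up for illustration, and the explicit cast to `datetime64[ns]` is an added assumption so that the integers are nanoseconds regardless of how the column was created:

```python
import pandas as pd

# Hypothetical frame with sub-second timestamps.
df = pd.DataFrame(
    {"time_col": pd.to_datetime(["2019-08-21 12:00:00.250",
                                 "2018-07-28 00:00:00.999"])}
)

# Force nanosecond resolution, then view the values as integers.
ns = df["time_col"].astype("datetime64[ns]").astype("int64")

seconds = ns // 10**9  # whole seconds, fraction dropped
millis = ns // 10**6   # whole milliseconds

print(seconds.tolist())
print(millis.tolist())
```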

Answer 4

Answer by Elias Hasle

Complementing the other answers: // 10**9 does a floor division, which gives the number of full elapsed seconds rather than the nearest value in seconds. A simple way to get more reasonable rounding, if that is desired, is to add 5*10**8 - 1 before doing the floor division.
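
The difference can be seen on a few hand-picked nanosecond values (the sample numbers are illustrative, not from the answer). Note that with the `- 1` offset a value exactly halfway between two seconds rounds down:

```python
import numpy as np

# Nanosecond offsets: 1.4 s, exactly 1.5 s, just over 1.5 s, almost 2 s.
ns = np.array(
    [1_400_000_000, 1_500_000_000, 1_500_000_001, 1_999_999_999],
    dtype=np.int64,
)

# Plain floor division drops the fractional second.
floored = ns // 10**9

# Adding the offset first rounds to the nearest second instead.
rounded = (ns + 5 * 10**8 - 1) // 10**9

print(floored.tolist())
print(rounded.tolist())
```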

Answer 5

Answer by thomas

To handle NaT, which the solutions above convert to large negative integers, a possible solution in pandas>=0.24 is:

def datetime_to_epoch(ser):
    """Don't convert NaT to large negative values."""
    if ser.hasnans:
        res = ser.dropna().astype('int64').astype('Int64').reindex(index=ser.index)
    else:
        res = ser.astype('int64')

    return res // 10**9

In the case of missing values this will return the nullable int type 'Int64' (ExtensionType pd.Int64Dtype):

In [5]: dt = pd.to_datetime(pd.Series(["2019-08-21", "2018-07-28", np.nan]))
In [6]: datetime_to_epoch(dt)
Out[6]: 
0    1566345600
1    1532736000
2           NaN
dtype: Int64

Otherwise a regular int64:

In [7]: datetime_to_epoch(dt[:2])
Out[7]: 
0    1566345600
1    1532736000
dtype: int64

Answer 6

Answer by Nour

If you have tried this on the datetime column of your dataframe:

dframe['datetime'].astype(np.int64) // 10**9

and you are running into the following error: TypeError: int() argument must be a string, a bytes-like object or a number, not 'Timestamp', you can use these two lines instead:

dframe.index = pd.DatetimeIndex(dframe['datetime'])
dframe['datetime'] = dframe.index.astype(np.int64) // 10**9
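
The two lines above also replace the frame's index as a side effect. As a hedged alternative, assuming the error arises because the column has object dtype (Timestamp objects stored as generic Python objects), converting the column with `pd.to_datetime` first may be enough. The sample frame below is made up for illustration:

```python
import pandas as pd

# Hypothetical frame whose 'datetime' column holds Timestamp objects
# with object dtype rather than a proper datetime64 dtype.
dframe = pd.DataFrame(
    {"datetime": pd.Series(
        [pd.Timestamp("2012-05-01"), pd.Timestamp("2012-05-02")],
        dtype=object,
    )}
)

# Convert to a real datetime column without touching the index.
converted = pd.to_datetime(dframe["datetime"])

# Whole seconds since the epoch via Timedelta floor division.
seconds = (converted - pd.Timestamp("1970-01-01")) // pd.Timedelta("1s")

print(seconds.tolist())
```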
