Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/37354498/

Time: 2020-09-14 01:15:50  Source: igfitidea

How to convert a Pandas data frame column from np.datetime64 to datetime?

python datetime pandas

Asked by helloB

I would like to convert a Pandas DataFrame column from datetime64 to datetime format. This works on an individual basis. In particular, the following works fine:

t = dt['time'].values[0]
datetime.utcfromtimestamp(t.astype(int)/1000000000)

However, when I try to do this to the entire column

dt['datetime'] = dt['time'].apply(lambda x: datetime.utcfromtimestamp(x.astype(int)/1000000000))

I get the following error:

pandas/src/inference.pyx in pandas.lib.map_infer (pandas/lib.c:62578)()

<ipython-input-26-5950d82979b4> in <lambda>(x)
      1 print(type(dt['time'].values[0]))
      2 
----> 3 dt['datetime'] = dt['time'].apply(lambda x: datetime.utcfromtimestamp(x.astype(int)/1000000000))
      4 t = dt['time'].values[0]
      5 print(t)

AttributeError: 'Timestamp' object has no attribute 'astype'

What am I doing wrong? How can I convert my column to datetime and/or make a new column in datetime format?

Here is the info for the dataframe:

[image: info]

Answered by unutbu

You can convert a Series of dtype datetime64[ns] to a NumPy array of datetime.datetime objects by calling the .dt.to_pydatetime() method:

In [75]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 252 entries, 0 to 251
Data columns (total 1 columns):
time    252 non-null datetime64[ns]    <-- the `time` column has dtype `datetime64[ns]`
dtypes: datetime64[ns](1)
memory usage: 2.0 KB

In [77]: df.head()
Out[77]: 
        time
0 2009-01-02
1 2009-01-05
2 2009-01-06
3 2009-01-07
4 2009-01-08


In [76]: df['time'].dt.to_pydatetime()[:5]
Out[76]: 
array([datetime.datetime(2009, 1, 2, 0, 0),
       datetime.datetime(2009, 1, 5, 0, 0),
       datetime.datetime(2009, 1, 6, 0, 0),
       datetime.datetime(2009, 1, 7, 0, 0),
       datetime.datetime(2009, 1, 8, 0, 0)], dtype=object)
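
As for the AttributeError in the question, a likely reading is that .values[0] returns a raw np.datetime64 (which has .astype), while .apply() passes each element as a pandas Timestamp (which does not). The sketch below checks this on a small made-up frame; the sample dates are assumptions, since the asker's data was not posted.

```python
from datetime import datetime

import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's frame; the real data was not posted.
dt = pd.DataFrame({"time": pd.to_datetime(["2009-01-02", "2009-01-05", "2009-01-06"])})

# .values exposes the raw NumPy array, so its elements are np.datetime64 ...
raw_element = dt["time"].values[0]
# ... whereas .apply() hands each element to the function as a pandas Timestamp.
seen_by_apply = dt["time"].apply(lambda x: type(x).__name__).tolist()

# Timestamp has its own converter, so no astype() arithmetic is needed.
as_datetimes = [ts.to_pydatetime() for ts in dt["time"]]
```

Iterating over the column and calling Timestamp.to_pydatetime() on each element gives a plain Python list, which sidesteps any question of what container type the .dt accessor returns in a given pandas version.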


Note that NDFrames (such as Series and DataFrames) can only hold datetime-like objects as objects of dtype datetime64[ns]. The automatic conversion of all datetime-likes to a common dtype simplifies subsequent date computations. But it makes it impossible to store, say, Python datetime.datetime objects in a DataFrame column. Pandas core developer Jeff Reback explains:

"We don't allow direct conversions because its simply too complicated to keep anything other than datetime64[ns] internally (nor necessary at all)."
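
The coercion described above can be observed directly. This is a minimal sketch with made-up sample values (an assumption, not the asker's data): plain datetime.datetime objects assigned to a column come back with a datetime64 dtype, and individual elements are read out as pandas Timestamps.

```python
from datetime import datetime

import pandas as pd

# Hypothetical sample values standing in for the asker's data.
py_datetimes = [datetime(2009, 1, 2), datetime(2009, 1, 5)]

df = pd.DataFrame({"time": pd.to_datetime(["2009-01-02", "2009-01-05"])})
# Assigning plain datetime.datetime objects: pandas infers a datetime64 dtype.
df["datetime"] = py_datetimes

column_is_datetime64 = pd.api.types.is_datetime64_any_dtype(df["datetime"])
# Elements read back out are pandas Timestamps (a datetime.datetime subclass).
first = df["datetime"].iloc[0]
```

Because Timestamp subclasses datetime.datetime, code that only needs datetime behaviour can often use the column's elements as they are.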

Answered by piRSquared

Without your data set, I have to guess at a few things. But you should be able to apply the same approach you showed working on a single value to the whole column.

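# utcfromtimestamp() takes one number at a time, so map it over the int64 nanosecond values
dt['datetime'] = [datetime.utcfromtimestamp(ns / 1e9) for ns in dt['time'].values.astype('int64')]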