Notice: this page is a Chinese/English side-by-side translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/38454403/

Time: 2020-09-14 01:36:54  Source: igfitidea

Convert Excel style date with pandas

Tags: python, excel, pandas, date, datetime

Asked by sascha_luen

I have to parse an XML file which gives me datetimes in Excel style; for example: 42580.3333333333.

Does Pandas provide a way to convert that number into a regular datetime object?

Answered by EdChum

OK, I think the easiest thing is to construct a TimedeltaIndex from the floats and add this to the scalar datetime for 1900,1,1:

In [85]:
import datetime as dt
import pandas as pd
df = pd.DataFrame({'date':[42580.3333333333, 10023]})
df

Out[85]:
           date
0  42580.333333
1  10023.000000

In [86]:
df['real_date'] = pd.TimedeltaIndex(df['date'], unit='d') + dt.datetime(1900,1,1)
df

Out[86]:
           date                  real_date
0  42580.333333 2016-07-31 07:59:59.971200
1  10023.000000 1927-06-12 00:00:00.000000

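OK, it seems that Excel is a bit weird with its dates (thanks @ayhan). Excel numbers 1900-01-01 as day 1 rather than day 0 and also counts a nonexistent 1900-02-29, so the base date that lines up with modern dates is 1899-12-30: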

In [89]:
df['real_date'] = pd.TimedeltaIndex(df['date'], unit='d') + dt.datetime(1899, 12, 30)
df

Out[89]:
           date                  real_date
0  42580.333333 2016-07-29 07:59:59.971200
1  10023.000000 1927-06-10 00:00:00.000000
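
The same conversion can probably be written without the explicit addition, by handing the base date to pd.to_datetime through its origin parameter. This is only a minimal sketch, assuming a pandas version that supports origin (0.20 or later, as far as I know); the column names real_date_alt and real_date_rounded are made up for illustration and are not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({'date': [42580.3333333333, 10023]})

# Interpret the floats as days counted from Excel's effective origin, 1899-12-30.
# 'real_date_alt' is a hypothetical column name used only in this sketch.
df['real_date_alt'] = pd.to_datetime(
    df['date'], unit='D', origin=pd.Timestamp('1899-12-30')
)

# Float serials carry small rounding errors, so round to whole seconds
# when sub-second precision is not needed.
df['real_date_rounded'] = df['real_date_alt'].dt.round('s')

print(df)
```

The rounding step is optional; it is there because a serial such as 42580.3333333333 does not represent 08:00:00 exactly as a float, which is also why the output above ends in 07:59:59.971200.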

See related: How to convert a python datetime.datetime to excel serial date number
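
For the opposite direction mentioned in that related question, a rough sketch using only the standard library might look like the following. It assumes the 1900 date system with the same 1899-12-30 base date used above; to_excel_serial and EXCEL_EPOCH are hypothetical names, not something taken from the linked answer:

```python
import datetime as dt

# Effective origin of Excel's 1900 date system for dates after February 1900.
EXCEL_EPOCH = dt.datetime(1899, 12, 30)

def to_excel_serial(value):
    """Return the Excel-style serial (days, possibly fractional) for a datetime."""
    delta = value - EXCEL_EPOCH
    # Dividing two timedeltas yields a float number of days.
    return delta / dt.timedelta(days=1)

serial_with_time = to_excel_serial(dt.datetime(2016, 7, 29, 8, 0, 0))
serial_date_only = to_excel_serial(dt.datetime(1927, 6, 10))
print(serial_with_time, serial_date_only)
```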


Answered by jpp

You can use the third-party xlrd library before passing to pd.to_datetime:

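import pandas as pd
import xlrd

def read_date(date):
    # datemode 0 selects Excel's 1900-based date system
    return xlrd.xldate.xldate_as_datetime(date, 0)

df = pd.DataFrame({'date':[42580.3333333333, 10023]})

# convert each serial number, then let pandas build a datetime64 column
df['new'] = pd.to_datetime(df['date'].apply(read_date), errors='coerce')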

print(df)

           date                 new
0  42580.333333 2016-07-29 08:00:00
1  10023.000000 1927-06-10 00:00:00
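
The second argument to xldate_as_datetime is the workbook's date mode: 0 for the 1900-based system and 1 for the 1904-based one. If installing xlrd is not an option, something along these lines might serve as a pandas-only stand-in. It is a sketch under assumptions, not xlrd's implementation: excel_serial_to_datetime and its date1904 flag are hypothetical names, the input is assumed to be a numeric Series, and 1900-system serials are assumed to fall after the nonexistent 1900-02-29 (serial 61 or later):

```python
import pandas as pd

def excel_serial_to_datetime(serials, date1904=False):
    """Convert a numeric Series of Excel serials to datetimes (hypothetical helper)."""
    # Serial 0 is 1904-01-01 in the 1904 system; 1899-12-30 is the effective
    # origin of the 1900 system for serials of 61 or more (assumption above).
    origin = pd.Timestamp('1904-01-01') if date1904 else pd.Timestamp('1899-12-30')
    return pd.to_datetime(serials, unit='D', origin=origin)

serials = pd.Series([42580.3333333333, 10023.0])

converted = excel_serial_to_datetime(serials)
# The two systems differ by 1462 days, so shifted serials describe the same dates.
converted_1904 = excel_serial_to_datetime(serials - 1462, date1904=True)

print(converted.dt.round('s'))
print(converted_1904.dt.round('s'))
```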