How to read datetime with timezone in pandas

Note: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, released under the CC BY-SA 4.0 license. You are free to use/share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/18911241/

Time: 2020-09-13 21:10:17  Source: igfitidea

Tags: python, datetime, pandas

Asked by palas

I am trying to create a DataFrame from a CSV file whose first column looks like this:

"2013-08-25T00:00:00-0400";
"2013-08-25T01:00:00-0400";
"2013-08-25T02:00:00-0400";
"2013-08-25T03:00:00-0400";
"2013-08-25T04:00:00-0400";

It's a datetime with a timezone! I already tried something like

df1 = DataFrame(pd.read_csv(PeriodC, sep=';', parse_dates=[0], index_col=0))

but the result was

2013-09-02 04:00:00
2013-09-03 04:00:00
2013-09-04 04:00:00
2013-09-05 04:00:00
2013-09-06 04:00:00
2013-09-07 04:00:00
2013-09-08 04:00:00

Can anyone explain how to separate the datetime from the timezone?

Answered by Viktor Kerkez

The pandas parser takes the timezone information into account if it's available and gives you a naive Timestamp (naive == no timezone info), but with the timezone offset already applied, so the values come out shifted to UTC.

To keep the timezone information in your DataFrame, you should first localize the Timestamps as UTC and then convert them to their timezone (which in this case is Etc/GMT+4):

>>> df = pd.read_csv(PeriodC, sep=';', parse_dates=[0], index_col=0)
>>> df.index[0]
Timestamp('2013-08-25 04:00:00', tz=None)
>>> df.index = df.index.tz_localize('UTC').tz_convert('Etc/GMT+4')
>>> df.index[0]
Timestamp('2013-08-25 00:00:00-0400', tz='Etc/GMT+4')
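
The two steps can be tried without the CSV file. The following is a minimal sketch that assumes a hand-built index of naive timestamps standing in for the parsed column; the names idx, utc_idx and local_idx are illustrative only. Note that the Etc/GMT zones use the POSIX sign convention, so Etc/GMT+4 means four hours behind UTC.

```python
import pandas as pd

# Naive timestamps that are really UTC, standing in for the parsed CSV column.
idx = pd.DatetimeIndex(['2013-08-25 04:00:00', '2013-08-25 05:00:00'])

# Step 1: declare that the naive values are UTC (the clock values do not change).
utc_idx = idx.tz_localize('UTC')

# Step 2: convert to the target zone (same instants, wall clock shifts by -4 hours).
local_idx = utc_idx.tz_convert('Etc/GMT+4')

print(utc_idx[0])
print(local_idx[0])
```

tz_localize only attaches a zone to naive values, while tz_convert moves already-aware values to another zone, which is why the order of the two calls matters.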

If you want to completely discard the timezone information, just specify a date_parser that splits the string and passes only the datetime portion to the parser.

>>> df = pd.read_csv(file, sep=';', parse_dates=[0], index_col=[0],
...                  date_parser=lambda x: pd.to_datetime(x.rpartition('-')[0]))
>>> df.index[0]
Timestamp('2013-08-25 00:00:00', tz=None)
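
As an alternative sketch that does not depend on date_parser, the column can be read as plain strings and the trailing offset removed before parsing. This is not the answer's method, only one possible variation; the sample data, the value column, and the regular expression are assumptions made for illustration.

```python
import io
import pandas as pd

# Hypothetical sample in the question's format; the second column is made up.
csv_text = '"2013-08-25T00:00:00-0400";1\n"2013-08-25T01:00:00-0400";2\n'

# Read the first column as plain strings, with no date parsing yet.
df = pd.read_csv(io.StringIO(csv_text), sep=';', header=None,
                 names=['time', 'value'], dtype={'time': str})

# Strip a trailing +HHMM, -HHMM, +HH:MM or -HH:MM offset, then parse the rest.
wall_clock = df['time'].str.replace(r'[+-]\d{2}:?\d{2}$', '', regex=True)
df.index = pd.to_datetime(wall_clock)
df = df.drop(columns='time')

print(df.index[0])
print(df.index.tz)
```

The resulting index is naive and keeps the wall-clock time written in the file, rather than the UTC-shifted value.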

Answered by liangxinhui

The x.rpartition('-') approach from https://stackoverflow.com/a/18912631/4318671 is not so good, because it relies on the offset starting with '-'.

A datetime string retrieved from InfluxDB with the 'Asia/Shanghai' timezone looks like this:

2019-09-09T12:51:54.46303+08:00
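
A short plain-Python check shows what rpartition('-') does with each kind of string. The variable names are illustrative; the two sample strings are the ones quoted in this thread.

```python
# First string is from the question, second is the InfluxDB example.
negative = "2013-08-25T00:00:00-0400"
positive = "2019-09-09T12:51:54.46303+08:00"

# rpartition splits at the LAST '-' in the string.
head_negative = negative.rpartition('-')[0]
head_positive = positive.rpartition('-')[0]

print(head_negative)  # 2013-08-25T00:00:00
print(head_positive)  # 2019-09
```

With a positive offset the last '-' is the one inside the date, so the split cuts the date itself instead of removing the offset.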

Answered by liangxinhui

If you are using pandas, you can try:

df['time'] = pd.to_datetime(df['time'])
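
A self-contained sketch of that call, assuming a reasonably recent pandas version and a one-row DataFrame built by hand from the InfluxDB string quoted earlier (the DataFrame and the name ts are illustrative):

```python
import pandas as pd

# One-row frame holding the sample string with a +08:00 offset.
df = pd.DataFrame({'time': ['2019-09-09T12:51:54.46303+08:00']})

# to_datetime parses the offset and returns timezone-aware values.
df['time'] = pd.to_datetime(df['time'])

ts = df['time'].iloc[0]
print(ts)
print(ts.utcoffset())
```

Here the parsed value keeps both the wall-clock time and the +08:00 offset, so no string splitting is needed.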