Original source: http://stackoverflow.com/questions/18911241/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How to read datetime with timezone in pandas
Asked by palas
I am trying to create a DataFrame from a CSV file whose first column looks like this:
"2013-08-25T00:00:00-0400";
"2013-08-25T01:00:00-0400";
"2013-08-25T02:00:00-0400";
"2013-08-25T03:00:00-0400";
"2013-08-25T04:00:00-0400";
It's a datetime with a timezone! I have already tried something like
df1 = DataFrame(pd.read_csv(PeriodC, sep=';', parse_dates=[0], index_col=0))
but the result was
2013-09-02 04:00:00
2013-09-03 04:00:00
2013-09-04 04:00:00
2013-09-05 04:00:00
2013-09-06 04:00:00
2013-09-07 04:00:00
2013-09-08 04:00:00
Can anyone explain how to separate the datetime from the timezone?
Answered by Viktor Kerkez
The pandas parser takes the timezone information into account when it is available and gives you a naive Timestamp (naive == no timezone info) with the offset already applied, so the values are effectively in UTC.
To keep the timezone information in your DataFrame, first localize the Timestamps as UTC and then convert them to their timezone (which in this case is Etc/GMT+4):
>>> df = pd.read_csv(PeriodC, sep=';', parse_dates=[0], index_col=0)
>>> df.index[0]
Timestamp('2013-08-25 04:00:00', tz=None)
>>> df.index = df.index.tz_localize('UTC').tz_convert('Etc/GMT+4')
>>> df.index[0]
Timestamp('2013-08-25 00:00:00-0400', tz='Etc/GMT+4')
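The same idea can be tried without a file on disk. The following is only a minimal sketch: the in-memory CSV, the `value` column, and the column names are assumptions made for illustration. It parses with `pd.to_datetime(..., utc=True)` rather than relying on what `read_csv` returns, since newer pandas versions may hand back timezone-aware values directly, in which case `tz_localize('UTC')` would no longer apply.

```python
import io

import pandas as pd

# Hypothetical two-row sample in the same shape as the question's file,
# with an assumed "value" column added so there is something to index.
csv_text = (
    '"2013-08-25T00:00:00-0400";1\n'
    '"2013-08-25T01:00:00-0400";2\n'
)

# Read the first column as plain strings; the column names are assumptions.
df = pd.read_csv(io.StringIO(csv_text), sep=';', header=None,
                 names=['time', 'value'])

# utc=True applies the offset and yields timezone-aware UTC timestamps.
utc_times = pd.to_datetime(df['time'], utc=True)

# Convert from UTC to the zone used in the answer (Etc/GMT+4 means UTC-4).
df.index = pd.DatetimeIndex(utc_times).tz_convert('Etc/GMT+4')
df = df.drop(columns='time')

first = df.index[0]
print(first)
```

Note that the sign in `Etc/GMT+4` follows the POSIX convention, so it denotes four hours behind UTC.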
If you want to discard the timezone information completely, specify a date_parser that splits the string and passes only the datetime portion to the parser.
>>> df = pd.read_csv(file, sep=';', parse_dates=[0], index_col=[0],
...                  date_parser=lambda x: pd.to_datetime(x.rpartition('-')[0]))
>>> df.index[0]
Timestamp('2013-08-25 00:00:00', tz=None)
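An alternative that avoids string splitting is to parse the offset first and drop the timezone afterwards. This is a sketch under assumptions, not the answer's method: the in-memory CSV, the `value` column, and the column names are made up for illustration, and it assumes every row belongs to the same zone (Etc/GMT+4 here).

```python
import io

import pandas as pd

# Hypothetical sample data in the question's format (assumed "value" column).
csv_text = (
    '"2013-08-25T00:00:00-0400";1\n'
    '"2013-08-25T01:00:00-0400";2\n'
)
df = pd.read_csv(io.StringIO(csv_text), sep=';', header=None,
                 names=['time', 'value'])

# Parse to UTC, move to the local zone, then strip the timezone so that
# only the local wall-clock time remains.
aware = pd.to_datetime(df['time'], utc=True).dt.tz_convert('Etc/GMT+4')
df['time'] = aware.dt.tz_localize(None)
df = df.set_index('time')

first_naive = df.index[0]
print(first_naive)
```

Calling `tz_localize(None)` on timezone-aware values removes the zone while keeping the local time, which matches the output shown above.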
Answered by liangxinhui
The x.rpartition('-') approach from https://stackoverflow.com/a/18912631/4318671 is not so good: it relies on the offset starting with '-', so it breaks for positive offsets such as +08:00.
The string format of a datetime fetched from InfluxDB with the 'Asia/Shanghai' timezone is:
2019-09-09T12:51:54.46303+08:00
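To see the problem concretely, the short sketch below compares `rpartition('-')` on a negative and a positive offset, and shows one possible replacement. The helper `strip_utc_offset` and its regular expression are hypothetical names introduced here, not part of pandas or of either answer.

```python
import re

# Matches a trailing "Z" or a numeric offset such as -0400 or +08:00.
OFFSET_SUFFIX = re.compile(r'(?:Z|[+-]\d{2}:?\d{2})$')


def strip_utc_offset(text):
    """Return the timestamp string without its trailing UTC offset."""
    return OFFSET_SUFFIX.sub('', text)


negative = '2013-08-25T00:00:00-0400'
positive = '2019-09-09T12:51:54.46303+08:00'

# rpartition('-') splits at the LAST hyphen in the string.
split_negative = negative.rpartition('-')[0]  # hyphen belongs to the offset
split_positive = positive.rpartition('-')[0]  # hyphen belongs to the date

print(split_negative)              # 2013-08-25T00:00:00
print(split_positive)              # 2019-09
print(strip_utc_offset(negative))  # 2013-08-25T00:00:00
print(strip_utc_offset(positive))  # 2019-09-09T12:51:54.46303
```

With a positive offset the last hyphen sits inside the date, so the day and time are silently lost, whereas anchoring the pattern to the end of the string handles both signs.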
Answered by liangxinhui
If you are using pandas, you can try
df['time'] = pd.to_datetime(df['time'])
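As a rough illustration of that call on the InfluxDB-style string, the sketch below builds a one-row frame (the `value` column is an assumption) and additionally passes `utc=True` before converting to 'Asia/Shanghai'. Those two extra steps are not in the answer; they are added here so the result carries a named zone regardless of how a given pandas version represents fixed offsets.

```python
import pandas as pd

# Hypothetical one-row frame using the timestamp format quoted above.
df = pd.DataFrame({
    'time': ['2019-09-09T12:51:54.46303+08:00'],
    'value': [1],
})

# Parse the offset-aware strings to UTC, then express them in Asia/Shanghai.
df['time'] = pd.to_datetime(df['time'], utc=True).dt.tz_convert('Asia/Shanghai')

first = df['time'].iloc[0]
print(first)
```

The local wall-clock time and the +08:00 offset are both preserved, so no string manipulation is needed.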

