Create DateTimeIndex in Pandas
Original question: http://stackoverflow.com/questions/36506149/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Asked by guilhermecgs
I am using pandas for the first time and having a hard time.
I have a dataframe containing year, month, day and hour in separate columns.
As far as I know, this dataframe is not indexed.
I am trying to create a datetime index for this dataframe:
def createTimeStamp(year, month, day, hour):
    return DatetimeIndex(datetime(.........))

df['TimeStamp'] = df.apply(createTimeStamp(df['year'], df['month'], df['day'], df['hour']))
df.set_index('TimeStamp')
What am I doing wrong?
Answered by Alexander
import datetime as dt
import pandas as pd
df = pd.DataFrame({'year': [2015, 2016],
                   'month': [12, 1],
                   'day': [31, 1],
                   'hour': [23, 1]})
# returns datetime objects
df['Timestamp'] = df.apply(lambda row: dt.datetime(row.year, row.month, row.day, row.hour),
                           axis=1)
# converts to pandas timestamps if desired
df['Timestamp'] = pd.to_datetime(df.Timestamp)
>>> df
   day  hour  month  year           Timestamp
0   31    23     12  2015 2015-12-31 23:00:00
1    1     1      1  2016 2016-01-01 01:00:00
# Create a DatetimeIndex and assign it to the dataframe.
df.index = pd.DatetimeIndex(df.Timestamp)
>>> df
                     day  hour  month  year           Timestamp
2015-12-31 23:00:00   31    23     12  2015 2015-12-31 23:00:00
2016-01-01 01:00:00    1     1      1  2016 2016-01-01 01:00:00
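As a possible alternative to the row-wise apply above, recent pandas versions let pd.to_datetime assemble timestamps directly from columns named year, month, day and (optionally) hour. The sketch below is not part of the original answer; the names timestamps and indexed are only illustrative, and it assumes a pandas version that supports this column-assembly form.

```python
import pandas as pd

df = pd.DataFrame({'year': [2015, 2016],
                   'month': [12, 1],
                   'day': [31, 1],
                   'hour': [23, 1]})

# pd.to_datetime can build timestamps from a DataFrame whose columns
# are named year, month, day (required) and hour (optional).
timestamps = pd.to_datetime(df[['year', 'month', 'day', 'hour']])

# Wrap the result in a DatetimeIndex and use it as the index.
# "indexed" is a hypothetical name chosen for this example.
indexed = df.set_index(pd.DatetimeIndex(timestamps, name='Timestamp'))

print(indexed.index)
```

This avoids calling a Python function once per row, which tends to matter on larger frames.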
Answered by Colby Gerik
The issue is that set_index returns a modified copy of the DataFrame and leaves the original unchanged, so the result of the call in the question is discarded. If you pass inplace=True to set_index, the original DataFrame is updated instead. Alternatively, the DataFrame can be reassigned if more operations are needed:
df.set_index('TimeStamp', inplace=True)
or
df = df.set_index('TimeStamp')
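To make the difference concrete, here is a small sketch of what happens when the return value of set_index is ignored versus reassigned. The sample data, the value column, and the names unchanged and reassigned are made up for illustration and do not come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the 'TimeStamp' column name comes from the question.
df = pd.DataFrame({'TimeStamp': pd.to_datetime(['2015-12-31 23:00', '2016-01-01 01:00']),
                   'value': [10, 20]})

# Calling set_index without using its return value leaves df as it was.
df.set_index('TimeStamp')
unchanged = 'TimeStamp' in df.columns  # 'TimeStamp' is still a regular column

# Reassigning the result is what actually replaces the index.
df = df.set_index('TimeStamp')
reassigned = isinstance(df.index, pd.DatetimeIndex)

print(unchanged, reassigned)  # True True
```

Reassignment also composes well with method chaining, which may be a reason to prefer it over inplace=True.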