pandas: change dataframe column names from string format to datetime

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/41677850/

Date: 2020-09-14 02:47:29  Source: igfitidea

Change dataframe column names from string format to datetime

Tags: python, pandas, dataframe, string-to-datetime

Asked by gtroupis

I have a dataframe whose column names are dates (year-month) stored as strings. How can I convert these names to datetime format? I tried doing this:

new_cols = pd.to_datetime(df.columns)
df = df[new_cols]

but I get the error:

KeyError: "DatetimeIndex(['2000-01-01', '2000-02-01', '2000-03-01', '2000-04-01',
               '2000-05-01', '2000-06-01', '2000-07-01', '2000-08-01',
               '2000-09-01', '2000-10-01',
               ...
               '2015-11-01', '2015-12-01', '2016-01-01', '2016-02-01',
               '2016-03-01', '2016-04-01', '2016-05-01', '2016-06-01',
               '2016-07-01', '2016-08-01'],
              dtype='datetime64[ns]', length=200, freq=None) not in index"

Thanks!

Answered by jezrael

Selecting with df[new_cols] (or with loc) does not rename anything: it looks up the existing column labels, which are still strings, so you get a KeyError.

So you need to assign the output to columns:

df.columns = pd.to_datetime(df.columns)

Sample:

import numpy as np
import pandas as pd

cols = ['2000-01-01', '2000-02-01', '2000-03-01', '2000-04-01', '2000-05-01']
vals = np.arange(5)
df = pd.DataFrame(columns = cols, data=[vals])
print (df)
   2000-01-01  2000-02-01  2000-03-01  2000-04-01  2000-05-01
0           0           1           2           3           4

print (df.columns)
Index(['2000-01-01', '2000-02-01', '2000-03-01', '2000-04-01', '2000-05-01'], dtype='object')

df.columns = pd.to_datetime(df.columns)

print (df.columns)
DatetimeIndex(['2000-01-01', '2000-02-01', '2000-03-01', '2000-04-01',
               '2000-05-01'],
              dtype='datetime64[ns]', freq=None)
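
The question mentions year-month strings, while the sample above uses full dates. As a minimal sketch, assuming the labels look like '2000-01' (a hypothetical layout, not shown in the original), passing an explicit format to pd.to_datetime makes the parsing unambiguous:

```python
import pandas as pd

# Hypothetical year-month labels; the real column names are not shown in the question.
cols = ['2000-01', '2000-02', '2000-03']
df = pd.DataFrame([[0, 1, 2]], columns=cols)

# An explicit format avoids relying on automatic date inference.
df.columns = pd.to_datetime(df.columns, format='%Y-%m')

print(df.columns)
```

Each label is parsed to the first day of its month, which matches the dates seen in the error message of the question.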

It is also possible to convert to periods:

print (df.columns)
Index(['2000-01-01', '2000-02-01', '2000-03-01', '2000-04-01', '2000-05-01'], dtype='object')

df.columns = pd.to_datetime(df.columns).to_period('M')

print (df.columns)
PeriodIndex(['2000-01', '2000-02', '2000-03', '2000-04', '2000-05'],
             dtype='period[M]', freq='M')
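
Once the columns are periods, a column can be looked up with a matching pd.Period, and the labels can be turned back into timestamps if needed. The following is only an illustrative sketch on a small hypothetical frame, not part of the original answer:

```python
import pandas as pd

cols = ['2000-01-01', '2000-02-01', '2000-03-01']
df = pd.DataFrame([[0, 1, 2]], columns=cols)
df.columns = pd.to_datetime(df.columns).to_period('M')

# A monthly Period now serves as the column label.
feb = df[pd.Period('2000-02', freq='M')]
print(feb)

# to_timestamp() maps each period back to a timestamp (the period start by default).
as_timestamps = df.columns.to_timestamp()
print(as_timestamps)
```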

Answered by Fred Cascarini

To expand on jezrael's answer: the original code tries to select columns of df using the labels stored in new_cols and then store the result as df. Since those datetime labels do not exist in df yet, pandas raises an error saying it cannot find them in the index.

As such, you need to assign the new names to the columns, as in jezrael's answer.
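
The difference between selecting and assigning can be sketched as follows. This is a small reproduction on a hypothetical three-column frame, not the asker's data, and the exact wording of the KeyError may vary between pandas versions:

```python
import numpy as np
import pandas as pd

cols = ['2000-01-01', '2000-02-01', '2000-03-01']
df = pd.DataFrame([np.arange(3)], columns=cols)

new_cols = pd.to_datetime(df.columns)

# Selecting looks up labels; the labels are still strings, so the lookup fails.
try:
    df[new_cols]
    raised = False
except KeyError:
    raised = True
print(raised)

# Assigning replaces the labels themselves; the data is untouched.
df.columns = new_cols
print(df[new_cols].shape)
```

After the assignment the same selection succeeds, because the datetime labels now exist in the frame.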