pandas: rename the index of a pandas DataFrame

Note: this page is adapted from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/16591923/

Date: 2020-09-13 20:49:47  Source: igfitidea

rename index of a pandas dataframe

Tags: python, pandas

Asked by user1486457

I have a pandas dataframe whose indices look like:

df.index
['a_1', 'b_2', 'c_3', ... ]

I want to rename these indices to:

['a', 'b', 'c', ... ]

How do I do this without specifying a dictionary with explicit keys for each index value?
I tried:

df.rename( index = lambda x: x.split( '_' )[0] )

but this raises an error:

AssertionError: New axis must be unique to rename

Answer by unutbu

Perhaps you could get the best of both worlds by using a MultiIndex:

import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(8).reshape(4,2), index=['a_1', 'b_2', 'c_3', 'c_4'])
print(df)
#      0  1
# a_1  0  1
# b_2  2  3
# c_3  4  5
# c_4  6  7

index = pd.MultiIndex.from_tuples([item.split('_') for item in df.index])
df.index = index
print(df)
#      0  1
# a 1  0  1
# b 2  2  3
# c 3  4  5
#   4  6  7
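
As a possible shortcut, the same split can be sketched with pandas' vectorized string methods instead of a list of tuples. The level names `letter` and `number` are illustrative assumptions, not something the original answer uses.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(8).reshape(4, 2), index=['a_1', 'b_2', 'c_3', 'c_4'])

# Index.str.split with expand=True returns a MultiIndex, one level per piece.
multi = df.index.str.split('_', expand=True)

# Naming the levels is optional; 'letter' and 'number' are made-up names.
df.index = multi.set_names(['letter', 'number'])
print(df)
```

Named levels can make later selections easier to read, but the unnamed version from the answer works just as well.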

This way, you can access things according to first level of the index:

In [30]: df.loc['c']
Out[30]: 
   0  1
3  4  5
4  6  7

or according to both levels of the index:

In [31]: df.loc[('c','3')]
Out[31]: 
0    4
1    5
Name: (c, 3)
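
The lookups above can be collected into one runnable sketch. The `xs` call for selecting on the second level alone is an addition for illustration and is not part of the original answer; the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(8).reshape(4, 2), index=['a_1', 'b_2', 'c_3', 'c_4'])
df.index = pd.MultiIndex.from_tuples([tuple(item.split('_')) for item in df.index])

# First level only: every row under 'c', indexed by the second level.
rows_c = df.loc['c']

# Both levels: a single row, returned as a Series.
row_c3 = df.loc[('c', '3')]

# Second level only: xs takes a cross-section at the given level.
rows_3 = df.xs('3', level=1)

print(rows_c, row_c3, rows_3, sep='\n\n')
```

Note that the second-level labels are strings here (`'3'`, not `3`), because they come from `str.split`.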

Moreover, all the DataFrame methods are built to work with DataFrames with MultiIndices, so you lose nothing.

However, if you really want to drop the second level of the index, you could do this:

df.reset_index(level=1, drop=True, inplace=True)
print(df)
#    0  1
# a  0  1
# b  2  3
# c  4  5
# c  6  7
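
If modifying `df` in place is not wanted, one alternative sketch is `DataFrame.droplevel`, which returns a new frame. This assumes a pandas version recent enough to provide that method; the name `flat` is hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(8).reshape(4, 2), index=['a_1', 'b_2', 'c_3', 'c_4'])
df.index = pd.MultiIndex.from_tuples([tuple(item.split('_')) for item in df.index])

# droplevel returns a new DataFrame and leaves df untouched.
flat = df.droplevel(1)
print(flat)
```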

Answer by DSM

That's the error you'd get if your function produced duplicate index values:

>>> df = pd.DataFrame(np.random.random((4,3)),index="a_1 b_2 c_3 c_4".split())
>>> df
            0         1         2
a_1  0.854839  0.830317  0.046283
b_2  0.433805  0.629118  0.702179
c_3  0.390390  0.374232  0.040998
c_4  0.667013  0.368870  0.637276
>>> df.rename(index=lambda x: x.split("_")[0])
[...]
AssertionError: New axis must be unique to rename
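
Whether `rename` raises here may depend on the pandas version, so a version-independent way to spot the problem is to apply the function to the index first and check the result. This is a minimal sketch; the name `candidate` is hypothetical.

```python
import pandas as pd

index = pd.Index(['a_1', 'b_2', 'c_3', 'c_4'])

# Apply the same function the rename call would use.
candidate = index.map(lambda x: x.split('_')[0])

# is_unique is False when the function maps two labels to the same value.
print(candidate.is_unique)

# duplicated() marks the repeats, which shows which labels collide.
print(candidate[candidate.duplicated()].tolist())
```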

If you really want that, I'd use a list comprehension:

>>> df.index = [x.split("_")[0] for x in df.index]
>>> df
          0         1         2
a  0.854839  0.830317  0.046283
b  0.433805  0.629118  0.702179
c  0.390390  0.374232  0.040998
c  0.667013  0.368870  0.637276
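
The same assignment can also be sketched with the index's string accessor rather than a Python-level loop. Deterministic data stands in for the random values above so that the result is reproducible.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(4, 3), index='a_1 b_2 c_3 c_4'.split())

# .str.split gives a list per label; .str[0] keeps the first piece of each.
df.index = df.index.str.split('_').str[0]
print(df)
```

Both forms assign a new index directly, so duplicate labels are accepted without complaint.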

but I'd think about whether that's really the right direction.
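
One reason for caution, sketched below under the assumption that later code looks rows up by label: with duplicate labels, the same kind of lookup can return different types depending on the label.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(8).reshape(4, 2), index=['a', 'b', 'c', 'c'])

# A unique label returns one row as a Series...
one = df.loc['a']

# ...while a duplicated label returns every matching row as a DataFrame.
many = df.loc['c']

print(type(one).__name__, type(many).__name__)
```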