pandas: renaming the index of a pandas DataFrame

Question

Asked by user1486457

I have a pandas dataframe whose indices look like:

df.index
['a_1', 'b_2', 'c_3', ... ]

I want to rename these indices to:

['a', 'b', 'c', ... ]

How do I do this without specifying a dictionary with explicit keys for each index value?
I tried:

df.rename( index = lambda x: x.split( '_' )[0] )

but this throws up an error:

AssertionError: New axis must be unique to rename

Answer 1

Answered by unutbu

Perhaps you could get the best of both worlds by using a MultiIndex:

import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(8).reshape(4,2), index=['a_1', 'b_2', 'c_3', 'c_4'])
print(df)
#      0  1
# a_1  0  1
# b_2  2  3
# c_3  4  5
# c_4  6  7

index = pd.MultiIndex.from_tuples([item.split('_') for item in df.index])
df.index = index
print(df)
#      0  1
# a 1  0  1
# b 2  2  3
# c 3  4  5
#   4  6  7

This way, you can access things according to first level of the index:

In [30]: df.loc['c']
Out[30]: 
   0  1
3  4  5
4  6  7

or according to both levels of the index:

In [31]: df.loc[('c','3')]
Out[31]: 
0    4
1    5
Name: (c, 3)
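
As a minimal sketch of the lookups above, written with `.loc` (the `.ix` indexer has been removed from current pandas releases). The level names `letter` and `number` are hypothetical labels added for illustration, not part of the original answer:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(8).reshape(4, 2), index=['a_1', 'b_2', 'c_3', 'c_4'])

# Split each label into a (prefix, suffix) tuple and build a two-level index.
# The level names are hypothetical; they only make the lookups easier to read.
df.index = pd.MultiIndex.from_tuples(
    [tuple(item.split('_')) for item in df.index],
    names=['letter', 'number'],
)

rows_c = df.loc['c']                  # every row whose first level is 'c'
row_c3 = df.loc[('c', '3')]           # one row; level values are strings, not ints
rows_4 = df.xs('4', level='number')   # select on the second level only
```

Because `str.split` returns strings, the second level has to be looked up as `'3'` rather than `3` unless it is converted first.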

Moreover, all the DataFrame methods are built to work with DataFrames with MultiIndices, so you lose nothing.

However, if you really want to drop the second level of the index, you could do this:

df.reset_index(level=1, drop=True, inplace=True)
print(df)
#    0  1
# a  0  1
# b  2  3
# c  4  5
# c  6  7
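
If modifying the frame in place is not wanted, one possible alternative (an assumption on my part, not something the original answer used) is `DataFrame.droplevel`, which returns a new frame and leaves the MultiIndex version untouched:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(8).reshape(4, 2), index=['a_1', 'b_2', 'c_3', 'c_4'])
df.index = pd.MultiIndex.from_tuples([tuple(item.split('_')) for item in df.index])

# droplevel returns a copy with the second level removed; df keeps both levels.
flat = df.droplevel(1)
print(flat)
```

This keeps the two-level frame around in case the suffix turns out to be needed later.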

Answer 2

Answered by DSM

That's the error you'd get if your function produced duplicate index values:

>>> df = pd.DataFrame(np.random.random((4,3)),index="a_1 b_2 c_3 c_4".split())
>>> df
            0         1         2
a_1  0.854839  0.830317  0.046283
b_2  0.433805  0.629118  0.702179
c_3  0.390390  0.374232  0.040998
c_4  0.667013  0.368870  0.637276
>>> df.rename(index=lambda x: x.split("_")[0])
[...]
AssertionError: New axis must be unique to rename
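
A small sketch of how one might check for this before renaming. The variable names are illustrative, and the labels are the ones from the example above:

```python
import pandas as pd

old_labels = pd.Index(['a_1', 'b_2', 'c_3', 'c_4'])

# Apply the same transformation the rename would apply.
new_labels = pd.Index([x.split('_')[0] for x in old_labels])

# is_unique is False as soon as two old labels collapse to the same new one.
print(new_labels.is_unique)

# List the new labels that occur more than once.
dupes = new_labels[new_labels.duplicated(keep=False)].unique()
print(list(dupes))
```

Here `c_3` and `c_4` both map to `c`, which is the collision the error message is complaining about.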

If you really want that, I'd use a list comprehension:

>>> df.index = [x.split("_")[0] for x in df.index]
>>> df
          0         1         2
a  0.854839  0.830317  0.046283
b  0.433805  0.629118  0.702179
c  0.390390  0.374232  0.040998
c  0.667013  0.368870  0.637276

but I'd think about whether that's really the right direction.
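
For completeness, the same relabeling can probably also be written with the vectorized string accessor on the index instead of a list comprehension. This is a sketch of an alternative, not part of the original answer:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(8).reshape(4, 2), index=['a_1', 'b_2', 'c_3', 'c_4'])

# .str.split gives a list per label; .str[0] takes the part before the underscore.
df.index = df.index.str.split('_').str[0]
print(df)
```

Like the list comprehension, this assigns the new labels directly and so ends up with a non-unique index (`c` appears twice).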
