Python Pandas: Modifying a specific level of a MultiIndex

Question

I have a dataframe with a MultiIndex and would like to modify one particular level of it. For instance, one level might contain strings and I may want to remove the whitespace from that index level:

df.index.levels[1] = [x.replace(' ', '') for x in df.index.levels[1]]

However, the code above results in an error:

TypeError: 'FrozenList' does not support mutable operations.

I know I can call reset_index, modify the column, and then re-create the MultiIndex, but I wonder whether there is a more elegant way to modify one particular level of the MultiIndex directly.
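
For reference, the reset_index round trip mentioned above could look roughly like the sketch below. The sample frame and the level names `first` and `second` are made up for illustration and are not part of the original question.

```python
import pandas as pd

# Hypothetical sample data: a two-level index whose second level contains spaces.
df = pd.DataFrame(
    {"value": [1, 2]},
    index=pd.MultiIndex.from_tuples(
        [("a", "x 1"), ("b", "y 2")], names=["first", "second"]
    ),
)

# Move the index levels into ordinary columns, edit one, then rebuild the index.
flat = df.reset_index()
flat["second"] = flat["second"].str.replace(" ", "")
df = flat.set_index(["first", "second"])
```

This works, but it rebuilds the whole index and relies on the levels being named.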

Answer 1

Answered by Shovalt

As mentioned in the comments, indexes are immutable and must be rebuilt when modified. You do not have to use reset_index for that, though; you can create a new MultiIndex directly:

df.index = pd.MultiIndex.from_tuples([(x[0], x[1].replace(' ', ''), x[2]) for x in df.index])

This example is for a 3-level index where you want to modify the middle level. For an index with a different number of levels, change the size of the tuple accordingly. Note that from_tuples does not carry over the level names unless you pass names=df.index.names.
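
To avoid rewriting the tuple for every index depth, the same idea can be wrapped in a small helper. This is only a sketch: `replace_in_level` is a hypothetical name (not a pandas function), and the sample index is made up for illustration.

```python
import pandas as pd

def replace_in_level(index, level, func):
    """Return a new MultiIndex with func applied to the label at position `level`."""
    new_tuples = [
        tuple(func(v) if i == level else v for i, v in enumerate(tup))
        for tup in index  # iterating a MultiIndex yields one tuple per row
    ]
    # Pass the names explicitly so they survive the rebuild.
    return pd.MultiIndex.from_tuples(new_tuples, names=index.names)

# Hypothetical 3-level index with spaces in the middle level.
idx = pd.MultiIndex.from_tuples(
    [("a", "x 1", 10), ("a", "x 2", 20), ("b", "y 1", 30)],
    names=["outer", "middle", "inner"],
)
new_idx = replace_in_level(idx, 1, lambda s: s.replace(" ", ""))
```

The helper still touches every row, so it shares the performance characteristics of the list comprehension above.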

Answer 2

Answered by John

Thanks to @cxrodgers's comment, I think the fastest way to do this is:

df.index = df.index.set_levels(df.index.levels[0].str.replace(' ', ''), level=0)
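
A self-contained sketch of that one-liner on a small frame follows. The city names and the level names `city` and `n` are assumptions made up for this example.

```python
import pandas as pd

# Hypothetical sample data: level 0 holds strings that contain spaces.
df = pd.DataFrame(
    {"value": [1, 2, 3]},
    index=pd.MultiIndex.from_tuples(
        [("New York", 1), ("New York", 2), ("Los Angeles", 1)],
        names=["city", "n"],
    ),
)

# df.index.levels[0] holds only the unique labels of level 0;
# set_levels swaps them in while the row-to-label codes stay unchanged.
df.index = df.index.set_levels(
    df.index.levels[0].str.replace(" ", ""), level=0
)
```

Note that set_levels expects the level's unique values (df.index.levels[0]), not the per-row values returned by get_level_values(0). As far as I can tell, recent pandas versions reject duplicated level values, so the transformation should also keep the labels distinct.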

Old, longer answer:

I found that the list comprehension suggested by @Shovalt works but felt slow on my machine (using a dataframe with >10,000 rows).

Instead, I was able to use the .set_levels method, which was quite a bit faster for me.

%timeit pd.MultiIndex.from_tuples([(x[0].replace(' ',''), x[1]) for x in df.index])
1 loop, best of 3: 394 ms per loop

%timeit df.index.set_levels(df.index.get_level_values(0).str.replace(' ',''), level=0)
10 loops, best of 3: 134 ms per loop

In actuality, I just needed to prepend some text. This was even faster with .set_levels:

%timeit pd.MultiIndex.from_tuples([('00'+x[0], x[1]) for x in df.index])
100 loops, best of 3: 5.18 ms per loop

%timeit df.index.set_levels('00'+df.index.get_level_values(0), level=0)
1000 loops, best of 3: 1.38 ms per loop

%timeit df.index.set_levels('00'+df.index.levels[0], level=0)
1000 loops, best of 3: 331 μs per loop

This solution is based on the answer in the link from the comment by @denfromufa ...

python - Multiindex and timezone - Frozen list error - Stack Overflow

Answer 3

Answered by normanius

The answers provided are correct. Depending on the structure of the MultiIndex, it can be considerably faster to apply a map directly on the levels instead of constructing a new MultiIndex, because each level stores only its unique values rather than one entry per row.
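
The difference in size can be seen in a small sketch like the following; the index contents here are invented purely for illustration.

```python
import pandas as pd

# Hypothetical index: 2 distinct labels in level 0, repeated over 1000 rows each.
idx = pd.MultiIndex.from_product([["a 1", "b 2"], range(1000)])

n_rows = len(idx)              # one entry per row
n_labels = len(idx.levels[0])  # only the unique labels of level 0

# Mapping the level touches n_labels values instead of n_rows tuples.
mapped = idx.set_levels(
    idx.levels[0].map(lambda s: s.replace(" ", "")), level=0
)
```

When a level has few distinct labels relative to the number of rows, the mapper runs far fewer times than a per-row list comprehension would.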

I use the following function to modify a particular index level. It also works on single-level indices.

import pandas as pd

def map_index_level(index, mapper, level=0):
    """
    Return a new Index or MultiIndex with the values of one level mapped.
    """
    assert isinstance(index, pd.Index)
    if isinstance(index, pd.MultiIndex):
        # Map only the unique values stored in the level, then swap them in.
        new_level = index.levels[level].map(mapper)
        new_index = index.set_levels(new_level, level=level)
    else:
        # Single-level index: only level 0 exists.
        assert level == 0
        new_index = index.map(mapper)
    return new_index

Usage:

df = pd.DataFrame([[1,2],[3,4]])
df.index = pd.MultiIndex.from_product([["a"],["i","ii"]])
df.columns = ["x","y"]

df.index = map_index_level(index=df.index, mapper=str.upper, level=1)
df.columns = map_index_level(index=df.columns, mapper={"x":"foo", "y":"bar"})

# Result:
#       foo  bar
# a I     1    2
#   II    3    4
