pandas: a concise way to flatten multi-index columns

Question

Asked by Haleemur Ali

Using more than one function in a groupby-aggregate produces multi-index columns, which I then want to flatten.

Example:

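import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'A': [1,1,1,2,2,2,3,3,3],
     'B': np.random.random(9),
     'C': np.random.random(9)}
)
# two aggregations on B and one on C, which yields two-level column labels
out = df.groupby('A').agg({'B': [np.mean, np.std], 'C': np.median})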

# example output

          B                   C
       mean       std    median
A
1  0.791846  0.091657  0.394167
2  0.156290  0.202142  0.453871
3  0.482282  0.382391  0.892514

Currently, I do it manually like this:

out.columns = ['B_mean', 'B_std', 'C_median']

which gives me the result I want:

     B_mean     B_std  C_median
A
1  0.791846  0.091657  0.394167
2  0.156290  0.202142  0.453871
3  0.482282  0.382391  0.892514

However, I'm looking for a way to automate this, since renaming the columns by hand is monotonous, time-consuming, and prone to typos.

Is there a way to return a flattened index instead of a multi-index when doing a groupby-aggregate?

I need to flatten the columns to save to a text file, which will then be read by a different program that doesn't handle multi-indexed columns.

Answer 1

Answered by YOBEN_S

You can map '_'.join over the columns:

out.columns = out.columns.map('_'.join)
out
Out[23]: 
     B_mean     B_std  C_median
A                              
1  0.204825  0.169408  0.926347
2  0.362184  0.404272  0.224119
3  0.533502  0.380614  0.218105

When the column labels contain ints, I prefer this approach, since str.join only accepts strings:

out.columns.map('{0[0]}_{0[1]}'.format) 
Out[27]: Index(['B_mean', 'B_std', 'C_median'], dtype='object')
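To make the int-label case concrete, here is a minimal sketch. The frame wide and its labels are invented for illustration and do not come from the question. It suggests that the join-based mapping raises a TypeError as soon as one label is not a string, while the format-based mapping still works.

```python
import pandas as pd

# Hypothetical frame whose second column level holds ints (not from the question).
wide = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    columns=pd.MultiIndex.from_tuples([("B", 1), ("B", 2), ("C", 1)]),
)

# str.join accepts only strings, so a tuple such as ("B", 1) raises TypeError.
try:
    wide.columns.map("_".join)
    join_failed = False
except TypeError:
    join_failed = True

# str.format converts each label to text, so mixed label types are fine.
flat = wide.columns.map("{0[0]}_{0[1]}".format)
print(join_failed, list(flat))
```

Note that the format string is written for exactly two column levels; a deeper MultiIndex would need a longer template.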

Answer 2

Answered by llllllllll

You can use:

out.columns = list(map('_'.join, out.columns.values))

Answer 3

Answered by Julio

Since pandas 0.24.0, you can use to_flat_index:

out.columns = [f"{x}_{y}" for x, y in out.columns.to_flat_index()]

     B_mean     B_std  C_median
A
1  0.779592  0.137168  0.583211
2  0.158010  0.229234  0.550383
3  0.186771  0.150575  0.313409
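If the flattening is needed in several places, the to_flat_index approach could be wrapped in a small helper. The sketch below is one possible shape, not a pandas API: flatten_columns and its sep parameter are hypothetical names, and the sample data is made up. It converts every label with str() so non-string labels are tolerated, and it skips empty level values such as the ones reset_index leaves on the group key.

```python
import pandas as pd


def flatten_columns(frame, sep="_"):
    """Return a copy of frame with MultiIndex columns joined into strings.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    flat = frame.copy()
    if isinstance(flat.columns, pd.MultiIndex):
        flat.columns = [
            # str() tolerates non-string labels; empty level values are dropped
            sep.join(str(level) for level in col if str(level) != "")
            for col in flat.columns.to_flat_index()
        ]
    return flat


# Small deterministic sample (assumed data, not the question's random values).
df = pd.DataFrame(
    {"A": [1, 1, 2, 2], "B": [1.0, 3.0, 5.0, 7.0], "C": [2.0, 4.0, 6.0, 8.0]}
)
out = df.groupby("A").agg({"B": ["mean", "std"], "C": "median"})

# reset_index() turns the group key into a column labelled ("A", "").
flat = flatten_columns(out.reset_index())
print(list(flat.columns))
```

Returning a copy rather than assigning to out.columns in place keeps the original aggregate available, at the cost of an extra copy of the data.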
