pandas: a concise way of flattening multi-index columns
Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/50571793/
concise way of flattening multiindex columns
Asked by Haleemur Ali
Using more than one function in a groupby-aggregate produces multi-index columns, which I then want to flatten.
Example:
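import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'A': [1,1,1,2,2,2,3,3,3],
     'B': np.random.random(9),
     'C': np.random.random(9)}
)
# string names are used here because recent pandas versions warn about np.mean-style callables
out = df.groupby('A').agg({'B': ['mean', 'std'], 'C': 'median'})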
# example output
          B                   C
       mean       std    median
A
1  0.791846  0.091657  0.394167
2  0.156290  0.202142  0.453871
3  0.482282  0.382391  0.892514
Currently, I do it manually like this:
out.columns = ['B_mean', 'B_std', 'C_median']
which gives me the result I want:
     B_mean     B_std  C_median
A
1  0.791846  0.091657  0.394167
2  0.156290  0.202142  0.453871
3  0.482282  0.382391  0.892514
But I'm looking for a way to automate this, because renaming the columns by hand is monotonous, time-consuming, and prone to typos.
Is there a way to return a flattened index instead of a multi-index when doing a groupby-aggregate?
I need to flatten the columns so I can save the result to a text file, which will then be read by a different program that doesn't handle multi-indexed columns.
Answered by YOBEN_S
You can map a join over the columns:
out.columns = out.columns.map('_'.join)
out
Out[23]:
     B_mean     B_std  C_median
A
1  0.204825  0.169408  0.926347
2  0.362184  0.404272  0.224119
3  0.533502  0.380614  0.218105
When the column labels contain integers, I like this way better, because '_'.join only accepts strings while format converts each label to text:
out.columns.map('{0[0]}_{0[1]}'.format)
Out[27]: Index(['B_mean', 'B_std', 'C_median'], dtype='object')
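To make the integer case concrete, here is a minimal sketch. The frame and its labels are made up for illustration and are not from the original answer. It shows '_'.join raising a TypeError on an integer label, while the format-based version goes through.

```python
import pandas as pd

# Hypothetical two-level columns whose second level holds integers.
cols = pd.MultiIndex.from_tuples([("B", 1), ("B", 2), ("C", 1)])
out = pd.DataFrame([[0.1, 0.2, 0.3]], columns=cols)

# '_'.join only accepts strings, so the integer labels raise a TypeError.
try:
    out.columns.map("_".join)
    join_failed = False
except TypeError:
    join_failed = True

# The format string renders each element of the tuple as text.
flat = out.columns.map("{0[0]}_{0[1]}".format)
print(join_failed, list(flat))
```

Note that the format string assumes exactly two column levels; a deeper MultiIndex would need more placeholders.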
Answered by llllllllll
You can use:
out.columns = list(map('_'.join, out.columns.values))
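As a rough illustration of what this line operates on: out.columns.values is an array of tuples, one per column, and map applies '_'.join to each tuple. The small frame below uses made-up numbers and string aggregation names, which are assumptions for the sake of a self-contained example.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [1, 1, 2, 2], "B": [1.0, 2.0, 3.0, 4.0], "C": [5.0, 6.0, 7.0, 8.0]}
)
out = df.groupby("A").agg({"B": ["mean", "std"], "C": "median"})

# Each column label is a tuple such as ('B', 'mean').
tuples = list(out.columns.values)

# Joining each tuple yields one flat string per column.
out.columns = list(map("_".join, tuples))
print(tuples)
print(out.columns.tolist())
```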
Answered by Julio
Since pandas version 0.24.0, you can use to_flat_index:
out.columns = [f"{x}_{y}" for x, y in out.columns.to_flat_index()]
     B_mean     B_std  C_median
A
1  0.779592  0.137168  0.583211
2  0.158010  0.229234  0.550383
3  0.186771  0.150575  0.313409
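Since the question is about automating this before writing to a text file, the same idea could be wrapped in a small helper. This is only a sketch: flatten_columns is a hypothetical name, not a pandas function, and the sample data is made up. It also skips empty level values, which appear after reset_index() on a frame with multi-index columns.

```python
import pandas as pd

def flatten_columns(frame: pd.DataFrame, sep: str = "_") -> pd.DataFrame:
    """Return a copy of frame with multi-index columns joined into flat strings."""
    flat = frame.copy()
    if isinstance(flat.columns, pd.MultiIndex):
        flat.columns = [
            # Skip empty level values so ('A', '') becomes 'A', not 'A_'.
            sep.join(str(part) for part in col if str(part) != "")
            for col in flat.columns.to_flat_index()
        ]
    return flat

df = pd.DataFrame(
    {"A": [1, 1, 2, 2], "B": [1.0, 2.0, 3.0, 4.0], "C": [5.0, 6.0, 7.0, 8.0]}
)
out = df.groupby("A").agg({"B": ["mean", "std"], "C": "median"})

# reset_index() adds the column ('A', ''), which the helper flattens to 'A'.
flat = flatten_columns(out.reset_index())
print(flat.columns.tolist())

# With no path given, to_csv returns the CSV text instead of writing a file.
csv_text = flat.to_csv(index=False)
```

Converting each part with str() also covers integer labels, at the cost of always producing string column names.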