Python Pandas: Multilevel column names
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/21443963/
Pandas: Multilevel column names
Asked by LondonRob
pandas has support for multi-level column names:
>>> import numpy as np
>>> import pandas as pd
>>> x = pd.DataFrame({'instance': ['first', 'first', 'first'], 'foo': ['a', 'b', 'c'], 'bar': np.random.rand(3)})
>>> x = x.set_index(['instance', 'foo']).transpose()
>>> x.columns
MultiIndex
[(u'first', u'a'), (u'first', u'b'), (u'first', u'c')]
>>> x
instance     first
foo              a         b         c
bar       0.102885  0.937838  0.907467
This feature is very useful since it allows multiple versions of the same dataframe to be appended 'horizontally', with the first level of the column names (instance in my example) distinguishing the instances.
Imagine I already have a dataframe like this:
            a         b         c
bar  0.102885  0.937838  0.907467
Is there a nice way to add another level to the column names, similar to how this is done for the row index:
x['instance'] = 'first'
x.set_index('instance', append=True)
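For reference, the row-index pattern the question alludes to can be sketched with DataFrame.set_index and append=True. This is a minimal illustration, assuming a one-row frame built from the values shown in the question; the frame construction itself is not from the original post.

```python
import pandas as pd

# One-row frame mirroring the question's sample (values copied from the post)
x = pd.DataFrame([[0.102885, 0.937838, 0.907467]],
                 index=["bar"], columns=["a", "b", "c"])

# Add a helper column, then append it to the existing row index as a new level
x["instance"] = "first"
x = x.set_index("instance", append=True)

# The row index is now a two-level MultiIndex
print(x.index.tolist())
```

The question asks for the equivalent of this on the column axis.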
Accepted answer by Ian Zurutuza
No need to create a list of tuples.
Use: pd.MultiIndex.from_product(iterables)
import pandas as pd
import numpy as np
df = pd.Series(np.random.rand(3), index=["a","b","c"]).to_frame().T
df.columns = pd.MultiIndex.from_product([["new_label"], df.columns])
Resulting DataFrame:
  new_label
          a         b         c
0   0.25999  0.337535  0.333568
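As a minimal sketch, the same call can be applied to a frame shaped like the one in the question. The names= argument and the labels "first", "instance" and "foo" are chosen here to mirror the question's example; they are assumptions for illustration, not part of the answer above.

```python
import pandas as pd

# Flat-column frame like the one the asker already has
df = pd.DataFrame([[0.102885, 0.937838, 0.907467]],
                  index=["bar"], columns=["a", "b", "c"])

# Prepend a new outer level and name both levels in one step
df.columns = pd.MultiIndex.from_product(
    [["first"], df.columns], names=["instance", "foo"]
)

print(df.columns.tolist())

# Selecting the outer label gives back the original flat columns
print(df["first"])
```

Naming the levels is optional, but it makes later calls such as selecting or stacking by level name easier to read.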
Answer by user3377361
Try this:
df=pd.DataFrame({'a':[1,2,3],'b':[4,5,6]})
columns=[('c','a'),('c','b')]
df.columns=pd.MultiIndex.from_tuples(columns)
Answer by Carl
You can use concat. Give it a dictionary of dataframes where the key is the new column level you want to add.
In [46]: d = {}
In [47]: d['first_level'] = pd.DataFrame(columns=['idx', 'a', 'b', 'c'],
                                         data=[[10, 0.89, 0.98, 0.31],
                                               [20, 0.34, 0.78, 0.34]]).set_index('idx')
In [48]: pd.concat(d, axis=1)
Out[48]:
    first_level
              a     b     c
idx
10         0.89  0.98  0.31
20         0.34  0.78  0.34
You can use the same technique to create multiple levels.
In [49]: d['second_level'] = pd.DataFrame(columns=['idx', 'a', 'b', 'c'],
                                          data=[[10, 0.29, 0.63, 0.99],
                                                [20, 0.23, 0.26, 0.98]]).set_index('idx')
In [50]: pd.concat(d, axis=1)
Out[50]:
    first_level             second_level
              a     b     c            a     b     c
idx
10         0.89  0.98  0.31         0.29  0.63  0.99
20         0.34  0.78  0.34         0.23  0.26  0.98
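For the single-frame case in the question, the concat approach might look like the sketch below. It assumes a one-entry dictionary whose key, "first", becomes the new outer column label, and it uses the names argument of pd.concat to label that new level; the variable name wrapped is illustrative.

```python
import pandas as pd

# Flat-column frame like the one in the question
df = pd.DataFrame([[0.102885, 0.937838, 0.907467]],
                  index=["bar"], columns=["a", "b", "c"])

# A one-entry dict wraps the existing frame under a new top-level label.
# names=["instance"] labels the new outer level; the inner level stays unnamed.
wrapped = pd.concat({"first": df}, axis=1, names=["instance"])

print(wrapped.columns.tolist())
```

Unlike assigning to df.columns, this returns a new frame and leaves df unchanged.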
Answer by Charl
Here is a function that builds the list of tuples expected by pd.MultiIndex.from_tuples(), a bit more generically. I got the idea from @user3377361.
def create_tuple_for_for_columns(df_a, multi_level_col):
    """
    Create the column tuples that pd.MultiIndex.from_tuples() needs to build multi-level columns
    :param df_a: pandas dataframe whose existing columns become the inner (second) level
    :param multi_level_col: label for the new outer (first) level
    :return: list of (multi_level_col, existing_column) tuples
    """
    temp_columns = []
    for item in df_a.columns:
        temp_columns.append((multi_level_col, item))
    return temp_columns
It can be used like this:
df = pd.DataFrame({'a':[1,2,3],'b':[4,5,6]})
columns = create_tuple_for_for_columns(df, 'c')
df.columns = pd.MultiIndex.from_tuples(columns)
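The loop inside the helper could also be written as a list comprehension. The following is a minimal sketch of that variant on the same example frame; the names tuples and inner are illustrative and not from the answer.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# Pair the new outer label 'c' with every existing column name
tuples = [('c', col) for col in df.columns]
df.columns = pd.MultiIndex.from_tuples(tuples)

# Selecting the outer label recovers a frame with the original flat columns
inner = df['c']
print(inner.columns.tolist())
```

When every column gets the same outer label, this produces the same MultiIndex as pd.MultiIndex.from_product([['c'], df.columns]).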

