Python Pandas: multi-level column names

Question

Asked by LondonRob

pandas has support for multi-level column names:

>>> import numpy as np
>>> import pandas as pd
>>> x = pd.DataFrame({'instance':['first','first','first'],'foo':['a','b','c'],'bar':np.random.rand(3)})
>>> x = x.set_index(['instance','foo']).transpose()
>>> x.columns
MultiIndex
[(u'first', u'a'), (u'first', u'b'), (u'first', u'c')]
>>> x
instance     first                    
foo              a         b         c
bar       0.102885  0.937838  0.907467

This feature is very useful because it lets multiple versions of the same dataframe be appended 'horizontally', with the first level of the column names (instance in my example) distinguishing the instances.

Imagine I already have a dataframe like this:

                 a         b         c
bar       0.102885  0.937838  0.907467

Is there a nice way to add another level to the column names, similar to what this does for the row index:

x['instance'] = 'first'
x.set_index('instance', append=True)

Answer 1

Accepted answer by Ian Zurutuza

No need to create a list of tuples

Use: pd.MultiIndex.from_product(iterables)

import pandas as pd
import numpy as np

df = pd.Series(np.random.rand(3), index=["a","b","c"]).to_frame().T
df.columns = pd.MultiIndex.from_product([["new_label"], df.columns])

Resultant DataFrame:

  new_label                    
          a         b         c
0   0.25999  0.337535  0.333568

Pull request from Jan 25, 2014
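
As a quick check on the from_product approach, the sketch below also labels the two column levels through the names argument and then selects the original columns back through the new top-level label. The level names instance and foo are borrowed from the question, and the fixed values are only illustrative.

```python
import pandas as pd

# One-row frame with flat columns a, b, c (illustrative fixed values).
df = pd.DataFrame([[0.1, 0.2, 0.3]], columns=["a", "b", "c"])

# Prepend a level; `names` labels the levels (names borrowed from the question).
df.columns = pd.MultiIndex.from_product(
    [["first"], df.columns], names=["instance", "foo"]
)

# Selecting the top-level label returns a frame with the flat columns again.
inner = df["first"]
print(df.columns.nlevels)
print(inner.columns.tolist())
```

Because from_product builds the Cartesian product of its inputs, a single-element outer list simply pairs that one label with every existing column.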

Answer 2

Answered by user3377361

Try this:

df=pd.DataFrame({'a':[1,2,3],'b':[4,5,6]})

columns=[('c','a'),('c','b')]

df.columns=pd.MultiIndex.from_tuples(columns)
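
To see what the from_tuples assignment gives you, here is a small sketch that reuses the same frame and tuples and then reads the data back, once with a full column tuple and once with only the top-level label. The variable names b_values and inner are just for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# One tuple per existing column, in order: (new top level, original name).
df.columns = pd.MultiIndex.from_tuples([('c', 'a'), ('c', 'b')])

# A single column is addressed by its full tuple.
b_values = df[('c', 'b')].tolist()

# The top-level label alone returns a frame with the original flat columns.
inner = df['c']

print(b_values)
print(inner.columns.tolist())
```

The list of tuples must have exactly one entry per existing column, otherwise the assignment to df.columns raises a length-mismatch error.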

Answer 3

Answered by Carl

You can use concat. Give it a dictionary of dataframes where the key is the new column level you want to add.

In [46]: d = {}

In [47]: d['first_level'] = pd.DataFrame(columns=['idx', 'a', 'b', 'c'],
                                         data=[[10, 0.89, 0.98, 0.31],
                                               [20, 0.34, 0.78, 0.34]]).set_index('idx')

In [48]: pd.concat(d, axis=1)
Out[48]:
    first_level
              a     b     c
idx
10         0.89  0.98  0.31
20         0.34  0.78  0.34

You can use the same technique to create multiple levels.

In [49]: d['second_level'] = pd.DataFrame(columns=['idx', 'a', 'b', 'c'],
                                          data=[[10, 0.29, 0.63, 0.99],
                                                [20, 0.23, 0.26, 0.98]]).set_index('idx')

In [50]: pd.concat(d, axis=1)
Out[50]:
    first_level             second_level
              a     b     c            a     b     c
idx
10         0.89  0.98  0.31         0.29  0.63  0.99
20         0.34  0.78  0.34         0.23  0.26  0.98
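
For the case in the question, where there is only one existing dataframe, a variant of the same idea is to pass a one-element list to concat together with keys, and optionally names to label the new level. This is a sketch under that assumption; the values are copied from the question's example and the level name instance is taken from there as well.

```python
import pandas as pd

# The flat frame from the question (values copied from its example output).
x = pd.DataFrame([[0.102885, 0.937838, 0.907467]],
                 index=['bar'], columns=['a', 'b', 'c'])

# keys supplies the label for the new outer column level,
# names gives that new level a name.
x = pd.concat([x], keys=['first'], names=['instance'], axis=1)

print(x.columns.tolist())
```

Compared with the dictionary form, keys makes the order of the outer labels explicit because it follows the order of the list.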

Answer 4

Answered by Charl

Here is a function that builds the list of tuples expected by pd.MultiIndex.from_tuples() a bit more generically. I got the idea from @user3377361.

def create_tuple_for_for_columns(df_a, multi_level_col):
    """
    Build the list of column tuples that pd.MultiIndex.from_tuples() needs to add a top column level

    :param df_a: pandas dataframe whose existing columns become the second level of the multi index
    :param multi_level_col: label for the new first (top) level
    :return: list of (multi_level_col, original_column) tuples
    """
    temp_columns = []
    # Pair the new top-level label with every existing column name.
    for item in df_a.columns:
        temp_columns.append((multi_level_col, item))
    return temp_columns

It can be used like this:

df = pd.DataFrame({'a':[1,2,3],'b':[4,5,6]})
columns = create_tuple_for_for_columns(df, 'c')
df.columns = pd.MultiIndex.from_tuples(columns)
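
If the columns may already be a MultiIndex, the same idea can be extended so that the new label is prepended to each existing tuple. The helper below is a hypothetical sketch, not part of pandas: the name prepend_column_level and its name parameter are assumptions made for illustration, and it returns a copy rather than modifying the input.

```python
import pandas as pd

def prepend_column_level(df, label, name=None):
    """Return a copy of df whose columns gain `label` as a new outermost level.

    Hypothetical helper for illustration; works for flat and MultiIndex columns.
    """
    new_cols = [
        # MultiIndex columns iterate as tuples, flat columns as scalars.
        (label, *col) if isinstance(col, tuple) else (label, col)
        for col in df.columns
    ]
    out = df.copy()
    # Keep any existing level names and put the new level's name first.
    out.columns = pd.MultiIndex.from_tuples(
        new_cols, names=[name, *df.columns.names]
    )
    return out

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
once = prepend_column_level(df, 'c')
twice = prepend_column_level(once, 'd')
print(list(twice.columns))
```

Returning a copy keeps the original frame's flat columns intact, which is convenient when several labelled versions of the same frame are needed.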