Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not this site). Original question: http://stackoverflow.com/questions/36431413/

Time: 2020-09-14 01:00:07  Source: igfitidea

Pandas Melt on Multi-index Columns Without Manually Specifying Levels (Python 3.5.1)

python pandas

Asked by Vincent

I have a Pandas DataFrame that looks something like:

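import pandas as pd

# Six columns of sample data, keyed by row label.
df = pd.DataFrame({'col1': {0: 'a', 1: 'b', 2: 'c'},
                   'col2': {0: 1, 1: 3, 2: 5},
                   'col3': {0: 2, 1: 4, 2: 6},
                   'col4': {0: 3, 1: 6, 2: 2},
                   'col5': {0: 7, 1: 2, 2: 3},
                   'col6': {0: 2, 1: 9, 2: 5},
                  })
# Replace the flat column names with a three-level MultiIndex.
df.columns = [list('AAAAAA'), list('BBCCDD'), list('EFGHIJ')]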


    A
    B       C       D
    E   F   G   H   I   J
0   a   1   2   3   7   2
1   b   3   4   6   2   9
2   c   5   6   2   3   5

I basically just want to melt the data frame so that each column level becomes a new column. For this small example, I can get what I want simply by calling pd.melt() with every column tuple listed explicitly:

pd.melt(df, value_vars=[('A', 'B', 'E'),
                        ('A', 'B', 'F'),
                        ('A', 'C', 'G'),
                        ('A', 'C', 'H'),
                        ('A', 'D', 'I'),
                        ('A', 'D', 'J')])

However, in my real use case there are many initial columns (a lot more than 6), so I would like to generalize this and avoid spelling out every tuple in value_vars. Is there a way to do that? I'm basically looking for a way to tell pd.melt to set value_vars to a list of tuples in which the first element of each tuple is the first column level, the second is the second column level, and the third is the third column level.

Accepted answer by unutbu

If you don't specify value_vars, then all columns (that are not specified as id_vars) are used by default:

In [10]: pd.melt(df)
Out[10]: 
   variable_0 variable_1 variable_2 value
0           A          B          E     a
1           A          B          E     b
2           A          B          E     c
3           A          B          F     1
4           A          B          F     3
...
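
As a small follow-up sketch: the variable_0, variable_1, variable_2 headers appear because the column levels are unnamed. If the levels are given distinct names, pd.melt should use those names instead. The level names 'top', 'mid' and 'leaf' below are made up for illustration and are not part of the original question.

```python
import pandas as pd

# Same data as in the question, written with plain lists.
df = pd.DataFrame({
    'col1': ['a', 'b', 'c'],
    'col2': [1, 3, 5],
    'col3': [2, 4, 6],
    'col4': [3, 6, 2],
    'col5': [7, 2, 3],
    'col6': [2, 9, 5],
})

# Three-level column index; the level names are hypothetical.
df.columns = pd.MultiIndex.from_arrays(
    [list('AAAAAA'), list('BBCCDD'), list('EFGHIJ')],
    names=['top', 'mid', 'leaf'],
)

# No value_vars needed: every column is melted by default.
melted = pd.melt(df)
print(melted.columns.tolist())
print(melted.shape)
```

Each of the six original columns contributes three rows, so the melted frame has 18 rows, with one column per level plus the value column.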

However, if for some reason you do need to generate the list of column-tuples, you could use df.columns.tolist():

In [57]: df.columns.tolist()
Out[57]: 
[('A', 'B', 'E'),
 ('A', 'B', 'F'),
 ('A', 'C', 'G'),
 ('A', 'C', 'H'),
 ('A', 'D', 'I'),
 ('A', 'D', 'J')]

In [56]: pd.melt(df, value_vars=df.columns.tolist())
Out[56]: 
   variable_0 variable_1 variable_2 value
0           A          B          E     a
1           A          B          E     b
2           A          B          E     c
3           A          B          F     1
4           A          B          F     3
...
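
One case where building the tuple list yourself may be useful is melting only some of the columns. A minimal sketch, assuming you want just the columns whose second level is 'C' or 'D' (that filter is an invented example, not something from the question):

```python
import pandas as pd

df = pd.DataFrame({
    'col1': ['a', 'b', 'c'],
    'col2': [1, 3, 5],
    'col3': [2, 4, 6],
    'col4': [3, 6, 2],
    'col5': [7, 2, 3],
    'col6': [2, 9, 5],
})
df.columns = [list('AAAAAA'), list('BBCCDD'), list('EFGHIJ')]

# Iterating over a MultiIndex yields one tuple per column,
# so the tuples can be filtered like any other list.
wanted = [col for col in df.columns if col[1] in ('C', 'D')]

# value_vars must be a list of tuples when the columns are a MultiIndex.
melted = pd.melt(df, value_vars=wanted)
print(wanted)
print(len(melted))
```

The same pattern works for any condition on the levels, since each tuple holds the labels in level order.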

Answer by pyrocarm

I had this same question, but my base dataset was actually just a Series with a 3-level MultiIndex. I found this answer for 'melting' a Series into a DataFrame in this post: https://discuss.analyticsvidhya.com/t/how-to-convert-the-multi-index-series-into-a-data-frame-in-python/5119/2

Basically, you just use the DataFrame constructor on the Series and it does exactly what you want melt to do.

pd.DataFrame(series)
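
A caveat worth checking on your own data: pd.DataFrame(series) wraps the values in a one-column frame but keeps the three levels in the index. To turn the levels into ordinary columns, as melt does, reset_index() is one option. The sample Series and the level names 'l0', 'l1', 'l2' below are made up for illustration.

```python
import pandas as pd

# Hypothetical Series with a 3-level MultiIndex.
index = pd.MultiIndex.from_tuples(
    [('A', 'B', 'E'), ('A', 'B', 'F'), ('A', 'C', 'G')],
    names=['l0', 'l1', 'l2'],
)
series = pd.Series([1, 3, 2], index=index, name='value')

# The constructor alone leaves the levels in the index.
framed = pd.DataFrame(series)
print(framed.shape)

# reset_index() moves each index level into its own column.
flat = series.reset_index()
print(flat.columns.tolist())
```

If the Series has no name, reset_index(name='value') can be used to label the values column.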