pandas: How to do Multi-Column from_tuples?

Notice: this page is a Chinese-English translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/37835508/

Time: 2020-09-14 01:23:58  Source: igfitidea

How to do Multi-Column from_tuples?

Tags: pandas, multiple-columns, multi-index, columnname

Asked by Little Bobby Tables

I get how to use pd.MultiIndex.from_tuples() in order to change something like

       Value
(A,a)  1
(B,a)  2
(B,b)  3

into


                Value
Caps Lower      
A    a          1
B    a          2
B    b          3

But how do I change column tuples in the form


       (A, a)  (A, b) (B,a)  (B,b)
index
1      1       2      2      3
2      2       3      3      2
3      3       4      4      1

into the form


 Caps         A              B
 Lower        a       b      a      b
 index
 1            1       2      2      3
 2            2       3      3      2
 3            3       4      4      1

Many thanks.




Edit: The reason I have tuple column headers is that when I joined a DataFrame with single-level columns onto a DataFrame with multi-level columns, the join turned the multi-level columns into tuples of strings and left the single-level column as a plain string.



Edit 2 - Alternate Solution: As stated, the problem here arose via a join between DataFrames with differing numbers of column levels. This meant the Multi-Column was reduced to a tuple of strings. To get around this issue, prior to the join I used df.columns = [('col_level_0','col_level_1','col_level_2')] for the DataFrame I wished to join.
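The workaround in Edit 2 can be sketched roughly as follows. The frames `multi` and `single` and their labels are hypothetical stand-ins, not the asker's data, and this sketch assumes it is acceptable to build the new columns with pd.MultiIndex.from_tuples rather than assigning a bare list of tuples, so that both frames have the same number of column levels before the join:

```python
import pandas as pd

# Hypothetical frame with two-level columns (labels are made up).
multi = pd.DataFrame(
    [[1, 2], [3, 4]],
    columns=pd.MultiIndex.from_tuples(
        [("A", "a"), ("A", "b")], names=["Caps", "Lower"]
    ),
)

# Hypothetical frame with ordinary single-level columns.
single = pd.DataFrame({"c": [5, 6]})

# Promote the single-level columns to two levels before joining,
# so both frames agree on the number of column levels.
single.columns = pd.MultiIndex.from_tuples(
    [("C", "c")], names=["Caps", "Lower"]
)

# Join on the (default) row index; the columns stay a MultiIndex.
joined = multi.join(single)
print(joined)
```

With matching level counts, the joined frame should keep MultiIndex columns, so no tuple headers need repairing afterwards.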

Answered by EdChum

Assign directly to columns the result of pd.MultiIndex.from_tuples, passing in your existing columns:

In [186]:
import numpy as np
import pandas as pd

l = [('A', 'a'), ('A', 'b'), ('B', 'a'), ('B', 'b')]
df = pd.DataFrame(np.random.randn(5, 4), columns=l)
df

Out[186]:
     (A, a)    (A, b)    (B, a)    (B, b)
0 -0.876353  0.553742  1.631858 -0.561309
1  0.463058 -0.455014 -0.491336 -1.436059
2  0.337810  0.233624 -0.571749 -2.259763
3  1.073057 -0.475894  0.999643 -0.379743
4  0.441800  0.311202 -0.191552  0.291268

In [187]:    
df.columns = pd.MultiIndex.from_tuples(df.columns, names=['Caps','Lower'])
df

Out[187]:
Caps          A                   B          
Lower         a         b         a         b
0     -0.876353  0.553742  1.631858 -0.561309
1      0.463058 -0.455014 -0.491336 -1.436059
2      0.337810  0.233624 -0.571749 -2.259763
3      1.073057 -0.475894  0.999643 -0.379743
4      0.441800  0.311202 -0.191552  0.291268

Note that you can also assign directly to the names attribute of the columns attribute, like the following:

df.columns.names = ['Caps','Lower']

Not to be confused with the name attribute.
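As a small illustration of assigning level names after the fact, here is a sketch with made-up values: the MultiIndex is first built without names, then names is set on the existing columns. The variable `level_names` is only there for inspection.

```python
import pandas as pd

# Build two-level columns without naming the levels.
df = pd.DataFrame([[1, 2, 2, 3], [2, 3, 3, 2]])
df.columns = pd.MultiIndex.from_tuples(
    [("A", "a"), ("A", "b"), ("B", "a"), ("B", "b")]
)

# Name the levels afterwards: one name per level, via `names` (plural).
df.columns.names = ["Caps", "Lower"]

level_names = list(df.columns.names)
print(level_names)
```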

Answered by jezrael

Another solution is to use MultiIndex.from_tuples with the names parameter:

import pandas as pd

df = pd.DataFrame({'Value': [1,2,3]}, index=[('A','a'),('B','a'),('B','b')])
print (df)
        Value
(A, a)      1
(B, a)      2
(B, b)      3

df.index = pd.MultiIndex.from_tuples(df.index, names=['Caps','Lower'])
print (df)
            Value
Caps Lower       
A    a          1
B    a          2
     b          3

The same works with columns; see EdChum's answer:

df.columns = pd.MultiIndex.from_tuples(df.columns, names=['Caps','Lower'])
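Putting it together with the data from the question, a self-contained sketch might look like this. The variable names `tuples`, `sub`, and `b_b` are chosen here for illustration only:

```python
import pandas as pd

# The column tuples and values from the question.
tuples = [("A", "a"), ("A", "b"), ("B", "a"), ("B", "b")]
df = pd.DataFrame(
    [[1, 2, 2, 3], [2, 3, 3, 2], [3, 4, 4, 1]],
    index=pd.Index([1, 2, 3], name="index"),
    columns=tuples,
)

# Replace the tuple headers with a named two-level MultiIndex.
df.columns = pd.MultiIndex.from_tuples(df.columns, names=["Caps", "Lower"])
print(df)

# Selecting on the outer level returns the sub-frame under "A".
sub = df["A"]

# A full tuple key selects a single column.
b_b = df[("B", "b")].tolist()
```

Once the columns are a real MultiIndex, selections such as df["A"] work per level, which is not available while the headers are plain tuples.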