pandas: How to do Multi-Column from_tuples?
Original question: http://stackoverflow.com/questions/37835508/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by Little Bobby Tables
I understand how to use pd.MultiIndex.from_tuples() to change something like
       Value
(A,a)      1
(B,a)      2
(B,b)      3
into
            Value
Caps Lower
A    a          1
B    a          2
B    b          3
But how do I change column tuples in the form
       (A, a)  (A, b)  (B, a)  (B, b)
index
1           1       2       2       3
2           2       3       3       2
3           3       4       4       1
into the form
Caps   A     B
Lower  a  b  a  b
index
1      1  2  2  3
2      2  3  3  2
3      3  4  4  1
Many thanks.
Edit: The reason I have a tuple column header is that when I joined a DataFrame with single-level columns onto a DataFrame with multi-level columns, the join turned the multi-level columns into tuples of strings and left the single-level column as a plain string.
Edit 2 - Alternate Solution: As stated, the problem here arose from a join between DataFrames with a different number of column levels. This meant the multi-level columns were reduced to tuples of strings. To get around this issue, prior to the join I used df.columns = [('col_level_0','col_level_1','col_level_2')] for the DataFrame I wished to join.
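A minimal sketch of the idea in Edit 2, under some assumptions: the frames, values, and labels below are hypothetical, and it builds the replacement columns explicitly with pd.MultiIndex.from_tuples rather than assigning a bare list of tuples, so that the result does not depend on how a given pandas version converts the list. The point is only that both frames have the same number of column levels before the join.

```python
import pandas as pd

# Hypothetical frame with two column levels.
multi = pd.DataFrame(
    [[1, 2], [3, 4]],
    columns=pd.MultiIndex.from_tuples(
        [("A", "a"), ("A", "b")], names=["Caps", "Lower"]
    ),
)

# Hypothetical frame with a single column level.
single = pd.DataFrame({"C": [10, 20]})

# Give the single-level frame two column levels *before* joining,
# so both sides of the join have the same number of levels.
single.columns = pd.MultiIndex.from_tuples([("C", "c")])

joined = multi.join(single)
print(joined)
```

The ("C", "c") label is an arbitrary choice for illustration; any tuple with as many entries as the other frame has column levels should serve the same purpose.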
Answered by EdChum
Assign the result of pd.MultiIndex.from_tuples directly to columns, passing in your existing columns:
In [186]:
import numpy as np
import pandas as pd

l = [('A', 'a'), ('A', 'b'), ('B', 'a'), ('B', 'b')]
df = pd.DataFrame(np.random.randn(5, 4), columns=l)
df
Out[186]:
     (A, a)    (A, b)    (B, a)    (B, b)
0 -0.876353  0.553742  1.631858 -0.561309
1  0.463058 -0.455014 -0.491336 -1.436059
2  0.337810  0.233624 -0.571749 -2.259763
3  1.073057 -0.475894  0.999643 -0.379743
4  0.441800  0.311202 -0.191552  0.291268
In [187]:
df.columns = pd.MultiIndex.from_tuples(df.columns, names=['Caps','Lower'])
df
Out[187]:
Caps          A                   B
Lower         a         b         a         b
0     -0.876353  0.553742  1.631858 -0.561309
1      0.463058 -0.455014 -0.491336 -1.436059
2      0.337810  0.233624 -0.571749 -2.259763
3      1.073057 -0.475894  0.999643 -0.379743
4      0.441800  0.311202 -0.191552  0.291268
Note that you can also assign directly to the names attribute of columns, like the following:
df.columns.names = ['Caps','Lower']
This is not to be confused with the name attribute (singular).
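To make the distinction concrete, here is a small sketch (the frames and the labels "Letter", "x", and "y" are made up for illustration): names holds one label per level of a MultiIndex, while name is the single label typically set on a one-level Index.

```python
import pandas as pd

# Two-level columns: label the levels through the plural `names`.
multi = pd.DataFrame(
    [[1, 2, 2, 3]],
    columns=pd.MultiIndex.from_tuples(
        [("A", "a"), ("A", "b"), ("B", "a"), ("B", "b")]
    ),
)
multi.columns.names = ["Caps", "Lower"]
print(list(multi.columns.names))  # ['Caps', 'Lower']

# Single-level columns: the singular `name` labels the one level.
flat = pd.DataFrame({"x": [1], "y": [2]})
flat.columns.name = "Letter"
print(flat.columns.name)  # Letter
```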
Answered by jezrael
Another solution is to use MultiIndex.from_tuples with the names parameter:
import pandas as pd
df = pd.DataFrame({'Value': [1,2,3]}, index=[('A','a'),('B','a'),('B','b')])
print(df)
        Value
(A, a)      1
(B, a)      2
(B, b)      3
df.index = pd.MultiIndex.from_tuples(df.index, names=['Caps','Lower'])
print(df)
            Value
Caps Lower
A    a          1
B    a          2
     b          3
The same works with columns; see EdChum's answer:
df.columns = pd.MultiIndex.from_tuples(df.columns, names=['Caps','Lower'])
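For completeness, here is a sketch that applies this to the exact table from the question. It assumes the column labels start out as a flat Index of tuples; pd.Index(..., tupleize_cols=False) is used here only to force that starting state for the illustration.

```python
import pandas as pd

# Column labels from the question, kept as a flat Index of tuples.
tuples = [("A", "a"), ("A", "b"), ("B", "a"), ("B", "b")]
df = pd.DataFrame(
    [[1, 2, 2, 3], [2, 3, 3, 2], [3, 4, 4, 1]],
    index=pd.Index([1, 2, 3], name="index"),
    columns=pd.Index(tuples, tupleize_cols=False),
)

# Convert the tuple labels into a two-level column MultiIndex.
df.columns = pd.MultiIndex.from_tuples(df.columns, names=["Caps", "Lower"])
print(df)

# Selecting on the outer level now returns the matching sub-frame.
print(df["A"])
```

Once the columns are a MultiIndex, an individual column can be addressed with a tuple, for example df[("B", "b")].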