pandas 在熊猫中合并多索引数据框
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/40539377/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Merging multiindex dataframe in pandas
提问by agf1997
I have 2 similar data frames structured like this :
我有 2 个类似的数据框,结构如下:
ind = pd.MultiIndex.from_product([['Day 1','Day 2'],['D1','D2'],['Mean','StDev','StErr']], names = ['interval','device','stats'])
df = pd.DataFrame({'col1':[1,2,3,4,5,6,7,8,9,10,11,12]}, index = ind)
print(df)
col1
interval device stats
Day 1 D1 Mean 1
StDev 2
StErr 3
D2 Mean 4
StDev 5
StErr 6
Day 2 D1 Mean 7
StDev 8
StErr 9
D2 Mean 10
StDev 11
StErr 12
ind2 = pd.MultiIndex.from_product([['Day 1','Day 2'],['D1','D2'],['Ratio']], names = ['interval','device','stats'])
df2 = pd.DataFrame({'col1':[100,200,300,400]}, index = ind2)
print(df2)
col1
interval device stats
Day 1 D1 Ratio 100
D2 Ratio 200
Day 2 D1 Ratio 300
D2 Ratio 400
I'm trying to merge them to get this :
我正在尝试合并它们以获得这个:
col1
interval device stats
Day 1 D1 Mean 1
StDev 2
StErr 3
Ratio 100
D2 Mean 4
StDev 5
StErr 6
Ratio 200
Day 2 D1 Mean 7
StDev 8
StErr 9
Ratio 300
D2 Mean 10
StDev 11
StErr 12
Ratio 400
I tried a bunch of different things using join
, concat
, and merge
but the closest I've been able to get is using df3 = pd.concat([df, df2], axis=1)
. Unfortunately that gives me this :
我尝试了一堆不同的东西使用join
,concat
以及merge
但是最近我已经能够用得到df3 = pd.concat([df, df2], axis=1)
。不幸的是,这给了我这个:
col1 col1
interval device stats
Day 1 D1 Mean 1 NaN
Ratio NaN 100
StDev 2 NaN
StErr 3 NaN
D2 Mean 4 NaN
Ratio NaN 200
StDev 5 NaN
StErr 6 NaN
Day 2 D1 Mean 7 NaN
Ratio NaN 300
StDev 8 NaN
StErr 9 NaN
D2 Mean 10 NaN
Ratio NaN 400
StDev 11 NaN
StErr 12 NaN
回答by root
Don't use axis=1
when using concat
, as it means appending column-wise, not row-wise. You want axis=0
for row-wise, which happens to be the default, so you don't need to specify it:
使用axis=1
时不要使用concat
,因为这意味着按列添加,而不是按行添加。你想要axis=0
按行,这恰好是默认值,所以你不需要指定它:
df3 = pd.concat([df, df2]).sort_index()
The resulting output:
结果输出:
col1
interval device stats
Day 1 D1 Mean 1
Ratio 100
StDev 2
StErr 3
D2 Mean 4
Ratio 200
StDev 5
StErr 6
Day 2 D1 Mean 7
Ratio 300
StDev 8
StErr 9
D2 Mean 10
Ratio 400
StDev 11
StErr 12