Unstack MultiIndex DataFrame to flat DataFrame in pandas
Original question: http://stackoverflow.com/questions/22779516/
This content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.
Asked by fgs
I have a multi index df called groupt3 in pandas which looks like this when I enter groupt3.head():
                 datetime  song  sum  rat
artist datetime
2562   8                2     2   26    0
       46              19    19   26    0
       47               3     3   26    0
4Hero  1                2     2   32    0
       26              20    20   32    0
       9               10    10   32    0
I would like to have a "flat" data frame that takes the artist index and the datetime index and repeats them on every row, like this:
artist  datetime  song  sum  rat
2562           8     2   26    0
2562          46    19   26    0
2562          47     3   26    0
etc...
Thanks.
Answered by fin
Use pandas.DataFrame.to_records() and pass the result back to pd.DataFrame.
Example:
import pandas as pd
import numpy as np

# Build a two-level MultiIndex (Weekday, Time) from parallel lists
arrays = [['Monday', 'Monday', 'Thursday', 'Thursday'],
          ['Morning', 'Noon', 'Morning', 'Evening']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['Weekday', 'Time'])
# Random integers in [0, 5); the values differ from run to run
df = pd.DataFrame(np.random.randint(5, size=(4, 2)), index=index)

In [39]: df
Out[39]:
                  0  1
Weekday  Time
Monday   Morning  1  3
         Noon     2  1
Thursday Morning  3  3
         Evening  1  2

In [40]: pd.DataFrame(df.to_records())
Out[40]:
    Weekday     Time  0  1
0    Monday  Morning  1  3
1    Monday     Noon  2  1
2  Thursday  Morning  3  3
3  Thursday  Evening  1  2
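A note on the to_records() route: as far as I can tell, it converts the column labels to strings, so the integer columns 0 and 1 come back as '0' and '1'. The sketch below compares it with reset_index() on a small frame. The fixed np.arange values and the names via_records and via_reset are my own, chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Same index layout as the example above, with fixed values instead of random ones
index = pd.MultiIndex.from_tuples(
    [("Monday", "Morning"), ("Monday", "Noon"),
     ("Thursday", "Morning"), ("Thursday", "Evening")],
    names=["Weekday", "Time"],
)
df = pd.DataFrame(np.arange(8).reshape(4, 2), index=index)

# Two ways to move the index levels into ordinary columns
via_records = pd.DataFrame(df.to_records())
via_reset = df.reset_index()

# The cell values should agree between the two results
print(via_records.values.tolist() == via_reset.values.tolist())

# The column labels may differ in type: strings after to_records(),
# the original integers after reset_index()
print(list(via_records.columns))
print(list(via_reset.columns))
```

If later code selects columns by their integer labels, reset_index() is probably the safer choice.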
Answered by jezrael
I think you can use reset_index:
import pandas as pd
import numpy as np

# Fix the seed so the random values below are reproducible
np.random.seed(0)
arrays = [['Monday', 'Monday', 'Thursday', 'Thursday'],
          ['Morning', 'Noon', 'Morning', 'Evening']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['Weekday', 'Time'])
df = pd.DataFrame(np.random.randint(5, size=(4, 2)), index=index)

print(df)
                  0  1
Weekday  Time
Monday   Morning  4  0
         Noon     3  3
Thursday Morning  3  1
         Evening  3  2

# reset_index() moves both index levels into regular columns
print(df.reset_index())
    Weekday     Time  0  1
0    Monday  Morning  4  0
1    Monday     Noon  3  3
2  Thursday  Morning  3  1
3  Thursday  Evening  3  2
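Applied to the frame in the question, one caveat seems worth noting: groupt3 has an index level named datetime and also a column named datetime, and a plain reset_index() refuses to insert a column whose name already exists (it raises ValueError). The sketch below rebuilds a frame shaped like the one in the question. The values are copied from the printed head, but the construction itself, including storing artist as strings, is an assumption. It drops the duplicate-named column first, which also matches the desired output.

```python
import pandas as pd

# Hypothetical reconstruction of the asker's frame: index levels
# "artist" and "datetime", plus a column that is also named "datetime"
index = pd.MultiIndex.from_tuples(
    [("2562", 8), ("2562", 46), ("2562", 47),
     ("4Hero", 1), ("4Hero", 26), ("4Hero", 9)],
    names=["artist", "datetime"],
)
groupt3 = pd.DataFrame(
    {"datetime": [2, 19, 3, 2, 20, 10],
     "song": [2, 19, 3, 2, 20, 10],
     "sum": [26, 26, 26, 32, 32, 32],
     "rat": [0, 0, 0, 0, 0, 0]},
    index=index,
)

# Drop the column that clashes with the index level name, then move
# both index levels into ordinary columns
flat = groupt3.drop(columns="datetime").reset_index()
print(flat)
```

If that column needs to be kept, renaming it before calling reset_index(), for example with groupt3.rename(columns={'datetime': 'plays'}) where 'plays' is just a placeholder name, should also avoid the clash.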

