Unstack multiindex dataframe to flat data frame in pandas

Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/22779516/

Date: 2020-09-13 21:52:44  Source: igfitidea

Tags: python, pandas, ipython

Asked by fgs

I have a multi-index DataFrame called groupt3 in pandas, which looks like this when I enter groupt3.head():

                datetime     song   sum   rat
artist datetime
2562     8      2            2      26    0
         46     19           19     26    0
         47     3            3      26    0
4Hero    1      2            2      32    0
         26     20           20     32    0
         9      10           10     32    0

I would like a "flat" data frame that takes the artist index and the datetime index and repeats them on every row, to form this:

artist     date time    song   sum   rat
2562       8            2      26    0
2562       46           19     26    0
2562       47           3      26    0

etc...

Thanks.

Answered by fin

Use pandas.DataFrame.to_records(), then pass the resulting record array back to pd.DataFrame.

Example:

import pandas as pd
import numpy as np

# Two parallel lists, one per index level
arrays = [['Monday', 'Monday', 'Thursday', 'Thursday'],
          ['Morning', 'Noon', 'Morning', 'Evening']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['Weekday', 'Time'])
df = pd.DataFrame(np.random.randint(5, size=(4, 2)), index=index)

In [39]: df
Out[39]: 
                  0  1
Weekday  Time         
Monday   Morning  1  3
         Noon     2  1
Thursday Morning  3  3
         Evening  1  2

In [40]: pd.DataFrame(df.to_records())
Out[40]: 
    Weekday     Time  0  1
0    Monday  Morning  1  3
1    Monday     Noon  2  1
2  Thursday  Morning  3  3
3  Thursday  Evening  1  2
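
To make the intermediate step visible, here is a minimal sketch of what to_records() hands back before it is wrapped in a new DataFrame. The small frame and its column names "a" and "b" are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame with a two-level index (illustration only).
index = pd.MultiIndex.from_tuples(
    [("Monday", "Morning"), ("Monday", "Noon")], names=["Weekday", "Time"]
)
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=index)

# to_records() returns a NumPy record array; each index level becomes a field.
records = df.to_records()

# Wrapping the record array in a DataFrame gives a flat frame with a default index.
flat = pd.DataFrame(records)

print(records.dtype.names)
print(flat)
```

Because the round trip goes through NumPy, this route mainly makes sense when a record array is wanted anyway.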

Answered by jezrael

I think you can use reset_index, which moves the index levels back into regular columns:

import pandas as pd
import numpy as np

np.random.seed(0)  # fixed seed so the values below are reproducible
arrays = [['Monday', 'Monday', 'Thursday', 'Thursday'],
          ['Morning', 'Noon', 'Morning', 'Evening']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['Weekday', 'Time'])
df = pd.DataFrame(np.random.randint(5, size=(4, 2)), index=index)
print(df)
                  0  1
Weekday  Time         
Monday   Morning  4  0
         Noon     3  3
Thursday Morning  3  1
         Evening  3  2

print(df.reset_index())
    Weekday     Time  0  1
0    Monday  Morning  4  0
1    Monday     Noon  3  3
2  Thursday  Morning  3  1
3  Thursday  Evening  3  2
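
Applied to a frame shaped like the one in the question, the same call might look like the sketch below. The construction of groupt3 here is an assumption (only the printed values are taken from the question), and the names clash and datetime_level are hypothetical. The last part covers the case where a column already shares a name with an index level, which the question's header row suggests may be so for "datetime".

```python
import pandas as pd

# Hypothetical stand-in for groupt3; values mirror the question's table.
index = pd.MultiIndex.from_tuples(
    [("2562", 8), ("2562", 46), ("2562", 47), ("4Hero", 1)],
    names=["artist", "datetime"],
)
groupt3 = pd.DataFrame(
    {"song": [2, 19, 3, 2], "sum": [26, 26, 26, 32], "rat": [0, 0, 0, 0]},
    index=index,
)

# Move both index levels into ordinary columns.
flat = groupt3.reset_index()

# Move only one level out and keep "artist" as the index.
partly_flat = groupt3.reset_index(level="datetime")

# If a column already has the same name as an index level, reset_index()
# raises a ValueError, so rename the level first.
clash = groupt3.assign(datetime=groupt3["song"])
flat_clash = clash.rename_axis(index={"datetime": "datetime_level"}).reset_index()

print(flat)
```

reset_index returns a new frame by default, so groupt3 itself is left unchanged.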