Python 嵌套字典到多索引数据框,其中字典键是列标签
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/24988131/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Nested dictionary to multiindex dataframe where dictionary keys are column labels
提问by pbreach
Say I have a dictionary that looks like this:
假设我有一本看起来像这样的字典:
dictionary = {'A' : {'a': [1,2,3,4,5],
'b': [6,7,8,9,1]},
'B' : {'a': [2,3,4,5,6],
'b': [7,8,9,1,2]}}
and I want a dataframe that looks something like this:
我想要一个看起来像这样的数据框:
A B
a b a b
0 1 6 2 7
1 2 7 3 8
2 3 8 4 9
3 4 9 5 1
4 5 1 6 2
Is there a convenient way to do this? If I try:
有没有方便的方法来做到这一点?如果我尝试:
In [99]:
DataFrame(dictionary)
Out[99]:
A B
a [1, 2, 3, 4, 5] [2, 3, 4, 5, 6]
b [6, 7, 8, 9, 1] [7, 8, 9, 1, 2]
I get a dataframe where each element is a list. What I need is a multiindex where each level corresponds to the keys in the nested dict and the rows corresponding to each element in the list as shown above. I think I can work a very crude solution but I'm hoping there might be something a bit simpler.
我得到一个数据框,其中每个元素都是一个列表。我需要的是一个多索引,其中每个级别对应于嵌套字典中的键以及对应于列表中每个元素的行,如上所示。我想我可以使用一个非常粗略的解决方案,但我希望可能有一些更简单的方法。
采纳答案by BrenBarn
Pandas wants the MultiIndex values as tuples, not nested dicts. The simplest thing is to convert your dictionary to the right format before trying to pass it to DataFrame:
Pandas 想要 MultiIndex 值作为元组,而不是嵌套的字典。最简单的方法是在尝试将字典传递给 DataFrame 之前将其转换为正确的格式:
>>> reform = {(outerKey, innerKey): values for outerKey, innerDict in dictionary.iteritems() for innerKey, values in innerDict.iteritems()}
>>> reform
{('A', 'a'): [1, 2, 3, 4, 5],
('A', 'b'): [6, 7, 8, 9, 1],
('B', 'a'): [2, 3, 4, 5, 6],
('B', 'b'): [7, 8, 9, 1, 2]}
>>> pandas.DataFrame(reform)
A B
a b a b
0 1 6 2 7
1 2 7 3 8
2 3 8 4 9
3 4 9 5 1
4 5 1 6 2
[5 rows x 4 columns]
回答by user8227892
dict_of_df = {k: pd.DataFrame(v) for k,v in dictionary.items()}
df = pd.concat(dict_of_df, axis=1)
Note that the order of columns is lost for python < 3.6
请注意,python < 3.6 的列顺序丢失
回答by Vira
This answer is a little late to the game, but...
这个答案有点晚了,但是......
You're looking for the functionality in .stack
:
您正在寻找以下功能.stack
:
df = pandas.DataFrame.from_dict(dictionary, orient="index").stack().to_frame()
# to break out the lists into columns
df = pd.DataFrame(df[0].values.tolist(), index=df.index)