Python: How to turn a dictionary of DataFrames into one big DataFrame with column names being the keys of the previous dict?
Original question: http://stackoverflow.com/questions/35717706/
Source: StackOverflow, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.
Asked by pakkunrob
My dataframe is built from lots of individual Excel files, each with the date as its file name and the prices of the fruits on that day in the spreadsheet, so the spreadsheets look something like this:
15012016:
Fruit Price
Orange 1
Apple 2
Pear 3
16012016:
Fruit Price
Orange 4
Apple 5
Pear 6
17012016:
Fruit Price
Orange 7
Apple 8
Pear 9
To put all that information together, I run the following code, which loads everything into a dictionary of dataframes (all fruit price files are stored in 'C:\Fruit_Prices_by_Day'):
import os
import pandas as pd

folder = r'C:\Fruit_Prices_by_Day'

# find all the file names, stripping the extension so each name is just the date
file_list = [os.path.splitext(x)[0] for x in os.listdir(folder)]

# read each spreadsheet into a dict keyed by date
d = {}
for date in file_list:
    df1 = pd.read_excel(os.path.join(folder, date + '.xlsx'), index_col='Fruit')
    d[date] = df1
This is the part where I'm stuck. How do I turn this dict into a single dataframe whose column names are the dict keys, i.e. the dates, so that I get the price of each fruit per day in the same dataframe, like this:
15012016 16012016 17012016
Orange 1 4 7
Apple 2 5 8
Pear 3 6 9
Answered by jezrael
You can first call set_index on every dataframe in a dict comprehension, then use concat and drop the last level of the resulting MultiIndex in the columns:
print(d)
{'17012016': Fruit Price
0 Orange 7
1 Apple 8
2 Pear 9, '16012016': Fruit Price
0 Orange 4
1 Apple 5
2 Pear 6, '15012016': Fruit Price
0 Orange 1
1 Apple 2
2 Pear 3}
d = { k: v.set_index('Fruit') for k, v in d.items()}
df = pd.concat(d, axis=1)
df.columns = df.columns.droplevel(-1)
print(df)
15012016 16012016 17012016
Fruit
Orange 1 4 7
Apple 2 5 8
Pear 3 6 9
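For anyone who wants to try this without the Excel files, here is a minimal self-contained sketch. The sample dict simply mirrors the three spreadsheets from the question, and selecting the Price column before concatenating is a small variation, not part of the answer above. Because each value is then a Series rather than a DataFrame, the dict keys should become the column names directly and there is no extra column level to drop:

```python
import pandas as pd

# Sample data mirroring the question's spreadsheets (assumed, for illustration only)
fruits = ['Orange', 'Apple', 'Pear']
d = {
    '15012016': pd.DataFrame({'Fruit': fruits, 'Price': [1, 2, 3]}),
    '16012016': pd.DataFrame({'Fruit': fruits, 'Price': [4, 5, 6]}),
    '17012016': pd.DataFrame({'Fruit': fruits, 'Price': [7, 8, 9]}),
}

# Index each frame by fruit and keep only the Price Series.
# Concatenating a dict of Series along axis=1 uses the dict keys as column names.
wide = pd.concat(
    {date: frame.set_index('Fruit')['Price'] for date, frame in d.items()},
    axis=1,
)
print(wide)
```

If the frames were read with index_col='Fruit', as in the question, the set_index('Fruit') call would not be needed and frame['Price'] alone should do.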
Answered by Igor Fobia
Something like this could work: loop over the dictionary, add a constant column holding the dictionary key, concatenate, and then set the date as the index:
pd.concat(
(i_value_df.assign(date=i_key) for i_key, i_value_df in d.items())
).set_index('date')
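Note that this gives a long frame (one row per fruit per date, indexed by date) rather than one column per date. If the wide layout from the question is the goal, one possible follow-up is a pivot. A rough sketch, assuming Fruit is a regular column as in the printed dict in the first answer; the sample data and the names long_df and wide are made up for illustration:

```python
import pandas as pd

# Sample data mirroring the question's spreadsheets (assumed, for illustration only)
fruits = ['Orange', 'Apple', 'Pear']
d = {
    '15012016': pd.DataFrame({'Fruit': fruits, 'Price': [1, 2, 3]}),
    '16012016': pd.DataFrame({'Fruit': fruits, 'Price': [4, 5, 6]}),
    '17012016': pd.DataFrame({'Fruit': fruits, 'Price': [7, 8, 9]}),
}

# Stack all frames vertically, tagging each row with its dict key (the date)
long_df = pd.concat(
    [frame.assign(date=key) for key, frame in d.items()],
    ignore_index=True,
)

# Reshape to one row per fruit and one column per date
wide = long_df.pivot(index='Fruit', columns='date', values='Price')
print(wide)
```

pivot sorts the row and column labels, so the fruits come out in alphabetical order rather than in spreadsheet order.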