Putting many Python pandas dataframes into one Excel worksheet
Original source: http://stackoverflow.com/questions/32957441/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Asked by nyan314sn
It is quite easy to add many pandas dataframes to an Excel workbook as long as each one goes to a different worksheet. It is somewhat trickier to get many dataframes into one worksheet if you want to use pandas' built-in df.to_excel functionality.
# Creating Excel Writer Object from Pandas
writer = pd.ExcelWriter('test.xlsx',engine='xlsxwriter')
workbook=writer.book
worksheet=workbook.add_worksheet('Validation')
df.to_excel(writer,sheet_name='Validation',startrow=0 , startcol=0)
another_df.to_excel(writer,sheet_name='Validation',startrow=20, startcol=0)
The above code won't work. You will get this error:
Sheetname 'Validation', with case ignored, is already in use.
After some experimenting, I found a way to make it work.
writer = pd.ExcelWriter('test.xlsx',engine='xlsxwriter') # Creating Excel Writer Object from Pandas
workbook=writer.book
df.to_excel(writer,sheet_name='Validation',startrow=0 , startcol=0)
another_df.to_excel(writer,sheet_name='Validation',startrow=20, startcol=0)
This will work. So my purpose in posting this question on Stack Overflow is twofold. Firstly, I hope this will help anyone who is trying to put many dataframes into a single Excel worksheet.
Secondly, can someone help me understand the difference between those two blocks of code? They look pretty much the same to me, except that the first block creates the worksheet called "Validation" in advance while the second does not. I get that part.
What I don't understand is why that should make any difference. Even if I don't create the worksheet in advance, this line, the one right before the last,
df.to_excel(writer,sheet_name='Validation',startrow=0 , startcol=0)
will create a worksheet anyway. Consequently, by the time we reach the last line of code, the worksheet "Validation" already exists in the second block as well. So my question is basically: why does the second block of code work while the first doesn't?
Please also share if there is another way to put many dataframes into Excel using the built-in df.to_excel functionality!
Accepted answer by Adrian
To create the worksheet in advance, you need to add the created sheet to the writer's sheets dict:
writer.sheets['Validation'] = worksheet
Using your original code:
# Creating Excel Writer Object from Pandas
writer = pd.ExcelWriter('test.xlsx',engine='xlsxwriter')
workbook=writer.book
worksheet=workbook.add_worksheet('Validation')
writer.sheets['Validation'] = worksheet
df.to_excel(writer,sheet_name='Validation',startrow=0 , startcol=0)
another_df.to_excel(writer,sheet_name='Validation',startrow=20, startcol=0)
Explanation
If we look at the pandas function to_excel, it uses the writer's write_cells function:
excel_writer.write_cells(formatted_cells, sheet_name, startrow=startrow, startcol=startcol)
So, looking at the write_cells function for xlsxwriter:
def write_cells(self, cells, sheet_name=None, startrow=0, startcol=0):
    # Write the frame cells using xlsxwriter.
    sheet_name = self._get_sheet_name(sheet_name)
    if sheet_name in self.sheets:
        wks = self.sheets[sheet_name]
    else:
        wks = self.book.add_worksheet(sheet_name)
        self.sheets[sheet_name] = wks
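Here we can see that it checks for sheet_name in self.sheets, not in the workbook itself. A worksheet created directly through workbook.add_worksheet is unknown to the writer, so write_cells tries to add it a second time and fails. That is why the sheet needs to be registered in self.sheets as well.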
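The lookup described above can be reproduced without pandas. The following is a minimal sketch, not the real pandas or xlsxwriter code: ToyWorkbook, ToyWriter, and the three helper functions are hypothetical names made up for illustration, and only the self.sheets check is modeled on the excerpt.

```python
class ToyWorkbook:
    """Hypothetical stand-in for a workbook that rejects duplicate sheet names."""

    def __init__(self):
        self.names = []

    def add_worksheet(self, name):
        if name.lower() in (n.lower() for n in self.names):
            raise ValueError(
                f"Sheetname '{name}', with case ignored, is already in use."
            )
        self.names.append(name)
        return f"<worksheet {name}>"


class ToyWriter:
    """Hypothetical stand-in for the writer, which keeps its own sheets dict."""

    def __init__(self):
        self.book = ToyWorkbook()
        self.sheets = {}

    def write_cells(self, sheet_name):
        # Same lookup as the excerpt: only self.sheets is consulted.
        if sheet_name in self.sheets:
            return self.sheets[sheet_name]
        wks = self.book.add_worksheet(sheet_name)
        self.sheets[sheet_name] = wks
        return wks


def first_block():
    """Sheet created on the workbook but never registered with the writer."""
    writer = ToyWriter()
    writer.book.add_worksheet("Validation")
    try:
        writer.write_cells("Validation")
    except ValueError as exc:
        return str(exc)
    return "ok"


def second_block():
    """Sheet created by the first write, then reused by the second write."""
    writer = ToyWriter()
    writer.write_cells("Validation")
    writer.write_cells("Validation")
    return "ok"


def fixed_first_block():
    """Sheet created in advance and registered in writer.sheets."""
    writer = ToyWriter()
    writer.sheets["Validation"] = writer.book.add_worksheet("Validation")
    writer.write_cells("Validation")
    return "ok"


print(first_block())
print(second_block())
print(fixed_first_block())
```

In this toy model the first call reports the duplicate-name error, while the other two succeed because the writer finds the sheet in its own dict before touching the workbook.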
Answer by TomDobbs
user3817518: "Please also share if there is another way to put many dataframes into excel using the built-in df.to_excel functionality !!"
Here's my attempt:
This is an easy way to put many dataframes on a single sheet or across multiple tabs. Let me know if this works!
To test, run the sample dataframes first, then the second and third portions of code.
Sample dataframes
import pandas as pd
import numpy as np
# Sample dataframes
randn = np.random.randn
df = pd.DataFrame(randn(15, 20))
df1 = pd.DataFrame(randn(10, 5))
df2 = pd.DataFrame(randn(5, 10))
Put multiple dataframes into one xlsx sheet
# function
def multiple_dfs(df_list, sheets, file_name, spaces):
    # The context manager saves and closes the file on exit
    # (writer.save() is no longer available in recent pandas versions)
    with pd.ExcelWriter(file_name, engine='xlsxwriter') as writer:
        row = 0
        for dataframe in df_list:
            dataframe.to_excel(writer, sheet_name=sheets, startrow=row, startcol=0)
            # Skip the data rows, the header row (+1), and the blank spacer rows
            row = row + len(dataframe.index) + spaces + 1

# list of dataframes
dfs = [df, df1, df2]

# run function
multiple_dfs(dfs, 'Validation', 'test1.xlsx', 1)
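The row arithmetic in multiple_dfs can be checked on its own, without writing a file. This is a small sketch under the assumption that each dataframe is written with a single header row (the to_excel default); start_rows is a hypothetical helper name, not part of pandas.

```python
def start_rows(row_counts, spaces):
    """Return the startrow for each frame, given each frame's data-row count."""
    rows = []
    row = 0
    for n_rows in row_counts:
        rows.append(row)
        # n_rows data rows + 1 header row + `spaces` blank rows
        row = row + n_rows + 1 + spaces
    return rows


# Row counts of the sample dataframes df, df1, df2 (15, 10, and 5 rows)
print(start_rows([15, 10, 5], spaces=1))  # [0, 17, 29]
```

With one blank row between blocks, the second frame starts at row 17 (15 data rows, 1 header row, 1 spacer) and the third at row 29.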
Put multiple dataframes across separate tabs/sheets
# function
def dfs_tabs(df_list, sheet_list, file_name):
    # The context manager saves and closes the file on exit
    with pd.ExcelWriter(file_name, engine='xlsxwriter') as writer:
        # Pair each dataframe with its own sheet name
        for dataframe, sheet in zip(df_list, sheet_list):
            dataframe.to_excel(writer, sheet_name=sheet, startrow=0, startcol=0)

# list of dataframes and sheet names
dfs = [df, df1, df2]
sheets = ['df', 'df1', 'df2']

# run function
dfs_tabs(dfs, sheets, 'multi-test.xlsx')
Answer by Alex
I would be more inclined to concatenate the dataframes first and then write the combined dataframe to Excel. To put two dataframes side by side (as opposed to one above the other), do this:
# Concatenate side by side first, then write the combined frame once
new_df = pd.concat([df, another_df], axis=1)
# The context manager saves and closes the file on exit
with pd.ExcelWriter('test.xlsx', engine='xlsxwriter') as writer:
    new_df.to_excel(writer, sheet_name='Validation', startrow=0, startcol=0)
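One thing to keep in mind with the concatenation approach: pd.concat with axis=1 aligns the frames on their index, so frames with different row counts are padded with NaN. The sketch below uses two small made-up frames (the column names a, b, c are illustrative only) to show the effect before anything is written to Excel.

```python
import pandas as pd

# Two illustrative frames with different numbers of rows
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
another_df = pd.DataFrame({"c": [7, 8]})

# Side-by-side concatenation aligns on the index (0, 1, 2)
combined = pd.concat([df, another_df], axis=1)

print(combined.shape)
print(combined["c"].isna().tolist())
```

The shorter frame has no row for index 2, so that cell in column c is NaN; to_excel writes NaN as an empty cell by default.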