Append a Pandas DataFrame to an Existing Excel Document

Question

Asked by enter_display_name_here

Per https://github.com/pandas-dev/pandas/pull/21251/files/09e5b456e1af5cde55f18f903ab90c761643b05a, we should be able to append DataFrames to new XLSX sheets.

Based on the documentation, I tried the following:

>>> import pandas as pd
>>>
>>> d1 = pd.DataFrame({"A": ['Bob', 'Joe', 'Mark'],
...                    "B": ['5', '10', '20']})
>>> d2 = pd.DataFrame({"A": ['Jeffrey', 'Ann', 'Sue'],
...                    "B": ['1', '2', '3']})
>>>
>>> # Create XLSX document for ticker
>>> writer = pd.ExcelWriter('test.xlsx', engine='openpyxl')
>>> d1.to_excel(writer, sheet_name='d1')
>>> writer.save()
>>>
>>> writer = pd.ExcelWriter('test.xlsx', engine='openpyxl', mode='a')
>>> d2.to_excel(writer, sheet_name='d2')
>>> writer.save()
>>>
>>> pd.__version__
'0.23.4'     # Just updated this per a comment

The result is a single workbook named 'test.xlsx' with a single tab 'd2'.

How can I prevent the workbook/sheet from being overwritten?

Answer 1

Answered by Yo_Chris

You can use with:

with pd.ExcelWriter('test.xlsx', engine='openpyxl', mode='a') as writer:
    d1.to_excel(writer, sheet_name='d1')
    d2.to_excel(writer, sheet_name='d2')
# The with block saves and closes the writer on exit,
# so no explicit writer.save() or writer.close() is needed.

Update

This should work; just note that a blank file needs to be created beforehand. You can create the blank file with Python if you want. I wrote a simple loop that mimics the essence of what you are trying to accomplish:

import pandas as pd
from openpyxl import load_workbook

d1 = pd.DataFrame({"A":['Bob','Joe', 'Mark'], 
               "B":['5', '10', '20']})
d2 = pd.DataFrame({"A":['Jeffrey','Ann', 'Sue'], 
                "B":['1', '2', '3']})

dfs = [d1,d2]

for i in range(len(dfs)):
    sheet = 'd'+str(i+1)
    data = dfs[i]
    writer = pd.ExcelWriter('atest.xlsx',engine='openpyxl', mode='a')
    writer.book = load_workbook('atest.xlsx') # here is the difference
    data.to_excel(writer,sheet_name=sheet)
    writer.save()
    writer.close()

Or, here is the modified first example:

d1 = pd.DataFrame({"A":['Bob','Joe', 'Mark'], 
               "B":['5', '10', '20']})
d2 = pd.DataFrame({"A":['Jeffrey','Ann', 'Sue'], 
                "B":['1', '2', '3']})

writer = pd.ExcelWriter('atest.xlsx', engine='openpyxl', mode='w')
d1.to_excel(writer,sheet_name='d1')
writer.save()
writer.close()

writer = pd.ExcelWriter('atest.xlsx', engine='openpyxl', mode='a')
writer.book = load_workbook('atest.xlsx')
d2.to_excel(writer,sheet_name='d2')
writer.save()
writer.close()

Answer 2

Answered by enter_display_name_here

I submitted a post on GitHub and received a response from the contributors (quoted below). It turns out that this functionality won't be released until 0.24, so it is not available in 0.23.1. FYI: I downloaded the RC and successfully tried out the mode='a' option. However, there may be a bug with workbooks that do not exist; I receive FileNotFoundError: [Errno 2] No such file or directory: 'test.xlsx'.

"this feature is being released as part of 0.24 which we just issued a release candidate for over the past few days. You can try on the RC or here on master and if neither works open an issue per the contributing guide, but this wouldn't be expected to work on versions older than that"
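
The FileNotFoundError described above occurs when mode='a' is used on a workbook that does not exist yet. One way around it, as a minimal sketch rather than an official pandas recipe, is to choose the mode based on whether the file is already there. The helper name write_sheet and the temporary file path are hypothetical, and the sketch assumes pandas 0.24 or later with openpyxl installed:

```python
import os
import tempfile

import pandas as pd


def write_sheet(path, df, sheet_name):
    """Hypothetical helper: add df as a new sheet, creating the workbook if needed."""
    # mode='a' needs an existing file, so fall back to mode='w' the first time.
    mode = "a" if os.path.exists(path) else "w"
    # The with block saves and closes the workbook on exit.
    with pd.ExcelWriter(path, engine="openpyxl", mode=mode) as writer:
        df.to_excel(writer, sheet_name=sheet_name)


d1 = pd.DataFrame({"A": ["Bob", "Joe", "Mark"], "B": ["5", "10", "20"]})
d2 = pd.DataFrame({"A": ["Jeffrey", "Ann", "Sue"], "B": ["1", "2", "3"]})

# Write into a fresh temporary directory so the first call creates the file.
path = os.path.join(tempfile.mkdtemp(), "test.xlsx")
write_sheet(path, d1, "d1")  # file does not exist yet: mode='w'
write_sheet(path, d2, "d2")  # file exists now: mode='a' keeps sheet 'd1'

# sheet_name=None reads every sheet into a dict keyed by sheet name.
sheet_names = list(pd.read_excel(path, sheet_name=None))
print(sheet_names)
```

In append mode the openpyxl writer loads the existing workbook itself, so this sketch does not assign writer.book by hand.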

Answer 3

Answered by Ishkatan

import pandas as pd

writer = pd.ExcelWriter(wk_path + save_file)
# ....
# build the sc_file DataFrame and save. sc_file includes
# a column called OS.

sc_file.to_excel(writer, sheet_name='test')

# build data frame of OS counts out of sc_file
counts_os = sc_file.OS.value_counts() 

# To append to the 'test' sheet, use startcol=x, startrow=y
# To append counts_os to the end of the current 'test' sheet
y = len(sc_file)
y += 1
counts_os.to_excel(writer, sheet_name='test', 
    startcol=1, startrow=y)

# write counts_os to sheet test2 
counts_os.to_excel(writer, sheet_name='test2')
writer.save()
writer.close()
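
The startrow approach above works while the writer is still open. For a workbook that is already saved on disk, a similar effect can be sketched with append mode. This is an assumption-laden example, not the answer author's method: it relies on the if_sheet_exists='overlay' option, which to my knowledge requires pandas 1.4 or later, and the file name and sample data are made up for illustration.

```python
import os
import tempfile

import pandas as pd

d1 = pd.DataFrame({"A": ["Bob", "Joe", "Mark"], "B": ["5", "10", "20"]})
d2 = pd.DataFrame({"A": ["Jeffrey", "Ann", "Sue"], "B": ["1", "2", "3"]})

# Hypothetical file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "append_rows.xlsx")

# First write: header row plus three data rows on sheet 'test'.
d1.to_excel(path, sheet_name="test", index=False, engine="openpyxl")

# Reopen in append mode and write below the rows that are already there.
with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                    if_sheet_exists="overlay") as writer:
    # max_row is the 1-based index of the last used row, which equals
    # the 0-based startrow of the first free row.
    next_row = writer.sheets["test"].max_row
    d2.to_excel(writer, sheet_name="test", startrow=next_row,
                index=False, header=False)

combined = pd.read_excel(path, sheet_name="test")
print(combined["A"].tolist())
```

Passing header=False on the second write keeps the column names from being repeated in the middle of the sheet.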
