Python: How to write to an existing Excel file without overwriting data (using pandas)?

Question

Asked by BP_

I use pandas to write to an Excel file in the following fashion:

import pandas

writer = pandas.ExcelWriter('Masterfile.xlsx') 

data_filtered.to_excel(writer, sheet_name="Main", columns=['Diff1', 'Diff2'])

writer.save()

Masterfile.xlsx already contains a number of different tabs. However, it does not yet contain "Main".

Pandas correctly writes to the "Main" sheet, but unfortunately it also deletes all the other tabs.

Answer 1

Accepted answer by Ski

The pandas docs say it uses openpyxl for xlsx files. A quick look through the code in ExcelWriter suggests that something like this might work:

import pandas
from openpyxl import load_workbook

book = load_workbook('Masterfile.xlsx')
writer = pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl') 
writer.book = book

## ExcelWriter for some reason uses writer.sheets to access the sheet.
## If you leave it empty it will not know that sheet Main is already there
## and will create a new sheet.

writer.sheets = dict((ws.title, ws) for ws in book.worksheets)

data_filtered.to_excel(writer, sheet_name="Main", columns=['Diff1', 'Diff2'])

writer.save()

Answer 2

Answer by Manish Mehra

import pandas
from openpyxl import load_workbook

def append_sheet_to_master(master_file_path, current_file_path, sheet_name):
    # Load the existing master workbook so that its sheets are kept.
    master_book = load_workbook(master_file_path)
    master_writer = pandas.ExcelWriter(master_file_path, engine='openpyxl')
    master_writer.book = master_book
    master_writer.sheets = dict((ws.title, ws) for ws in master_book.worksheets)

    # Read the first sheet of the current file without treating any row as a header.
    current_file = pandas.ExcelFile(current_file_path)
    current_frames = current_file.parse(current_file.sheet_names[0],
                                        header=None,
                                        index_col=None)
    current_frames.to_excel(master_writer, sheet_name, index=False, header=False)

    master_writer.save()

This works fine. The only problem is that the formatting of the master file (the file to which we add the new sheet) is lost.

Answer 3

Answer by mvbentes

With openpyxl version 2.4.0 and pandas version 0.19.2, the process @ski came up with gets a bit simpler:

import pandas
from openpyxl import load_workbook

with pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl') as writer:
    writer.book = load_workbook('Masterfile.xlsx')
    data_filtered.to_excel(writer, sheet_name="Main", columns=['Diff1', 'Diff2'])
# That's it!

Answer 4

Answer by b2002

Old question, but I am guessing some people still search for this - so...


I find this method nice because all worksheets are loaded into a dictionary of sheet name and dataframe pairs, created by pandas with the sheetname=None option. It is simple to add, delete or modify worksheets between reading the spreadsheet into the dict format and writing it back from the dict. For me the xlsxwriter works better than openpyxl for this particular task in terms of speed and format.


Note: pandas 0.21.0 and later rename the "sheetname" parameter to "sheet_name".

import pandas as pd

# read a single or multi-sheet excel file
# (returns dict of sheetname(s), dataframe(s))
ws_dict = pd.read_excel(excel_file_path,
                        sheetname=None)

# all worksheets are accessible as dataframes.

# easy to change a worksheet as a dataframe:
mod_df = ws_dict['existing_worksheet']

# do work on mod_df...then reassign
ws_dict['existing_worksheet'] = mod_df

# add a dataframe to the workbook as a new worksheet with
# ws name, df as dict key, value:
ws_dict['new_worksheet'] = some_other_dataframe

# when done, write dictionary back to excel...
# xlsxwriter honors datetime and date formats
# (only included as example)...
with pd.ExcelWriter(excel_file_path,
                    engine='xlsxwriter',
                    datetime_format='yyyy-mm-dd',
                    date_format='yyyy-mm-dd') as writer:

    for ws_name, df_sheet in ws_dict.items():
        df_sheet.to_excel(writer, sheet_name=ws_name)

For the example in the 2013 question:


ws_dict = pd.read_excel('Masterfile.xlsx',
                        sheetname=None)

ws_dict['Main'] = data_filtered[['Diff1', 'Diff2']]

with pd.ExcelWriter('Masterfile.xlsx',
                    engine='xlsxwriter') as writer:

    for ws_name, df_sheet in ws_dict.items():
        df_sheet.to_excel(writer, sheet_name=ws_name)
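The round trip described in this answer can be sketched end to end as follows. This is a minimal sketch, not the answer's own code: it assumes pandas 0.21.0 or later (so the parameter is spelled `sheet_name`), it assumes openpyxl is installed for reading, and the temporary file and sample data are made up for illustration. It passes `index=False` when writing, because writing the default index would add an extra unnamed column on every round trip.

```python
import os
import tempfile

import pandas as pd

# Hypothetical workbook, created only so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "Masterfile.xlsx")
pd.DataFrame({"a": [1, 2]}).to_excel(path, sheet_name="Existing", index=False)

# sheet_name=None returns a dict that maps sheet names to DataFrames.
ws_dict = pd.read_excel(path, sheet_name=None)

# Add a new worksheet by adding a key to the dict.
ws_dict["Main"] = pd.DataFrame({"Diff1": [0.1], "Diff2": [0.2]})

# Rewrite the whole workbook from the dict. Only cell values are carried over;
# formatting, formulas and charts in the original file are not.
with pd.ExcelWriter(path) as writer:
    for ws_name, df_sheet in ws_dict.items():
        df_sheet.to_excel(writer, sheet_name=ws_name, index=False)

result = pd.read_excel(path, sheet_name=None)
print(list(result))
```

Because the file is rebuilt from DataFrames, this approach suits workbooks that hold plain data rather than styled reports.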

Answer 5

Answer by flyingmeatball

I know this is an older thread, but this is the first item you find when searching, and the above solutions don't work if you need to retain charts in a workbook that you already have created. In that case, xlwings is a better option - it allows you to write to the excel book and keeps the charts/chart data.


A simple example:

import xlwings as xw
import pandas as pd

#create DF
months = ['2017-01','2017-02','2017-03','2017-04','2017-05','2017-06','2017-07','2017-08','2017-09','2017-10','2017-11','2017-12']
value1 = [x * 5+5 for x in range(len(months))]
df = pd.DataFrame(value1, index = months, columns = ['value1'])
df['value2'] = df['value1']+5
df['value3'] = df['value2']+5

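# load a workbook that has a chart in it
# (raw strings keep the backslashes in the Windows path from being read as escapes)
wb = xw.Book(r'C:\data\bookwithChart.xlsx')

ws = wb.sheets['chartData']

# write the DataFrame starting at cell A1, without the index
ws.range('A1').options(index=False).value = df

# save the result under a new name
wb.save(r'C:\data\bookwithChart_updated.xlsx')

xw.apps[0].quit()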

Answer 6

Answer by Edward

writer = pd.ExcelWriter('prueba1.xlsx', engine='openpyxl', keep_date_col=True)

I hope the "keep_date_col" argument helps.

Answer 7

Answer by MaxU

Here is a helper function:


def append_df_to_excel(filename, df, sheet_name='Sheet1', startrow=None,
                       truncate_sheet=False, 
                       **to_excel_kwargs):
    """
    Append a DataFrame [df] to existing Excel file [filename]
    into [sheet_name] Sheet.
    If [filename] doesn't exist, then this function will create it.

    Parameters:
      filename : File path or existing ExcelWriter
                 (Example: '/path/to/file.xlsx')
      df : dataframe to save to workbook
      sheet_name : Name of sheet which will contain DataFrame.
                   (default: 'Sheet1')
      startrow : upper left cell row to dump data frame.
                 Per default (startrow=None) calculate the last row
                 in the existing DF and write to the next row...
      truncate_sheet : truncate (remove and recreate) [sheet_name]
                       before writing DataFrame to Excel file
      to_excel_kwargs : arguments which will be passed to `DataFrame.to_excel()`
                        [can be dictionary]

    Returns: None
    """
    from openpyxl import load_workbook

    # ignore [engine] parameter if it was passed
    if 'engine' in to_excel_kwargs:
        to_excel_kwargs.pop('engine')

    writer = pd.ExcelWriter(filename, engine='openpyxl')

    # Python 2.x: define [FileNotFoundError] exception if it doesn't exist 
    try:
        FileNotFoundError
    except NameError:
        FileNotFoundError = IOError


    try:
        # try to open an existing workbook
        writer.book = load_workbook(filename)

        # get the last row in the existing Excel sheet
        # if it was not specified explicitly
        if startrow is None and sheet_name in writer.book.sheetnames:
            startrow = writer.book[sheet_name].max_row

        # truncate sheet
        if truncate_sheet and sheet_name in writer.book.sheetnames:
            # index of [sheet_name] sheet
            idx = writer.book.sheetnames.index(sheet_name)
            # remove [sheet_name]
            writer.book.remove(writer.book.worksheets[idx])
            # create an empty sheet [sheet_name] using old index
            writer.book.create_sheet(sheet_name, idx)

        # copy existing sheets
        writer.sheets = {ws.title:ws for ws in writer.book.worksheets}
    except FileNotFoundError:
        # file does not exist yet, we will create it
        pass

    if startrow is None:
        startrow = 0

    # write out the new sheet
    df.to_excel(writer, sheet_name, startrow=startrow, **to_excel_kwargs)

    # save the workbook
    writer.save()

NOTE: for pandas < 0.21.0, replace sheet_name with sheetname!

Usage examples:


append_df_to_excel('d:/temp/test.xlsx', df)

append_df_to_excel('d:/temp/test.xlsx', df, header=None, index=False)

append_df_to_excel('d:/temp/test.xlsx', df, sheet_name='Sheet2', index=False)

append_df_to_excel('d:/temp/test.xlsx', df, sheet_name='Sheet2', index=False, startrow=25)
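On newer pandas versions, the same "append below the existing rows" behaviour can probably be had without a helper. The following is a minimal sketch under stated assumptions, not a replacement for the function above: it assumes pandas 1.4 or later (where `if_sheet_exists="overlay"` is available) and openpyxl, and the temporary file and sample frames are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical workbook with a header row and two data rows.
path = os.path.join(tempfile.mkdtemp(), "test.xlsx")
pd.DataFrame({"x": [1, 2], "y": [3, 4]}).to_excel(
    path, sheet_name="Sheet1", index=False)

new_rows = pd.DataFrame({"x": [5], "y": [6]})

# mode="a" opens the existing workbook; "overlay" writes into the existing
# sheet instead of replacing it or creating a renamed copy.
with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                    if_sheet_exists="overlay") as writer:
    # max_row is the 1-based number of the last used row, which is also the
    # 0-based index of the first free row expected by startrow.
    startrow = writer.book["Sheet1"].max_row
    new_rows.to_excel(writer, sheet_name="Sheet1", startrow=startrow,
                      header=False, index=False)

combined = pd.read_excel(path, sheet_name="Sheet1")
print(combined)
```

Passing `header=False` keeps the column names from being written a second time in the middle of the sheet.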

Answer 8

Answer by Pedro Machado

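import pandas as pd
from openpyxl import load_workbook

# xlsFilename, sheetName and df are assumed to be defined by the caller.
book = load_workbook(xlsFilename)
writer = pd.ExcelWriter(xlsFilename, engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
df.to_excel(writer, sheet_name=sheetName, index=False)
writer.save()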

Answer 9

Answer by Will Ayd

Starting in pandas 0.24, you can simplify this with the mode keyword argument of ExcelWriter:

import pandas as pd

with pd.ExcelWriter('the_file.xlsx', engine='openpyxl', mode='a') as writer:
    data_filtered.to_excel(writer)
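To see that append mode leaves the other tabs alone, here is a small self-contained sketch. It assumes pandas 1.3 or later (for the `if_sheet_exists` argument) and openpyxl; the temporary file, the "Existing" sheet and the sample data are hypothetical stand-ins for the Masterfile.xlsx and data_filtered of the question.

```python
import os
import tempfile

import pandas as pd

# Hypothetical workbook that already contains one sheet.
path = os.path.join(tempfile.mkdtemp(), "Masterfile.xlsx")
pd.DataFrame({"a": [1, 2]}).to_excel(path, sheet_name="Existing", index=False)

data_filtered = pd.DataFrame({"Diff1": [0.1, 0.2], "Diff2": [0.3, 0.4]})

# mode="a" opens the existing workbook instead of creating a new one.
# if_sheet_exists="replace" overwrites "Main" if it is already present;
# without it, writing to an existing sheet name raises an error.
with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                    if_sheet_exists="replace") as writer:
    data_filtered.to_excel(writer, sheet_name="Main", index=False)

with pd.ExcelFile(path) as xls:
    sheet_names = xls.sheet_names
print(sheet_names)
```

Running the `with pd.ExcelWriter(...)` block a second time replaces "Main" rather than failing, while "Existing" stays in place.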

Answer 10

Answer by blacksheep

There is a better solution in pandas 0.24:


with pd.ExcelWriter(path, mode='a') as writer:
    s.to_excel(writer, sheet_name='another sheet', index=False)


So upgrade pandas now:

pip install --upgrade pandas
