Inserting a row into an Excel spreadsheet with openpyxl in Python

Question

Asked by Nick

I'm looking for the best approach for inserting a row into a spreadsheet using openpyxl.

Effectively, I have a spreadsheet (Excel 2007) which has a header row, followed by (at most) a few thousand rows of data. I'm looking to insert the row as the first row of actual data, so directly after the header. My understanding is that the append function is only suitable for adding content to the end of the file.

Reading the documentation for both openpyxl and xlrd (and xlwt), I can't find any clear cut ways of doing this, beyond looping through the content manually and inserting into a new sheet (after inserting the required row).

Given my so far limited experience with Python, I'm trying to understand if this is indeed the best option to take (the most pythonic!), and if so could someone provide an explicit example. Specifically can I read and write rows with openpyxl or do I have to access cells? Additionally can I (over)write the same file(name)?

Answer 1

Accepted answer by Nick

Answering this with the code that I'm now using to achieve the desired result. Note that I am manually inserting the row at position 1, but that should be easy enough to adjust for specific needs. You could also easily tweak this to insert more than one row, and simply populate the rest of the data starting at the relevant position.

Also, note that due to downstream dependencies, we are manually specifying data from 'Sheet1', and the data is getting copied to a new sheet which is inserted at the beginning of the workbook, whilst renaming the original worksheet to 'Sheet1.5'.

EDIT: I've also added (later on) a change to the format_code to fix issues where the default copy operation here removes all formatting: new_cell.style.number_format.format_code = 'mm/dd/yyyy'. I couldn't find any documentation that this was settable, it was more of a case of trial and error!

Lastly, don't forget this example is saving over the original. You can change the save path where applicable to avoid this.

    import openpyxl

    wb = openpyxl.load_workbook(file)
    old_sheet = wb.get_sheet_by_name('Sheet1')
    old_sheet.title = 'Sheet1.5'
    max_row = old_sheet.get_highest_row()
    max_col = old_sheet.get_highest_column()
    wb.create_sheet(0, 'Sheet1')

    new_sheet = wb.get_sheet_by_name('Sheet1')

    # Do the header.
    for col_num in range(0, max_col):
        new_sheet.cell(row=0, column=col_num).value = old_sheet.cell(row=0, column=col_num).value

    # The row to be inserted. We're manually populating each cell.
    new_sheet.cell(row=1, column=0).value = 'DUMMY'
    new_sheet.cell(row=1, column=1).value = 'DUMMY'

    # Now do the rest of it. Note the row offset.
    for row_num in range(1, max_row):
        for col_num in range (0, max_col):
            new_sheet.cell(row = (row_num + 1), column = col_num).value = old_sheet.cell(row = row_num, column = col_num).value

    wb.save(file)
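
For readers on a recent openpyxl release, the same copy-to-a-new-sheet idea can be sketched roughly as follows. This is an illustrative adaptation, not the code above verbatim: it assumes openpyxl 2.5 or later (1-based cell coordinates, `max_row`/`max_column`, `wb[name]` lookup), and the function name `rebuild_with_inserted_row` and the sample data are made up for the example.

```python
import datetime

import openpyxl


def rebuild_with_inserted_row(wb, sheet_name, new_values, insert_at=2):
    """Copy sheet_name into a fresh sheet with new_values placed at row insert_at."""
    old_sheet = wb[sheet_name]
    old_sheet.title = sheet_name + ".5"  # same renaming scheme as in the answer
    new_sheet = wb.create_sheet(title=sheet_name, index=0)

    for row_num in range(1, old_sheet.max_row + 1):
        # Rows at or below the insertion point move down by one.
        target_row = row_num if row_num < insert_at else row_num + 1
        for col_num in range(1, old_sheet.max_column + 1):
            source = old_sheet.cell(row=row_num, column=col_num)
            target = new_sheet.cell(row=target_row, column=col_num)
            target.value = source.value
            # Set after the value so an explicit format such as a date format survives.
            target.number_format = source.number_format

    for col_num, value in enumerate(new_values, start=1):
        new_sheet.cell(row=insert_at, column=col_num).value = value
    return new_sheet


# Small in-memory demonstration; no file is read or written here.
wb = openpyxl.Workbook()
ws = wb.active
ws.title = "Sheet1"
ws.append(["name", "date"])
ws.append(["first", datetime.date(2020, 1, 2)])
ws["B2"].number_format = "mm/dd/yyyy"

new_ws = rebuild_with_inserted_row(wb, "Sheet1", ["DUMMY", "DUMMY"])
rows = [[cell.value for cell in row] for row in new_ws.iter_rows()]
```

Only values and number formats are carried over in this sketch; fonts, fills, column widths and merged cells would still need to be copied separately.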

Answer 2

Answered by sedavidw

Unfortunately, there isn't really a better way than to read in the file and use a library like xlwt to write out a new Excel file (with your new row inserted at the top). Excel doesn't work like a database that you can read and append to. You just have to read in the information, manipulate it in memory, and write it out to what is essentially a new file.

Answer 3

Answered by Rejected

Openpyxl Worksheets have limited functionality when it comes to row- or column-level operations. The only properties a Worksheet has that relate to rows/columns are row_dimensions and column_dimensions, which store "RowDimensions" and "ColumnDimensions" objects for each row and column, respectively. These dictionaries are also used in functions like get_highest_row() and get_highest_column().

Everything else operates on a cell level, with Cell objects being tracked in the dictionary _cells (and their styles tracked in the dictionary _styles). Most functions that look like they're doing anything on a row or column level are actually operating on a range of cells (such as the aforementioned append()).

The simplest thing to do would be what you suggested: create a new sheet, append your header row, append your new data rows, append your old data rows, delete the old sheet, then rename your new sheet to the old one. The problem with this method is the loss of row/column dimension attributes and cell styles, unless you specifically copy them, too.
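
The sequence of steps described above might look roughly like this. It is a minimal sketch that assumes a recent openpyxl (for `iter_rows(values_only=True)` and `wb.remove()`); the helper name `insert_rows_via_append` and the temporary sheet title `tmp_rebuild` are hypothetical, and only cell values are carried over.

```python
import openpyxl


def insert_rows_via_append(wb, sheet_name, new_rows):
    """Rebuild sheet_name with new_rows placed directly under the header row."""
    old_sheet = wb[sheet_name]
    position = wb.index(old_sheet)
    new_sheet = wb.create_sheet(title="tmp_rebuild", index=position)

    rows = old_sheet.iter_rows(values_only=True)
    header = next(rows, None)
    if header is not None:
        new_sheet.append(header)   # 1. header row
    for row in new_rows:
        new_sheet.append(row)      # 2. new data rows
    for row in rows:
        new_sheet.append(row)      # 3. remaining old data rows

    wb.remove(old_sheet)           # 4. delete the old sheet
    new_sheet.title = sheet_name   # 5. reuse the old name
    return new_sheet


# In-memory demonstration with made-up data.
wb = openpyxl.Workbook()
ws = wb.active
ws.title = "Data"
ws.append(["id", "name"])
ws.append([1, "alpha"])
ws.append([2, "beta"])

result = insert_rows_via_append(wb, "Data", [[0, "inserted"]])
values = list(result.iter_rows(values_only=True))
```

This also shows that whole rows can be read and written without addressing individual cells, at the cost of losing styles and dimensions.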

Alternatively, you could create your own functions that insert rows or columns.

I had a large number of very simple worksheets that I needed to delete columns from. Since you asked for explicit examples, I'll provide the function I quickly threw together to do this:

from openpyxl.cell import get_column_letter

def ws_delete_column(sheet, del_column):

    for row_num in range(1, sheet.get_highest_row()+1):
        for col_num in range(del_column, sheet.get_highest_column()+1):

            coordinate = '%s%s' % (get_column_letter(col_num),
                                   row_num)
            adj_coordinate = '%s%s' % (get_column_letter(col_num + 1),
                                       row_num)

            # Handle Styles.
            # This is important to do if you have any differing
            # 'types' of data being stored, as you may otherwise get
            # an output Worksheet that's got improperly formatted cells.
            # Or worse, an error gets thrown because you tried to copy
            # a string value into a cell that's styled as a date.

            if adj_coordinate in sheet._styles:
                sheet._styles[coordinate] = sheet._styles[adj_coordinate]
                sheet._styles.pop(adj_coordinate, None)
            else:
                sheet._styles.pop(coordinate, None)

            if adj_coordinate in sheet._cells:
                sheet._cells[coordinate] = sheet._cells[adj_coordinate]
                sheet._cells[coordinate].column = get_column_letter(col_num)
                sheet._cells[coordinate].row = row_num
                sheet._cells[coordinate].coordinate = coordinate

                sheet._cells.pop(adj_coordinate, None)
            else:
                sheet._cells.pop(coordinate, None)

        # sheet.garbage_collect()

I pass it the worksheet that I'm working with, and the column number I want deleted, and away it goes. I know it isn't exactly what you wanted, but I hope this information helped!

EDIT: Noticed someone gave this another vote, and figured I should update it. The coordinate system in Openpyxl changed at some point in the past couple of years, introducing a coordinate attribute for items in _cells. This needs to be edited, too, or the rows will be left blank (instead of deleted), and Excel will throw an error about problems with the file. This works for Openpyxl 2.2.3 (untested with later versions).
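
On openpyxl 2.5 and later, the worksheet has a built-in `delete_cols()` method, so a hand-written helper like the one above may no longer be needed. A minimal sketch with made-up sample data:

```python
import openpyxl

wb = openpyxl.Workbook()
ws = wb.active
ws.append(["a", "b", "c"])
ws.append([1, 2, 3])

ws.delete_cols(2)  # remove column B; cells to the right shift left by one

remaining = list(ws.iter_rows(values_only=True))
```

As far as I know, openpyxl does not adjust formulas, charts or other dependencies when columns are removed this way, so those would still have to be handled by hand.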

Answer 4

Answered by Dallas

== Updated to a fully functional version, based on feedback here: groups.google.com/forum/#!topic/openpyxl-users/wHGecdQg3Iw. ==

As the others have pointed out, openpyxl does not provide this functionality, but I have extended the Worksheet class as follows to implement inserting rows. Hope this proves useful to others.

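import copy
import re

# Cell, Worksheet and get_column_letter must also be imported from openpyxl;
# their module paths differ between openpyxl versions.

def insert_rows(self, row_idx, cnt, above=False, copy_style=True, fill_formulae=True):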
    """Inserts new (empty) rows into worksheet at specified row index.

    :param row_idx: Row index specifying where to insert new rows.
    :param cnt: Number of rows to insert.
    :param above: Set True to insert rows above specified row index.
    :param copy_style: Set True if new rows should copy style of immediately above row.
    :param fill_formulae: Set True if new rows should take on formula from immediately above row, with references updated to the new rows.

    Usage:

    * insert_rows(2, 10, above=True, copy_style=False)

    """
    CELL_RE  = re.compile(r"(?P<col>\$?[A-Z]+)(?P<row>\$?\d+)")

    row_idx = row_idx - 1 if above else row_idx

    def replace(m):
        row = m.group('row')
        prefix = "$" if row.find("$") != -1 else ""
        row = int(row.replace("$",""))
        row += cnt if row > row_idx else 0
        return m.group('col') + prefix + str(row)

    # First, we shift all cells down cnt rows...
    old_cells = set()
    old_fas   = set()
    new_cells = dict()
    new_fas   = dict()
    for c in self._cells.values():

        old_coor = c.coordinate

        # Shift all references to anything below row_idx
        if c.data_type == Cell.TYPE_FORMULA:
            c.value = CELL_RE.sub(
                replace,
                c.value
            )
            # Here, we need to properly update the formula references to reflect new row indices
            if old_coor in self.formula_attributes and 'ref' in self.formula_attributes[old_coor]:
                self.formula_attributes[old_coor]['ref'] = CELL_RE.sub(
                    replace,
                    self.formula_attributes[old_coor]['ref']
                )

        # Do the magic to set up our actual shift    
        if c.row > row_idx:
            old_coor = c.coordinate
            old_cells.add((c.row,c.col_idx))
            c.row += cnt
            new_cells[(c.row,c.col_idx)] = c
            if old_coor in self.formula_attributes:
                old_fas.add(old_coor)
                fa = self.formula_attributes[old_coor].copy()
                new_fas[c.coordinate] = fa

    for coor in old_cells:
        del self._cells[coor]
    self._cells.update(new_cells)

    for fa in old_fas:
        del self.formula_attributes[fa]
    self.formula_attributes.update(new_fas)

    # Next, we need to shift all the Row Dimensions below our new rows down by cnt...
    for row in range(len(self.row_dimensions)-1+cnt,row_idx+cnt,-1):
        new_rd = copy.copy(self.row_dimensions[row-cnt])
        new_rd.index = row
        self.row_dimensions[row] = new_rd
        del self.row_dimensions[row-cnt]

    # Now, create our new rows, with all the pretty cells
    row_idx += 1
    for row in range(row_idx,row_idx+cnt):
        # Create a Row Dimension for our new row
        new_rd = copy.copy(self.row_dimensions[row-1])
        new_rd.index = row
        self.row_dimensions[row] = new_rd
        for col in range(1,self.max_column):
            col = get_column_letter(col)
            cell = self.cell('%s%d'%(col,row))
            cell.value = None
            source = self.cell('%s%d'%(col,row-1))
            if copy_style:
                cell.number_format = source.number_format
                cell.font      = source.font.copy()
                cell.alignment = source.alignment.copy()
                cell.border    = source.border.copy()
                cell.fill      = source.fill.copy()
            if fill_formulae and source.data_type == Cell.TYPE_FORMULA:
                s_coor = source.coordinate
                if s_coor in self.formula_attributes and 'ref' not in self.formula_attributes[s_coor]:
                    fa = self.formula_attributes[s_coor].copy()
                    self.formula_attributes[cell.coordinate] = fa
                # print("Copying formula from cell %s%d to %s%d"%(col,row-1,col,row))
                cell.value = re.sub(
                    r"(\$?[A-Z]{1,3}\$?)%d"%(row - 1),
                    lambda m: m.group(1) + str(row),
                    source.value
                )   
                cell.data_type = Cell.TYPE_FORMULA

    # Check for Merged Cell Ranges that need to be expanded to contain new cells
    for cr_idx, cr in enumerate(self.merged_cell_ranges):
        self.merged_cell_ranges[cr_idx] = CELL_RE.sub(
            replace,
            cr
        )

Worksheet.insert_rows = insert_rows
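
The formula-shifting step is the least obvious part of the function above, so here it is in isolation. This is a standalone sketch of the same regular-expression idea; the function name `shift_formula_rows` is invented for the example and it does not touch openpyxl at all.

```python
import re

# Matches references such as A1, $A1, A$1 or $A$1.
CELL_RE = re.compile(r"(?P<col>\$?[A-Z]+)(?P<row>\$?\d+)")


def shift_formula_rows(formula, row_idx, cnt):
    """Add cnt to every row reference in formula that lies below row_idx."""
    def replace(m):
        row = m.group("row")
        prefix = "$" if row.startswith("$") else ""
        number = int(row.lstrip("$"))
        if number > row_idx:
            number += cnt
        return m.group("col") + prefix + str(number)

    return CELL_RE.sub(replace, formula)


shifted = shift_formula_rows("=SUM(A2:A10)+$B$1", 1, 3)
print(shifted)  # =SUM(A5:A13)+$B$1
```

One limitation of this pattern is that a function name ending in digits, such as LOG10, also looks like a cell reference and would be rewritten.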

Answer 5

Answered by Ran S

I took Dallas's solution and added support for merged cells:

    def insert_rows(self, row_idx, cnt, above=False, copy_style=True, fill_formulae=True):
        skip_list = []
        try:
            idx = row_idx - 1 if above else row_idx
            for (new, old) in zip(range(self.max_row+cnt,idx+cnt,-1),range(self.max_row,idx,-1)):
                for c_idx in range(1,self.max_column):
                  col = self.cell(row=1, column=c_idx).column #get_column_letter(c_idx)
                  print("Copying %s%d to %s%d."%(col,old,col,new))
                  source = self["%s%d"%(col,old)]
                  target = self["%s%d"%(col,new)]
                  if source.coordinate in skip_list:
                      continue

                  if source.coordinate in self.merged_cells:
                      # This is a merged cell
                      for _range in self.merged_cell_ranges:
                          merged_cells_list = [x for x in cells_from_range(_range)][0]
                          if source.coordinate in merged_cells_list:
                              skip_list = merged_cells_list
                              self.unmerge_cells(_range)
                              new_range = re.sub(str(old),str(new),_range)
                              self.merge_cells(new_range)
                              break

                  if source.data_type == Cell.TYPE_FORMULA:
                    target.value = re.sub(
                      r"(\$?[A-Z]{1,3})%d"%(old),
                      lambda m: m.group(1) + str(new),
                      source.value
                    )
                  else:
                    target.value = source.value
                  target.number_format = source.number_format
                  target.font   = source.font.copy()
                  target.alignment = source.alignment.copy()
                  target.border = source.border.copy()
                  target.fill   = source.fill.copy()
            idx = idx + 1
            for row in range(idx,idx+cnt):
                for c_idx in range(1,self.max_column):
                  col = self.cell(row=1, column=c_idx).column #get_column_letter(c_idx)
                  #print("Clearing value in cell %s%d"%(col,row))
                  cell = self["%s%d"%(col,row)]
                  cell.value = None
                  source = self["%s%d"%(col,row-1)]
                  if copy_style:
                    cell.number_format = source.number_format
                    cell.font      = source.font.copy()
                    cell.alignment = source.alignment.copy()
                    cell.border    = source.border.copy()
                    cell.fill      = source.fill.copy()
                  if fill_formulae and source.data_type == Cell.TYPE_FORMULA:
                    #print("Copying formula from cell %s%d to %s%d"%(col,row-1,col,row))
                    cell.value = re.sub(
                      r"(\$?[A-Z]{1,3})%d"%(row - 1),
                      lambda m: m.group(1) + str(row),
                      source.value
                    )
        except Exception:
            # The handler for the try block above is missing from this snippet;
            # re-raising keeps the function syntactically valid without hiding errors.
            raise

Answer 6

Answered by mut3

This is an edited version of Nick's solution. It takes a starting row, the number of rows to insert, and a filename, and inserts the requested number of blank rows.

#! python 3

import openpyxl, sys

my_start = int(sys.argv[1])
my_rows = int(sys.argv[2])
str_wb = str(sys.argv[3])

wb = openpyxl.load_workbook(str_wb)
old_sheet = wb.get_sheet_by_name('Sheet')
mcol = old_sheet.max_column
mrow = old_sheet.max_row
old_sheet.title = 'Sheet1.5'
wb.create_sheet(index=0, title='Sheet')

new_sheet = wb.get_sheet_by_name('Sheet')

for row_num in range(1, my_start):
    for col_num in range(1, mcol + 1):
        new_sheet.cell(row = row_num, column = col_num).value = old_sheet.cell(row = row_num, column = col_num).value

# Rows from my_start onward move down by my_rows, leaving blank rows in between.
for row_num in range(my_start, mrow + 1):
    for col_num in range(1, mcol + 1):
        new_sheet.cell(row = (row_num + my_rows), column = col_num).value = old_sheet.cell(row = row_num, column = col_num).value

wb.save(str_wb)

Answer 7

Answered by aneroid

Adding an answer applicable to more recent releases, v2.5+, of openpyxl:

There are now insert_rows() and insert_cols() methods.

insert_rows(idx, amount=1)
Insert row or rows before row==idx
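
Applied to the original question (a header in row 1, new data directly below it), this could look like the following sketch. The sample data is made up, and the workbook is built in memory instead of being loaded from a file.

```python
import openpyxl

wb = openpyxl.Workbook()
ws = wb.active
ws.append(["id", "name"])   # header in row 1
ws.append([1, "alpha"])
ws.append([2, "beta"])

ws.insert_rows(2)           # open one empty row before the current row 2
for col_num, value in enumerate([0, "inserted"], start=1):
    ws.cell(row=2, column=col_num, value=value)

values = list(ws.iter_rows(values_only=True))
# For a workbook loaded from disk, wb.save(path) would then write the result.
```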

Answer 8

Answered by ack

This worked for me:

    openpyxl.worksheet.worksheet.Worksheet.insert_rows(wbs,idx=row,amount=2)

Insert 2 rows before row==idx

See: http://openpyxl.readthedocs.io/en/stable/api/openpyxl.worksheet.worksheet.html

Answer 9

Answered by PrestonDocks

As of openpyxl 2.5 you can use .insert_rows(idx, row_qty)

from openpyxl import load_workbook
wb = load_workbook('excel_template.xlsx')
ws = wb.active
ws.insert_rows(14, 10)

It will not pick up the formatting of the idx row as it would if you did this manually in Excel. You will have to apply the correct formatting (e.g. cell color) afterwards.
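
One way to reapply formatting afterwards is to copy the style attributes from a neighbouring row. The sketch below is only an illustration: the helper `copy_row_style` is not part of openpyxl, and it copies a handful of common attributes, not every possible style setting.

```python
from copy import copy

import openpyxl
from openpyxl.styles import Font


def copy_row_style(ws, source_row, target_row):
    """Copy basic cell formatting from source_row to target_row."""
    for col_num in range(1, ws.max_column + 1):
        source = ws.cell(row=source_row, column=col_num)
        target = ws.cell(row=target_row, column=col_num)
        if source.has_style:
            target.font = copy(source.font)
            target.fill = copy(source.fill)
            target.border = copy(source.border)
            target.alignment = copy(source.alignment)
            target.number_format = source.number_format


# In-memory demonstration with made-up data.
wb = openpyxl.Workbook()
ws = wb.active
ws.append(["header"])
ws.append(["data"])
ws["A2"].font = Font(bold=True)
ws["A2"].number_format = "0.00"

ws.insert_rows(3)  # new empty row after the styled row 2
copy_row_style(ws, source_row=2, target_row=3)
```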

Answer 10

Answered by yugal sinha

To insert row into Excel spreadsheet using openpyxl in Python

The code below may help:

import openpyxl

file = "xyz.xlsx"
# Load the workbook based on the file name provided by the user
book = openpyxl.load_workbook(file)
# Open the sheet whose index is 0
sheet = book.worksheets[0]

# insert_rows(idx, amount=1) inserts one or more rows before row == idx;
# amount is the number of rows to add and is optional
sheet.insert_rows(13)

For inserting columns, openpyxl has a similar function: insert_cols(idx, amount=1).
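
A short sketch of the column variant, using made-up data in an in-memory workbook:

```python
import openpyxl

wb = openpyxl.Workbook()
ws = wb.active
ws.append(["a", "c"])
ws.append([1, 3])

ws.insert_cols(2)   # open one empty column before column B
ws["B1"] = "b"
ws["B2"] = 2

values = list(ws.iter_rows(values_only=True))
```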
