Python: how to delete rows that satisfy some criteria in an Excel spreadsheet?

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/14904977/

Time: 2020-08-18 12:48:31  Source: igfitidea

How to delete rows that satisfy some criteria in an Excel spreadsheet?

python openpyxl

Asked by kjo

I would like to create a "reduced" version of an Excel (xlsx) spreadsheet (i.e. by removing some rows according to some criterion), and I'd like to know if this can be done with openpyxl.

In (pythonish) pseudo-code, what I want to do would look something like:

wb = openpyxl.reader.excel.load_workbook('/path/to/workbook.xlsx')
sh = wb.get_sheet_by_name('someworksheet')

# weed out the rows of sh according to somecriterion
sh.rows[:] = [r for r in sh.rows if somecriterion(r)]

# save the workbook, with the weeded-out sheet
wb.save('/path/to/workbook_reduced.xlsx')

Can something like this be done with openpyxl, and if so, how?

Answered by Anthon

Internally, openpyxl does not seem to have a concept of 'rows'. It works with cells and keeps track of the dimensions, and if you use Worksheet.rows it calculates a 2D array of cells from that. You can mutate the array, but doing so doesn't change the Worksheet.

If you want to do this within the Worksheet, you need to copy the values from the old position to the new position, set the value of the cells that are no longer needed to '' or None, and call Worksheet.garbage_collect().
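
The in-place approach described above could be sketched roughly as follows. The function name compact_rows_in_place and the keep_row callback are hypothetical names chosen for illustration; the sketch relies only on Worksheet.cell, max_row and max_column. It deliberately does not call Worksheet.garbage_collect(), on the assumption that this method may not be available in the openpyxl release you have installed.

```python
def compact_rows_in_place(ws, keep_row, first_row=1):
    """Shift the rows accepted by keep_row upwards and blank what is left over.

    ws        -- an openpyxl worksheet (anything with cell(), max_row, max_column)
    keep_row  -- hypothetical callback: takes a list of cell values, returns True to keep
    first_row -- first row to examine (use 2 to leave a header row untouched)
    Returns the number of rows kept.
    """
    last_row, last_col = ws.max_row, ws.max_column
    target = first_row  # next row position that a kept row will be written to

    for source in range(first_row, last_row + 1):
        values = [ws.cell(row=source, column=c).value
                  for c in range(1, last_col + 1)]
        if not keep_row(values):
            continue
        if source != target:
            # Copy the values from the old position to the new position.
            for c, value in enumerate(values, start=1):
                ws.cell(row=target, column=c).value = value
        target += 1

    # Blank the cells below the last kept row; they are no longer needed.
    for row in range(target, last_row + 1):
        for c in range(1, last_col + 1):
            ws.cell(row=row, column=c).value = None

    return target - first_row
```

Note that this only moves values, not styles or formulas that reference other rows, and the blanked cells may still count towards the sheet's reported dimensions.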

If your dataset is small and of a uniform nature (e.g. all strings), you might be better off copying the relevant cells (content) to a new worksheet, removing the old one, and setting the title of the new one to the title of the one just deleted.
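
A minimal sketch of this copy-to-a-new-worksheet idea, assuming a reasonably recent openpyxl where Workbook.create_sheet, Workbook.index and Workbook.remove are available. The function name rebuild_sheet, the keep_row callback and the temporary "_tmp" title suffix are assumptions made for illustration. Only cell values are copied, so styles, column widths and merged cells would be lost.

```python
def rebuild_sheet(wb, sheet_name, keep_row):
    """Replace a worksheet with a copy that contains only the rows to keep.

    wb         -- an openpyxl Workbook
    sheet_name -- title of the worksheet to reduce
    keep_row   -- hypothetical callback: takes a list of cell values, returns True to keep
    Returns the new worksheet, which ends up with the original title.
    """
    old = wb[sheet_name]
    position = wb.index(old)  # remember where the sheet sits in the workbook

    # Create the replacement next to the original under a temporary title.
    new = wb.create_sheet(title=sheet_name + "_tmp", index=position)

    for row in old.iter_rows():
        values = [cell.value for cell in row]
        if keep_row(values):
            new.append(values)  # values only; formatting is not carried over

    # Remove the old sheet, then give the new one the title just freed up.
    wb.remove(old)
    new.title = sheet_name
    return new
```

After calling rebuild_sheet(wb, 'someworksheet', somecriterion), saving with wb.save(...) as in the question should write the reduced workbook.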

The most elegant thing to do, IMHO, would be to extend Worksheet or a subclass with a delete_rows method. I would implement such a method by changing the coordinates of its Cells in place. But this could break if openpyxl internals change.

Answered by jhughs

2018 update: I was searching for how to delete a row today and found that the functionality was added in openpyxl 2.5.0-b2. I just tried it and it worked perfectly. Here's the link where I found the answer: https://bitbucket.org/openpyxl/openpyxl/issues/964/delete_rows-does-not-work-on-deleting

And here's the syntax to delete one row:

ws.delete_rows(index, 1)

where: 'ws' is the worksheet, 'index' is the row number, and '1' is the number of rows to delete.
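
To connect this back to the original question, here is a minimal sketch of deleting every row that matches a criterion with ws.delete_rows. The helper names contiguous_runs and delete_rows_where, the should_delete callback, and the sample data in demo() are assumptions made for illustration; only iter_rows and delete_rows come from openpyxl (2.5 or later). Rows are removed from the bottom up so that one deletion does not shift the numbers of the rows still waiting to be removed.

```python
def contiguous_runs(row_numbers):
    """Group row numbers into (first_row, count) runs, highest rows first."""
    runs = []
    for row in sorted(set(row_numbers), reverse=True):
        if runs and runs[-1][0] == row + 1:
            # This row sits directly above the current run: extend the run upwards.
            _, count = runs[-1]
            runs[-1] = (row, count + 1)
        else:
            runs.append((row, 1))
    return runs


def delete_rows_where(ws, should_delete, first_row=1):
    """Delete every row whose values satisfy should_delete; return how many.

    ws            -- an openpyxl worksheet
    should_delete -- hypothetical callback: takes a list of cell values, returns True to delete
    first_row     -- first row to examine (use 2 to protect a header row)
    """
    # Collect the row numbers first, so nothing is deleted while iterating.
    doomed = [
        row[0].row
        for row in ws.iter_rows(min_row=first_row)
        if should_delete([cell.value for cell in row])
    ]
    # Delete from the bottom up, one call per block of adjacent rows.
    for start, count in contiguous_runs(doomed):
        ws.delete_rows(start, count)
    return len(doomed)


def demo():
    """Illustrative usage on an in-memory workbook (sample data is made up)."""
    from openpyxl import Workbook  # assumes openpyxl >= 2.5 is installed

    wb = Workbook()
    ws = wb.active
    for record in [("name", "score"), ("a", 1), ("b", None), ("c", 3)]:
        ws.append(record)
    # Drop the data rows that have no score, keeping the header in row 1.
    delete_rows_where(ws, lambda values: values[1] is None, first_row=2)
    return [[cell.value for cell in row] for row in ws.iter_rows()]
```

Grouping adjacent rows into runs keeps the number of delete_rows calls small when long stretches of a sheet are removed; calling ws.delete_rows(index, 1) once per row, highest index first, would be an equally valid and simpler variant.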

There's also the ability to delete columns, but I haven't tried that.
