Python: Writing a Pandas DataFrame to Excel with different formats for different columns

Question

Asked by sparc_spread

I am trying to write a pandas DataFrame to an .xlsx file where different numerical columns would have different formats. For example, some would show only two decimal places, some would show none, some would be formatted as percents with a "%" symbol, etc.

I noticed that DataFrame.to_html() has a formatters parameter that allows one to do just that, mapping different formats to different columns. However, there is no similar parameter on the DataFrame.to_excel() method. The most we have is a float_format that is global to all numbers.
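
For comparison, the per-column mapping that DataFrame.to_html() accepts can be sketched as follows. The column names and values are made up for illustration; the point is only that formatters takes a dict from column name to a one-argument function, which is the kind of interface to_excel() does not offer.

```python
import pandas as pd

# Illustrative data; the column names are assumptions, not from the question.
df = pd.DataFrame({"price": [1010.0, 2020.5], "rate": [0.1, 0.25]})

# Each formatter is a one-argument callable applied to that column's values.
html = df.to_html(formatters={
    "price": "{:,.2f}".format,  # thousands separator, two decimals
    "rate": "{:.0%}".format,    # percentage with no decimals
})
```

Note that this renders the values as text in the HTML output, whereas an Excel number format would leave the underlying numbers untouched.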

I have read many SO posts that are at least partly related to my question, for example:

Use the older openpyxl engine to apply formats one cell at a time. This is the approach with which I've had the most success. But it means writing loops to apply formats cell-by-cell, remembering offsets, etc.
Render percentages by changing the table data itself into strings. Going the route of altering the actual data inspired me to try dealing with decimal place formatting by calling round() on each column before writing to Excel. This works too, but I'd like to avoid altering the data.
Assorted others, mostly about date formats.

Are there other more convenient Excel-related functions/properties in the pandas API that can help here, or something similar in openpyxl, or perhaps some way to specify output format metadata directly on each column in the DataFrame that would then be interpreted downstream by different outputters?

Answer 1

Accepted answer by jmcnamara

You can do this with Pandas 0.16 and the XlsxWriter engine by accessing the underlying workbook and worksheet objects:

import pandas as pd

# Create a Pandas dataframe from some data.
# list() makes this work on Python 3 with pandas versions that reject iterators.
df = pd.DataFrame(list(zip(
    [1010, 2020, 3030, 2020, 1515, 3030, 4545],
    [.1, .2, .33, .25, .5, .75, .45],
    [.1, .2, .33, .25, .5, .75, .45],
)))

# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')

# Get the xlsxwriter objects from the dataframe writer object.
workbook  = writer.book
worksheet = writer.sheets['Sheet1']

# Add some cell formats.
format1 = workbook.add_format({'num_format': '#,##0.00'})
format2 = workbook.add_format({'num_format': '0%'})
format3 = workbook.add_format({'num_format': 'h:mm:ss AM/PM'})

# Set the column width and format.
worksheet.set_column('B:B', 18, format1)

# Set the format but not the column width.
worksheet.set_column('C:C', None, format2)

worksheet.set_column('D:D', 16, format3)

# Close the Pandas Excel writer and output the Excel file.
# (writer.save() was removed in pandas 2.0; close() also works on older versions.)
writer.close()

See also Working with Python Pandas and XlsxWriter.
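
The same idea can be wrapped in a small helper that takes a dict of column names to Excel number formats, loosely mirroring the formatters parameter of to_html(). This is only a sketch under stated assumptions: the function names column_positions and to_excel_with_formats are hypothetical, the DataFrame is assumed to have a flat (non-MultiIndex) column index, and the index is assumed to be written to the sheet, which shifts the data columns to the right.

```python
def column_positions(columns, index_levels=1):
    """Return {column name: zero-based worksheet column}, allowing for index columns."""
    return {name: offset + index_levels for offset, name in enumerate(columns)}


def to_excel_with_formats(df, path, num_formats, sheet_name="Sheet1"):
    """Write df to path, applying an Excel number format to each named column.

    num_formats maps column name -> Excel format string, e.g. {"rate": "0%"}.
    """
    import pandas as pd  # imported here so column_positions() has no dependencies

    # The index occupies the first df.index.nlevels worksheet columns.
    positions = column_positions(df.columns, index_levels=df.index.nlevels)
    with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
        df.to_excel(writer, sheet_name=sheet_name)
        workbook = writer.book
        worksheet = writer.sheets[sheet_name]
        for name, num_format in num_formats.items():
            col = positions[name]
            cell_format = workbook.add_format({"num_format": num_format})
            # A width of None keeps the default column width.
            worksheet.set_column(col, col, None, cell_format)
```

A call might look like to_excel_with_formats(df, "test.xlsx", {"price": "#,##0.00", "rate": "0%"}), where "price" and "rate" are placeholder column names. The data in the DataFrame is left unchanged; only the display format of the worksheet columns is set.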

Answer 2

Answer by Charlie Clark

As you rightly point out, applying formats to individual cells is extremely inefficient.

openpyxl 2.4 includes native support for Pandas DataFrames and named styles.

https://openpyxl.readthedocs.io/en/latest/changes.html#id7
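
As a rough illustration of how those two features might be combined, the sketch below writes a DataFrame with dataframe_to_rows and registers one named style per formatted column. It is not necessarily what the answer had in mind: the helper names resolve_styles and write_with_named_styles and the "col_<name>" style naming scheme are hypothetical, the index is not written, and the style is still assigned cell by cell, although each cell only references a shared named style.

```python
def resolve_styles(columns, style_by_column):
    """Return [(1-based worksheet column, style name)] for each styled column."""
    positions = {name: i for i, name in enumerate(columns, start=1)}
    return [(positions[name], style) for name, style in style_by_column.items()]


def write_with_named_styles(df, path, number_formats):
    """Write df (without its index) and style columns via named styles.

    number_formats maps column name -> Excel number format string.
    """
    from openpyxl import Workbook
    from openpyxl.styles import NamedStyle
    from openpyxl.utils.dataframe import dataframe_to_rows

    wb = Workbook()
    ws = wb.active
    # The first row yielded is the header; the rest are data rows.
    for row in dataframe_to_rows(df, index=False, header=True):
        ws.append(row)

    style_by_column = {}
    for name, number_format in number_formats.items():
        style_name = f"col_{name}"  # hypothetical naming scheme
        wb.add_named_style(NamedStyle(name=style_name, number_format=number_format))
        style_by_column[name] = style_name

    for col, style_name in resolve_styles(df.columns, style_by_column):
        # min_row=2 skips the header row written above.
        for (cell,) in ws.iter_rows(min_row=2, min_col=col, max_col=col):
            cell.style = style_name
    wb.save(path)
```

A call might look like write_with_named_styles(df, "test.xlsx", {"rate": "0%"}), with "rate" standing in for a real column name.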
