Note: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, link to the original, and attribute it to the original authors (not me). Original question on Stack Overflow: http://stackoverflow.com/questions/29974672/

Writing pandas DataFrame to Excel with different formats for different columns

Tags: python, excel, pandas, openpyxl

Asked by sparc_spread

I am trying to write a pandas DataFrame to an .xlsx file where different numerical columns would have different formats. For example, some would show only two decimal places, some would show none, some would be formatted as percents with a "%" symbol, etc.

I noticed that DataFrame.to_html() has a formatters parameter that allows one to do just that, mapping different formats to different columns. However, there is no similar parameter on the DataFrame.to_excel() method. The most we have is a float_format that is global to all numbers.
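
For reference, the per-column mapping that to_html() accepts looks roughly like the sketch below. The column names and sample values are made up for illustration; they are not from the question.

```python
import pandas as pd

# Hypothetical sample data: one "money-like" column and one fraction column.
df = pd.DataFrame({
    "amount": [1010.5, 2020.25],
    "share": [0.1, 0.333],
})

# formatters maps a column name to a one-argument function that
# turns each value in that column into the string to display.
html = df.to_html(formatters={
    "amount": "{:,.2f}".format,  # thousands separator, two decimals
    "share": "{:.0%}".format,    # percent with no decimals
})

print(html)
```

Each column gets its own formatter here, which is the behavior to_excel() lacks: its float_format argument takes a single format string that is applied to every float column.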

I have read many SO posts that are at least partly related to my question, for example:

Are there other more convenient Excel-related functions/properties in the pandas API that can help here, or something similar in openpyxl, or perhaps some way to specify output format metadata directly onto each column in the DataFrame that would then be interpreted downstream by different outputters?

Accepted answer by jmcnamara

You can do this with Pandas 0.16 and the XlsxWriter engine by accessing the underlying workbook and worksheet objects:

import pandas as pd

# Create a Pandas dataframe from some data.
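# zip() is wrapped in list() so this also works on Python 3,
# where zip() returns an iterator rather than a list.
df = pd.DataFrame(list(zip(
    [1010, 2020, 3030, 2020, 1515, 3030, 4545],
    [.1, .2, .33, .25, .5, .75, .45],
    [.1, .2, .33, .25, .5, .75, .45],
)))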

# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')

# Get the xlsxwriter objects from the dataframe writer object.
workbook  = writer.book
worksheet = writer.sheets['Sheet1']

# Add some cell formats.
format1 = workbook.add_format({'num_format': '#,##0.00'})
format2 = workbook.add_format({'num_format': '0%'})
format3 = workbook.add_format({'num_format': 'h:mm:ss AM/PM'})

# Set the column width and format.
worksheet.set_column('B:B', 18, format1)

# Set the format but not the column width.
worksheet.set_column('C:C', None, format2)

worksheet.set_column('D:D', 16, format3)

# Close the Pandas Excel writer and output the Excel file.
# (writer.save() was removed in pandas 2.0; close() also writes the file.)
writer.close()

See also Working with Python Pandas and XlsxWriter.
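
If you want something closer to the formatters mapping from the question, the same idea can be wrapped in a small helper. This is only a sketch: write_with_formats, its parameters, and the sample data are hypothetical names of mine, not part of pandas or XlsxWriter. It assumes the xlsxwriter package is installed and that column names are unique.

```python
import os
import tempfile

import pandas as pd


def write_with_formats(df, path, num_formats, sheet_name="Sheet1"):
    """Write df to path, applying one Excel number format per column.

    num_formats maps a column name to an Excel number format string.
    Hypothetical helper for illustration only.
    """
    with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
        # index=False keeps DataFrame column i in worksheet column i.
        df.to_excel(writer, sheet_name=sheet_name, index=False)
        workbook = writer.book
        worksheet = writer.sheets[sheet_name]
        for name, num_format in num_formats.items():
            col = df.columns.get_loc(name)  # zero-based column position
            cell_format = workbook.add_format({"num_format": num_format})
            # A width of None sets the format without changing the width.
            worksheet.set_column(col, col, None, cell_format)


# Made-up sample data, written to a temporary directory.
df = pd.DataFrame({
    "amount": [1010, 2020, 3030],
    "share": [0.1, 0.2, 0.33],
})
out_path = os.path.join(tempfile.mkdtemp(), "formatted.xlsx")
write_with_formats(df, out_path, {"amount": "#,##0.00", "share": "0%"})
```

Using the writer as a context manager closes it and writes the file when the with block ends, so no explicit save or close call is needed.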

Answer by Charlie Clark

As you rightly point out, applying formats to individual cells is extremely inefficient.

openpyxl 2.4 includes native support for pandas DataFrames and named styles.

https://openpyxl.readthedocs.io/en/latest/changes.html#id7
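
A rough sketch of what that could look like. dataframe_to_rows and NamedStyle come from openpyxl; the style names, column layout, and sample data are assumptions made up for illustration, so adjust them to your own sheet.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook
from openpyxl.styles import NamedStyle
from openpyxl.utils.dataframe import dataframe_to_rows

# Made-up sample data: column A will hold "amount", column B "share".
df = pd.DataFrame({
    "amount": [1010.5, 2020.25],
    "share": [0.1, 0.333],
})

wb = Workbook()
ws = wb.active

# Copy the DataFrame into the worksheet, header row first.
for row in dataframe_to_rows(df, index=False, header=True):
    ws.append(row)

# Define each number format once as a named style on the workbook.
money_style = NamedStyle(name="money_style", number_format="#,##0.00")
percent_style = NamedStyle(name="percent_style", number_format="0%")
wb.add_named_style(money_style)
wb.add_named_style(percent_style)

# Assign the styles by name, skipping the header cell in row 1.
for cell in ws["A"][1:]:
    cell.style = "money_style"
for cell in ws["B"][1:]:
    cell.style = "percent_style"

out_path = os.path.join(tempfile.mkdtemp(), "styled.xlsx")
wb.save(out_path)
```

The styles are still assigned cell by cell in this sketch; the difference is that each format is defined once on the workbook and referenced by name, rather than building a new style object for every cell.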
