Output different precision by column with pandas.DataFrame.to_csv()?
Original source: http://stackoverflow.com/questions/20003290/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by ryanjdillon
Question
Is it possible to specify a separate float precision for each column written by the pandas method pandas.DataFrame.to_csv?
Background
If I have a pandas dataframe that is arranged like this:
In [53]: df_data[:5]
Out[53]:
year month day lats lons vals
0 2012 6 16 81.862745 -29.834254 0.0
1 2012 6 16 81.862745 -29.502762 0.1
2 2012 6 16 81.862745 -29.171271 0.0
3 2012 6 16 81.862745 -28.839779 0.2
4 2012 6 16 81.862745 -28.508287 0.0
There is the float_format option, which can be used to specify a precision, but it applies that precision to every float column of the dataframe when it is written.
When I use it like so:
df_data.to_csv(outfile, index=False,
header=False, float_format='%11.6f')
I get the following, where vals is written with more decimal places than the data actually has:
2012,6,16, 81.862745, -29.834254, 0.000000
2012,6,16, 81.862745, -29.502762, 0.100000
2012,6,16, 81.862745, -29.171270, 0.000000
2012,6,16, 81.862745, -28.839779, 0.200000
2012,6,16, 81.862745, -28.508287, 0.000000
Accepted answer by hknust
Convert the column "vals" to formatted strings before exporting the data frame to a CSV file. float_format only affects float columns, so the string column is written as-is:
df_data['vals'] = df_data['vals'].map(lambda x: '%2.1f' % x)
df_data.to_csv(outfile, index=False, header=False, float_format='%11.6f')
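The same idea can be applied to several columns at once, and to a copy so that the original dataframe keeps its numeric types. The following is only a sketch under assumptions: the sample data, the column_formats mapping, and the use of an in-memory buffer in place of outfile are illustrative and not part of the original answer.

```python
import io

import pandas as pd

# Small sample shaped like the dataframe in the question (illustrative data)
df_data = pd.DataFrame({
    "year": [2012, 2012],
    "month": [6, 6],
    "day": [16, 16],
    "lats": [81.862745, 81.862745],
    "lons": [-29.834254, -29.502762],
    "vals": [0.0, 0.1],
})

# Hypothetical mapping of column name -> format string
column_formats = {"lats": "{:.6f}", "lons": "{:.6f}", "vals": "{:.1f}"}

# Format a copy so df_data itself keeps its float columns
formatted = df_data.copy()
for column, fmt in column_formats.items():
    formatted[column] = formatted[column].map(fmt.format)

# Write to an in-memory buffer here; a file path works the same way
buffer = io.StringIO()
formatted.to_csv(buffer, index=False, header=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Because the formatted columns are strings, float_format is no longer needed for them; untouched integer columns are written unchanged.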
Answer by mattexx
You can do this with to_string. It has a formatters argument that accepts a dict mapping column names to formatter functions. You can then use a regular expression to replace the default column separators (whitespace) with your delimiter of choice.
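A minimal sketch of that approach follows. The sample data, the chosen formats, and the whitespace-to-comma regular expression are assumptions for illustration; the regex step is only safe when no formatted value contains whitespace itself.

```python
import re

import pandas as pd

# Illustrative data with the float columns from the question
df = pd.DataFrame({
    "lats": [81.862745, 81.862745],
    "lons": [-29.834254, -29.502762],
    "vals": [0.0, 0.1],
})

# One formatter callable per column name
formatters = {
    "lats": "{:.6f}".format,
    "lons": "{:.6f}".format,
    "vals": "{:.1f}".format,
}

# Render as aligned text without the index or header row
text = df.to_string(formatters=formatters, index=False, header=False)

# Collapse each run of whitespace between columns into a single comma
lines = [re.sub(r"\s+", ",", line.strip()) for line in text.splitlines()]
csv_text = "\n".join(lines)
print(csv_text)
```

The dataframe is left unmodified, at the cost of building the whole output as one string in memory.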
Answer by Michael Szczepaniak
A more current version of hknust's first line, using str.format instead of the % operator, would be:
df_data['vals'] = df_data['vals'].map(lambda x: '{0:.1}'.format(x))
To print without scientific notation:
df_data['vals'] = df_data['vals'].map(lambda x: '{0:.1f}'.format(x))
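The difference between the two format specs shows up on larger values: without a type character, '.1' means one significant digit and can switch to scientific notation, while '.1f' always means one digit after the decimal point. A small illustration, using a value from the lats column purely as an example:

```python
value = 81.862745

# No type character: precision counts significant digits
general = '{0:.1}'.format(value)

# 'f' type: precision counts digits after the decimal point
fixed = '{0:.1f}'.format(value)

print(general)  # 8e+01
print(fixed)    # 81.9
```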
Answer by nealmcb
The to_string approach suggested by @mattexx looks better to me, since it doesn't modify the dataframe.
It also generalizes well when using jupyter notebooks to get pretty HTML output, via the to_html method. Here we set a new default precision of 4, and override it to get 5 decimal places for a particular column named wider:
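import pandas as pd
from IPython.display import HTML, display

# Full option name; the short form 'precision' may not resolve uniquely in newer pandas versions
pd.set_option('display.precision', 4)
# df is an existing dataframe; only the column named 'wider' gets the custom format
display(HTML(df.to_html(formatters={'wider': '{:,.5f}'.format})))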

