Note: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/23981601/

Retrieved: 2020-08-19 03:46:22. Source: igfitidea

Format certain floating dataframe columns into percentage in pandas

Tags: python, pandas, formatting, ipython-notebook

Asked by user3576212

I am trying to write a paper in IPython notebook, but encountered some issues with the display format. Say I have the following dataframe df. Is there any way to format var1 and var2 as 2-digit decimals and var3 as percentages?

       var1        var2         var3    
id                                              
0    1.458315    1.500092   -0.005709   
1    1.576704    1.608445   -0.005122    
2    1.629253    1.652577   -0.004754    
3    1.669331    1.685456   -0.003525   
4    1.705139    1.712096   -0.003134   
5    1.740447    1.741961   -0.001223   
6    1.775980    1.770801   -0.001723    
7    1.812037    1.799327   -0.002013    
8    1.853130    1.822982   -0.001396    
9    1.943985    1.868401    0.005732

The numbers inside are not multiplied by 100, e.g. -0.0057=-0.57%.

Accepted answer by Woody Pride

Replace the values using the round function, and format the string representation of the percentage numbers:

df['var2'] = pd.Series([round(val, 2) for val in df['var2']], index = df.index)
df['var3'] = pd.Series(["{0:.2f}%".format(val * 100) for val in df['var3']], index = df.index)

The round function rounds a floating point number to the number of decimal places provided as second argument to the function.

String formatting allows you to represent the numbers as you wish. You can change the number of decimal places shown by changing the number before the f.
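
As a small illustration of the point above, the sketch below uses two values taken from the question's table (the variable names `value` and `ratio` are made up for this example). It shows that round returns a float, while string formatting returns text whose precision is set by the number before the f.

```python
value = 1.458315   # first var1 value from the question
ratio = -0.005709  # first var3 value from the question

rounded = round(value, 2)                      # still a float
two_places = "{0:.2f}".format(value)           # string with 2 decimals
four_places = "{0:.4f}".format(value)          # string with 4 decimals
as_percent = "{0:.2f}%".format(ratio * 100)    # scale by 100, then append '%'

print(rounded)      # 1.46
print(two_places)   # 1.46
print(four_places)  # 1.4583
print(as_percent)   # -0.57%
```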

P.S. I was not sure whether your 'percentage' numbers had already been multiplied by 100. If they have, you will want to change the number of decimals displayed and remove the multiplication by 100.

Answer by Romain Jouin

You could also set the default format for floats:

pd.options.display.float_format = '{:.2f}%'.format
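
It may help to see what this setting does in practice. The sketch below uses a two-row frame built from the question's values and assumes default display options otherwise. It suggests that the option applies to every float column and only appends a percent sign, without multiplying by 100.

```python
import pandas as pd

df = pd.DataFrame({
    "var1": [1.458315, 1.576704],
    "var3": [-0.005709, -0.005122],
})

# Global display option: affects how ALL float columns are printed
pd.options.display.float_format = "{:.2f}%".format
shown = str(df)

# Restore the default so later output is not affected
pd.reset_option("display.float_format")

print(shown)
# var1 is shown as 1.46% and var3 as -0.01%: values are not scaled by 100
```

If only var3 should be shown as a percentage, a per-column approach is probably a better fit than this global option.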

Answer by linqu

The accepted answer suggests modifying the raw data for presentation purposes, which is something you generally do not want. Imagine you need to do further analysis with these columns and need the precision you lost by rounding.

You can modify the formatting of individual columns in data frames, in your case:

output = df.to_string(formatters={
    'var1': '{:,.2f}'.format,
    'var2': '{:,.2f}'.format,
    'var3': '{:,.2%}'.format
})
print(output)

For your information, '{:,.2%}'.format(0.214) yields 21.40%, so there is no need to multiply by 100.
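
A short check of that claim, using a one-row frame made up from the question's values: the % presentation type scales by 100 and appends the sign itself, and passing formatters to to_string leaves the stored floats untouched.

```python
import pandas as pd

df = pd.DataFrame({"var2": [1.500092], "var3": [-0.005709]})

pct = "{:,.2%}".format          # '%' type multiplies by 100 and adds '%'
example = pct(0.214)
print(example)                  # 21.40%

# Formatting happens only in the text output, not in the data
text = df.to_string(formatters={"var2": "{:,.2f}".format, "var3": pct})
print(text)
print(df["var3"].dtype)         # float64
print(df["var3"].iloc[0])       # -0.005709
```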

You don't have a nice HTML table anymore, only a text representation. If you need to stay with HTML, use the to_html function instead.

from IPython.display import display, HTML
output = df.to_html(formatters={
    'var1': '{:,.2f}'.format,
    'var2': '{:,.2f}'.format,
    'var3': '{:,.2%}'.format
})
display(HTML(output))

Update

As of pandas 0.17.1, life got easier and we can get a beautiful html table right away:

df.style.format({
    'var1': '{:,.2f}'.format,
    'var2': '{:,.2f}'.format,
    'var3': '{:,.2%}'.format,
})

Answer by mdeff

As suggested by @linqu you should not change your data for presentation. Since pandas 0.17.1, (conditional) formatting was made easier. Quoting the documentation:

You can apply conditional formatting, the visual styling of a DataFrame depending on the data within, by using the DataFrame.style property. This is a property that returns a pandas.Styler object, which has useful methods for formatting and displaying DataFrames.

For your example, that would be (the usual table will show up in Jupyter):

df.style.format({
    'var1': '{:,.2f}'.format,
    'var2': '{:,.2f}'.format,
    'var3': '{:,.2%}'.format,
})

Answer by circld

As a similar approach to the accepted answer that might be considered a bit more readable, elegant, and general (YMMV), you can leverage the map method:

# OP example
df['var3'].map(lambda n: '{:,.2%}'.format(n))

# also works on a series
series_example.map(lambda n: '{:,.2%}'.format(n))

Performance-wise, this is pretty close to (marginally slower than) the OP solution.
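
One detail worth noting about the map approach is that it returns a new Series of strings and leaves the original column alone unless you assign the result back. A minimal sketch, using two var3 values from the question (the name `formatted` is just for illustration):

```python
import pandas as pd

df = pd.DataFrame({"var3": [-0.005709, 0.005732]})

# The bound format method can be passed to map directly, without a lambda
formatted = df["var3"].map("{:,.2%}".format)

print(formatted.tolist())   # ['-0.57%', '0.57%']
print(df["var3"].dtype)     # float64, the source column is unchanged
```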

As an aside, if you do choose to go the pd.options.display.float_format route, consider using a context manager to handle state, as in this parallel numpy example.
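
For pandas itself, one way to do this is pd.option_context, which sets an option only inside a with block. The sketch below is an illustration with a two-row frame made up from the question's values, not the numpy example the answer refers to.

```python
import pandas as pd

df = pd.DataFrame({"var1": [1.458315, 1.576704]})

# The option is active only inside the with block
with pd.option_context("display.float_format", "{:.2f}".format):
    inside = df.to_string()

# Outside the block the previous setting is restored automatically
outside = df.to_string()

print(inside)
print(outside)
```

Here `inside` shows the values with two decimals, while `outside` shows them with the default precision again.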

Answer by RK1

Just another way of doing it, should you need to do it over a larger range of columns:

Using applymap:

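# Note: on pandas >= 2.1, DataFrame.applymap is deprecated in favor of DataFrame.map
df[['var1','var2']] = df[['var1','var2']].applymap("{0:.2f}".format)
# df['var3'] is a Series, which has no applymap method; use map instead
df['var3'] = df['var3'].map(lambda x: "{0:.2f}%".format(x*100))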

applymap is useful if you need to apply the function over multiple columns; it's essentially an abbreviation of the below for this specific example:

df[['var1','var2']].apply(lambda col: col.map('{:.2f}'.format))
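
To make the equivalence concrete, the sketch below formats the same two columns both ways and compares the results. It assumes a small frame built from the question's values. The hasattr check is only there because DataFrame.map was added in pandas 2.1 as the replacement for applymap.

```python
import pandas as pd

df = pd.DataFrame({
    "var1": [1.458315, 1.576704],
    "var2": [1.500092, 1.608445],
})
fmt = "{:.2f}".format

# Element-wise over the whole frame (DataFrame.map on pandas >= 2.1,
# DataFrame.applymap on older versions)
elementwise = df.map(fmt) if hasattr(df, "map") else df.applymap(fmt)

# Column by column: apply hands each column (a Series) to the lambda,
# and Series.map formats each value in it
columnwise = df.apply(lambda col: col.map(fmt))

print(elementwise.values.tolist())
print(columnwise.values.tolist())
```

Both calls return new frames of strings, so the result has to be assigned back if the formatted text should replace the original columns.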

Great explanation below of apply, map, and applymap:

Difference between map, applymap and apply methods in Pandas

Answer by Poudel

Often we want calculations to use the full precision of the data, but for visual aesthetics we may want to see only a few decimal places when we display the dataframe.

In jupyter-notebook, pandas can render HTML-formatted output through the DataFrame property called style.

To show just two decimal places for some columns, we can use this code snippet:

Given dataframe

import numpy as np
import pandas as pd

df = pd.DataFrame({'var1': [1.458315, 1.576704, 1.629253, 1.669331, 1.705139, 1.740447, 1.775980, 1.812037, 1.853130, 1.943985],
          'var2': [1.500092, 1.608445, 1.652577, 1.685456, 1.712096, 1.741961, 1.770801, 1.799327, 1.822982, 1.868401],
          'var3': [-0.005709, -0.005122, -0.004754, -0.003525, -0.003134, -0.001223, -0.001723, -0.002013, -0.001396, 0.005732]})

print(df)
       var1      var2      var3
0  1.458315  1.500092 -0.005709
1  1.576704  1.608445 -0.005122
2  1.629253  1.652577 -0.004754
3  1.669331  1.685456 -0.003525
4  1.705139  1.712096 -0.003134
5  1.740447  1.741961 -0.001223
6  1.775980  1.770801 -0.001723
7  1.812037  1.799327 -0.002013
8  1.853130  1.822982 -0.001396
9  1.943985  1.868401  0.005732

Style to get required format

    df.style.format({'var1': "{:.2f}",'var2': "{:.2f}",'var3': "{:.2%}"})

Gives:

    var1    var2    var3
id
0   1.46    1.50    -0.57%
1   1.58    1.61    -0.51%
2   1.63    1.65    -0.48%
3   1.67    1.69    -0.35%
4   1.71    1.71    -0.31%
5   1.74    1.74    -0.12%
6   1.78    1.77    -0.17%
7   1.81    1.80    -0.20%
8   1.85    1.82    -0.14%
9   1.94    1.87    0.57%

Update

If the display command is not found, try the following:

from IPython.display import display

df_style = df.style.format({'var1': "{:.2f}",'var2': "{:.2f}",'var3': "{:.2%}"})

display(df_style)

Requirements

  • To use the display command, you need to have IPython installed on your machine.
  • The display command does not work in online Python interpreters that do not have IPython installed, such as https://repl.it/languages/python3
  • The display command works out of the box in jupyter-notebook, jupyter-lab, Google-colab, kaggle-kernels, IBM-watson, Mode-Analytics and many other platforms; you do not even have to import display from IPython.display.
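
If the same script has to run both inside and outside a notebook, one option is to fall back to plain text when IPython cannot be imported. The sketch below is one way to do that. The helper name `show` and the two-row frame are made up for this example, and it uses to_html and to_string rather than the style property.

```python
import pandas as pd

df = pd.DataFrame({
    "var1": [1.458315, 1.576704],
    "var3": [-0.005709, -0.005122],
})
formatters = {"var1": "{:.2f}".format, "var3": "{:.2%}".format}


def show(frame):
    """Display as HTML when IPython is importable, otherwise print text.

    Returns the plain-text rendering in both cases.
    """
    text = frame.to_string(formatters=formatters)
    try:
        from IPython.display import HTML, display
    except ImportError:
        # No IPython available: fall back to the text table
        print(text)
    else:
        display(HTML(frame.to_html(formatters=formatters)))
    return text


rendered = show(df)
```

In a notebook this renders the usual HTML table. In an environment without IPython it prints the text table instead.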