pandas: convert columns to percentages of the totals

Notice: this page is taken from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/42006346/

Time: 2020-09-14 02:54:05  Source: igfitidea

python pandas

Asked by DTATSO

I have a dataframe with four columns: an ID and three categories that results fell into:

    <80%  80-90  >90
id
1      2      4    4
2      3      6    1
3      7      0    3

I would like to convert it to percentages, i.e.:

    <80%  80-90  >90
id
1    20%    40%  40%
2    30%    60%  10%
3    70%     0%  30%

This seems like it should be within pandas' capabilities, but I just can't figure it out.

Thanks in advance!

Answered by ASGM

You can do this using the basic pandas methods .div and .sum, using the axis argument to make sure the calculations happen the way you want:

cols = ['<80%', '80-90', '>90']
df[cols] = df[cols].div(df[cols].sum(axis=1), axis=0).multiply(100)
  • Calculate the sum of each row (df[cols].sum(axis=1)). axis=1 makes the summation run across the columns of each row, rather than down each column.
  • Divide the dataframe by the resulting series (df[cols].div(df[cols].sum(axis=1), axis=0)). axis=0 aligns the series with the dataframe's index, so each row is divided by its own total.
  • To finish, multiply the results by 100 so they are percentages between 0 and 100 instead of proportions between 0 and 1 (or you can skip this step and store them as proportions).
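The steps above can be tried end to end with a minimal sketch. It rebuilds the question's table by hand (the construction of df, and the names row_totals, pct and labels, are assumptions for illustration and not part of the original answer) and, as an optional extra, formats the values as strings with a trailing % sign like the desired output in the question.

```python
import pandas as pd

# Rebuild the example table from the question (assumed construction).
df = pd.DataFrame(
    {"<80%": [2, 3, 7], "80-90": [4, 6, 0], ">90": [4, 1, 3]},
    index=pd.Index([1, 2, 3], name="id"),
)
cols = ["<80%", "80-90", ">90"]

# Step 1: one total per row (axis=1 sums across the columns).
row_totals = df[cols].sum(axis=1)

# Step 2 and 3: divide each row by its own total (axis=0 aligns on the
# index), then scale from proportions to percentages.
pct = df[cols].div(row_totals, axis=0).multiply(100)
print(pct)

# Optional: turn the numbers into strings such as "20%".
labels = pct.round(0).astype(int).astype(str) + "%"
print(labels)
```

Keeping pct numeric and only building labels for display is usually the safer choice, since the string version can no longer be summed or sorted as numbers.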

Answered by Tim Tian

df/df.sum()

If you want to divide by the sum of each row, transpose it first.
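As a rough illustration of the difference, the sketch below applies both variants to the question's table. The dataframe construction and the names by_column and by_row are assumptions made for this example; note that df/df.sum() on its own divides each value by its column total, and the result of the transposed division has to be transposed back to recover the original layout.

```python
import pandas as pd

# Example table from the question (assumed construction), with id as index.
df = pd.DataFrame(
    {"<80%": [2, 3, 7], "80-90": [4, 6, 0], ">90": [4, 1, 3]},
    index=pd.Index([1, 2, 3], name="id"),
)

# Column-wise: each value divided by the total of its column.
by_column = df / df.sum()

# Row-wise: transpose, divide by the (now column) totals, transpose back.
by_row = (df.T / df.T.sum()).T

print(by_column)
print(by_row)
```

The transposed form should give the same proportions as df.div(df.sum(axis=1), axis=0), just written with plain operators.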

Answered by FDV

Tim Tian's answer pretty much worked for me, but maybe this helps if you have a df with several columns and want to compute percentages column-wise.

df_pct = df/df[df.columns].sum()*100

I was having trouble because I wanted to have the result of a pd.pivot_table expressed as a %, but couldn't get it to work. So I just used that code on the resulting table itself and it worked.
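A small sketch of that pivot-table case might look as follows. The records data, its column names (region, product, sales) and the variable names are all hypothetical, invented here only to have something to pivot; the percentage step is the same column-wise division shown in this answer.

```python
import pandas as pd

# Hypothetical long-format data, made up for illustration.
records = pd.DataFrame(
    {
        "region": ["north", "north", "south", "south"],
        "product": ["a", "b", "a", "b"],
        "sales": [10, 30, 40, 20],
    }
)

# Build the pivot table first, with plain sums in each cell.
table = pd.pivot_table(
    records, index="region", columns="product", values="sales", aggfunc="sum"
)

# Then express each cell as a percentage of its column total.
table_pct = table / table.sum() * 100
print(table_pct)
```

Doing the percentage step on the finished table keeps the pivot call simple, at the cost of one extra line.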