Python Pandas - Convert column to percentage on Groupby DF

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original URL: http://stackoverflow.com/questions/45591918/

python pandas

Asked by ScoutEU

I have a dataframe that I created by a groupby:

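import pandas as pd

# hm01 is the raw data; its construction is not shown in the question
hmdf = pd.DataFrame(hm01)
new_hm01 = hmdf[['FinancialYear','Month','FirstReceivedDate']]

# Count of records per financial year and month
hm05 = new_hm01.pivot_table(index=['FinancialYear','Month'], aggfunc='count')
# Month labels in financial-year order (padded, presumably to match the Month values in the data)
vals1 = ['April    ', 'May      ', 'June     ', 'July     ', 'August   ', 'September', 'October  ', 'November ', 'December ', 'January  ', 'February ', 'March    ']

# Rows = months, columns = financial years (as strings), values = record counts
df_hm = new_hm01.groupby(['Month', 'FinancialYear']).size().unstack(fill_value=0).rename(columns=lambda x: '{}'.format(x))
# Reorder the rows to follow vals1
df_hml = df_hm.reindex(vals1)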
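
Since `hm01` is not shown, the snippet above cannot be run as posted. The following is a minimal sketch of the same groupby / size / unstack / reindex pattern on a few made-up records. The `raw` frame, its values, and the names `counts` and `month_order` are illustrative assumptions, not the asker's data.

```python
import pandas as pd

# Hypothetical raw records: one row per received item (not the asker's data).
raw = pd.DataFrame({
    "FinancialYear": ["2014/2015", "2014/2015", "2015/2016", "2015/2016", "2015/2016"],
    "Month": ["April", "May", "April", "April", "June"],
})

# Count rows per (Month, FinancialYear), then move the years into columns.
# Combinations that never occur are filled with 0 instead of NaN.
counts = raw.groupby(["Month", "FinancialYear"]).size().unstack(fill_value=0)

# groupby sorts the months alphabetically, so reorder them explicitly.
month_order = ["April", "May", "June"]
counts = counts.reindex(month_order, fill_value=0)
```

Note that `reindex` matches labels exactly, so the labels in the ordering list have to be spelled (and padded) the same way as the values in the `Month` column.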

The DF looks like this:

FinancialYear   2014/2015   2015/2016   2016/2017   2017/2018
Month               
April               34          24          22          20
May                 29          26          21          25
June                19          39          22          20
July                23          39          18          20
August              36          30          34           0
September           35          23          41           0
October             36          37          27           0
November            38          31          30           0
December            36          41          23           0
January             34          30          35           0
February            37          26          37           0
March               36          31          33           0

The column names come from variables (threeYr, twoYr, oneYr, Yr). I want to convert the dataframe so that each number is a percentage of its column's total, but I can't get it to work.

This is what I want:

FinancialYear       2014/2015   2015/2016   2016/2017   2017/2018
Month               
April                   9%          6%          6%         24%
May                     7%          7%          6%         29%
June                    5%         10%          6%         24%
July                    6%         10%          5%         24%
August                  9%          8%         10%          0%
September               9%          6%         12%          0%
October                 9%         10%          8%          0%
November               10%          8%          9%          0%
December                9%         11%          7%          0%
January                 9%          8%         10%          0%
February                9%          7%         11%          0%
March                   9%          8%         10%          0%

Could anyone help me with doing this?

Edit: I tried the answer at this link: "pandas convert columns to percentages of the totals". I could not get it to work for my dataframe, and it does not explain well (to me) how to make it work for any DataFrame. In my opinion, the answer from John Galt is better than that one.

Answered by Zero

Here's one way:

In [1371]: (100. * df / df.sum()).round(0)
Out[1371]:
               2014/2015  2015/2016  2016/2017  2017/2018
FinancialYear
April                9.0        6.0        6.0       24.0
May                  7.0        7.0        6.0       29.0
June                 5.0       10.0        6.0       24.0
July                 6.0       10.0        5.0       24.0
August               9.0        8.0       10.0        0.0
September            9.0        6.0       12.0        0.0
October              9.0       10.0        8.0        0.0
November            10.0        8.0        9.0        0.0
December             9.0       11.0        7.0        0.0
January              9.0        8.0       10.0        0.0
February             9.0        7.0       11.0        0.0
March                9.0        8.0       10.0        0.0
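
To see why this one-liner works, it may help to split it up: `df.sum()` returns a Series with one total per column, and dividing a DataFrame by a Series aligns that Series with the DataFrame's column labels, so every cell is divided by its own column's total. A self-contained sketch using the counts from the question's table (the names `months`, `col_totals`, and `pct` are chosen here for illustration):

```python
import pandas as pd

months = ["April", "May", "June", "July", "August", "September",
          "October", "November", "December", "January", "February", "March"]

# Counts copied from the table in the question.
df = pd.DataFrame(
    {
        "2014/2015": [34, 29, 19, 23, 36, 35, 36, 38, 36, 34, 37, 36],
        "2015/2016": [24, 26, 39, 39, 30, 23, 37, 31, 41, 30, 26, 31],
        "2016/2017": [22, 21, 22, 18, 34, 41, 27, 30, 23, 35, 37, 33],
        "2017/2018": [20, 25, 20, 20, 0, 0, 0, 0, 0, 0, 0, 0],
    },
    index=pd.Index(months, name="Month"),
)

col_totals = df.sum()          # Series indexed by column label, one total per year
pct = 100.0 * df / col_totals  # the division aligns col_totals with df's columns
```

Each column of `pct` sums to 100 before rounding. A column whose total is 0 would come out as NaN (0 divided by 0), which may need separate handling.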

And if you want the values rounded to 1 decimal place, as strings with a '%' suffix:

In [1375]: (100. * df / df.sum()).round(1).astype(str) + '%'
Out[1375]:
              2014/2015 2015/2016 2016/2017 2017/2018
FinancialYear
April              8.7%      6.4%      6.4%     23.5%
May                7.4%      6.9%      6.1%     29.4%
June               4.8%     10.3%      6.4%     23.5%
July               5.9%     10.3%      5.2%     23.5%
August             9.2%      8.0%      9.9%      0.0%
September          8.9%      6.1%     12.0%      0.0%
October            9.2%      9.8%      7.9%      0.0%
November           9.7%      8.2%      8.7%      0.0%
December           9.2%     10.9%      6.7%      0.0%
January            8.7%      8.0%     10.2%      0.0%
February           9.4%      6.9%     10.8%      0.0%
March              9.2%      8.2%      9.6%      0.0%
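
The question asked for whole-number percentages such as `9%`. One possible variation on the answer above, not part of the original answer, is to round to 0 decimals and cast to `int` before converting to strings, which drops the trailing `.0`. The sketch below uses two columns of the question's data; `pct_labels` is a name chosen here for illustration.

```python
import pandas as pd

months = ["April", "May", "June", "July", "August", "September",
          "October", "November", "December", "January", "February", "March"]

# Two of the columns from the question's table, to keep the example short.
df = pd.DataFrame(
    {
        "2014/2015": [34, 29, 19, 23, 36, 35, 36, 38, 36, 34, 37, 36],
        "2017/2018": [20, 25, 20, 20, 0, 0, 0, 0, 0, 0, 0, 0],
    },
    index=pd.Index(months, name="Month"),
)

pct = 100.0 * df / df.sum()

# Round to whole numbers, cast to int to drop the ".0", then append "%".
# The int cast assumes no NaN values, i.e. no column total is 0.
pct_labels = pct.round(0).astype(int).astype(str) + "%"
```

The result holds strings, so it is suited to display only; keeping `pct` around as the numeric version is probably wise if further calculations are needed.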