Pandas pivot table percent calculations

Notice: this page is a Chinese-English translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/37148787/

Date: 2020-09-14 01:12:43  Source: igfitidea

python-3.x pandas pivot-table percentage

Asked by Dance Party2

Given the following data frame and pivot table:

import numpy as np
import pandas as pd
df=pd.DataFrame({'A':['x','y','z','x','y','z'],
                 'B':['one','one','one','two','two','two'],
                 'C':[2,18,2,8,2,18]})
df

    A   B       C
0   x   one     2
1   y   one     18
2   z   one     2
3   x   two     8
4   y   two     2
5   z   two     18

table = pd.pivot_table(df, index=['A', 'B'],aggfunc=np.sum)

            C
A   B   
x   one     2
    two     8
y   one     18
    two     2
z   one     2
    two     18

I'd like to add 2 columns to this pivot table; one showing the percent of all values and another for percent within column A like this:

           C    % of Total  % of B
A   B
x   one    2    4%          20%
    two    8    16%         80%
y   one   18    36%         90%
    two    2    4%          10%
z   one    2    4%          10%
    two   18    36%         90%

Extra Credit:

I'd like a bottom summary row which has the sum of column C (it's okay if it also has 100% for the next 2 columns, but nothing is needed for those).

Answered by jezrael

You can use:

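# Percent of the grand total, formatted as a string with a trailing '%'
table['% of Total'] = (table.C / table.C.sum() * 100).astype(str) + '%'
# Percent within each value of A (index level 0)
table['% of B'] = (table.C / table.groupby(level=0).C.transform('sum') * 100).astype(str) + '%'
print(table)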
        C % of Total % of B
A B                        
x one   2       4.0%  20.0%
  two   8      16.0%  80.0%
y one  18      36.0%  90.0%
  two   2       4.0%  10.0%
z one   2       4.0%  10.0%
  two  18      36.0%  90.0%

With real data, though, I think casting to int is not recommended; it is better to use round.
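
A minimal sketch of the rounding suggestion, assuming the same df as in the question. The one-decimal precision and the intermediate names pct_total and pct_group are arbitrary choices for illustration, not something the answer specifies:

```python
import pandas as pd

df = pd.DataFrame({'A': ['x', 'y', 'z', 'x', 'y', 'z'],
                   'B': ['one', 'one', 'one', 'two', 'two', 'two'],
                   'C': [2, 18, 2, 8, 2, 18]})
table = pd.pivot_table(df, index=['A', 'B'], aggfunc='sum')

# Share of the grand total and share within each value of A, as floats.
pct_total = table['C'] / table['C'].sum() * 100
pct_group = table['C'] / table.groupby(level='A')['C'].transform('sum') * 100

# Round first, then format as strings; one decimal place is an assumption.
table['% of Total'] = pct_total.round(1).astype(str) + '%'
table['% of B'] = pct_group.round(1).astype(str) + '%'
print(table)
```

Rounding keeps the fractional part under control, whereas casting to int would silently truncate values such as 33.9 down to 33.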

Extra Credit:

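# Keep the percent columns numeric so they can be summed
table['% of Total'] = (table.C / table.C.sum() * 100)
table['% of B'] = (table.C / table.groupby(level=0).C.transform('sum') * 100)
# Append a summary row holding the sum of every column
table.loc['total', :] = table.sum().values
print(table)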
              C  % of Total  % of B
A     B                            
x     one   2.0         4.0    20.0
      two   8.0        16.0    80.0
y     one  18.0        36.0    90.0
      two   2.0         4.0    10.0
z     one   2.0         4.0    10.0
      two  18.0        36.0    90.0
total      50.0       100.0   300.0
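
Note that the summed % of B column shows 300.0, because each group in A already adds up to 100. One possible alternative, sketched under the assumption that building the summary row explicitly and appending it with pd.concat is acceptable. The ('total', '') label, the names total and result, and the choice to report 100 in both percent columns (which the question says is fine) are assumptions for illustration:

```python
import pandas as pd

df = pd.DataFrame({'A': ['x', 'y', 'z', 'x', 'y', 'z'],
                   'B': ['one', 'one', 'one', 'two', 'two', 'two'],
                   'C': [2, 18, 2, 8, 2, 18]})
table = pd.pivot_table(df, index=['A', 'B'], aggfunc='sum')
table['% of Total'] = table['C'] / table['C'].sum() * 100
table['% of B'] = table['C'] / table.groupby(level='A')['C'].transform('sum') * 100

# Build the summary row by hand instead of summing every column:
# C is summed, both percent columns are set to 100 (an assumption).
total = pd.DataFrame(
    {'C': [table['C'].sum()], '% of Total': [100.0], '% of B': [100.0]},
    index=pd.MultiIndex.from_tuples([('total', '')], names=table.index.names),
)

# Appending via concat keeps column C as integers.
result = pd.concat([table, total])
print(result)
```

Building the row separately also avoids turning column C into floats, which is what happens when the row is added through table.loc in the output above.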