Pivot table subtotals in Pandas
Note: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
Original question: http://stackoverflow.com/questions/41383302/
Asked by Tony
I have the following data:
Employee  Account  Currency  Amount  Location
Test 2    Basic    USD         3000  Airport
Test 2    Net      USD         2000  Airport
Test 1    Basic    USD         4000  Town
Test 1    Net      USD         3000  Town
Test 3    Basic    GBP         5000  Town
Test 3    Net      GBP         4000  Town
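The question does not show how df is created. A minimal sketch for reproducing it, assuming the rows above are typed in by hand rather than loaded from a file:

```python
import pandas as pd

# Sample rows copied from the table in the question.
df = pd.DataFrame({
    'Employee': ['Test 2', 'Test 2', 'Test 1', 'Test 1', 'Test 3', 'Test 3'],
    'Account':  ['Basic', 'Net', 'Basic', 'Net', 'Basic', 'Net'],
    'Currency': ['USD', 'USD', 'USD', 'USD', 'GBP', 'GBP'],
    'Amount':   [3000, 2000, 4000, 3000, 5000, 4000],
    'Location': ['Airport', 'Airport', 'Town', 'Town', 'Town', 'Town'],
})
print(df)
```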
I can manage to pivot by doing the following:
import numpy as np
import pandas as pd
table = pd.pivot_table(df, values=['Amount'], index=['Location', 'Employee'], columns=['Account', 'Currency'], fill_value=0, aggfunc=np.sum, dropna=True)
Output:
                 Amount
Account           Basic         Net
Currency            GBP   USD   GBP   USD
Location Employee
Airport  Test 2       0  3000     0  2000
Town     Test 1       0  4000     0  3000
         Test 3    5000     0  4000     0
How can I get a subtotal for each location, followed by a final grand total at the bottom? Desired output:
                 Amount
Account           Basic         Net
Currency            GBP   USD   GBP   USD
Location Employee
Airport  Test 2       0  3000     0  2000
Airport Total         0  3000     0  2000
Town     Test 1       0  4000     0  3000
         Test 3    5000     0  4000     0
Town Total         5000  4000  4000  3000
Grand Total        5000  7000  4000  5000
I tried following a related example, but it does not give the desired output. Thank you.
Answered by piRSquared
Your pivot table:
table = pd.pivot_table(df, values=['Amount'],
                       index=['Location', 'Employee'],
                       columns=['Account', 'Currency'],
                       fill_value=0, aggfunc=np.sum, dropna=True)
print(table)
                 Amount
Account           Basic         Net
Currency            GBP   USD   GBP   USD
Location Employee
Airport  Test 2       0  3000     0  2000
Town     Test 1       0  4000     0  3000
         Test 3    5000     0  4000     0
pandas.concat
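# DataFrame.append was removed in pandas 2.0, so the total rows are added with pd.concat.
def total_row(d, label):
    row = d.sum().to_frame().T  # one-row frame holding the column sums of d
    row.index = pd.MultiIndex.from_tuples([label], names=d.index.names)
    return row
pd.concat(
    [pd.concat([d, total_row(d, (k, 'Total'))]) for k, d in table.groupby(level=0)]
    + [total_row(table, ('Grand', 'Total'))])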
                 Amount
Account           Basic         Net
Currency            GBP   USD   GBP   USD
Location Employee
Airport  2            0  3000     0  2000
         Total        0  3000     0  2000
Town     1            0  4000     0  3000
         3         5000     0  4000     0
         Total     5000  4000  4000  3000
Grand    Total     5000  7000  4000  5000
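For anyone who wants to run the concat approach end to end, here is a self-contained sketch starting from the raw data. It assumes a pandas version without DataFrame.append (removed in 2.0), so it relies on pd.concat only. The helper names total_row and with_subtotals are made up for this illustration, and aggfunc='sum' is used in place of np.sum to avoid the numpy import.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'Employee': ['Test 2', 'Test 2', 'Test 1', 'Test 1', 'Test 3', 'Test 3'],
    'Account':  ['Basic', 'Net', 'Basic', 'Net', 'Basic', 'Net'],
    'Currency': ['USD', 'USD', 'USD', 'USD', 'GBP', 'GBP'],
    'Amount':   [3000, 2000, 4000, 3000, 5000, 4000],
    'Location': ['Airport', 'Airport', 'Town', 'Town', 'Town', 'Town'],
})

table = pd.pivot_table(df, values=['Amount'],
                       index=['Location', 'Employee'],
                       columns=['Account', 'Currency'],
                       fill_value=0, aggfunc='sum')


def total_row(frame, label):
    """One-row DataFrame with the column sums of `frame`, indexed by `label`."""
    row = frame.sum().to_frame().T
    row.index = pd.MultiIndex.from_tuples([label], names=frame.index.names)
    return row


def with_subtotals(table):
    """Hypothetical helper: a subtotal after each first-level group, then a grand total."""
    pieces = []
    for key, group in table.groupby(level=0):
        pieces += [group, total_row(group, (key, 'Total'))]
    # The grand total is computed from the original table, so subtotals are not double counted.
    pieces.append(total_row(table, ('Grand', 'Total')))
    return pd.concat(pieces)


result = with_subtotals(table)
print(result)
```

Because every piece carries the same index names, the Location and Employee labels should survive the concatenation.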
Old Answer (kept for posterity)
Build the subtotals:
tab_tots = table.groupby(level='Location').sum()
tab_tots.index = [tab_tots.index, ['Total'] * len(tab_tots)]
print(tab_tots)
                 Amount
Account           Basic         Net
Currency            GBP   USD   GBP   USD
Location
Airport  Total        0  3000     0  2000
Town     Total     5000  4000  4000  3000
Put it all together:
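# DataFrame.append was removed in pandas 2.0; pd.concat adds the grand total row instead.
pd.concat([
    pd.concat([table, tab_tots]).sort_index(),
    table.sum().rename(('Grand', 'Total')).to_frame().T,
])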
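One thing to keep in mind with this sort-based version: sort_index orders the rows by label, so the 'Total' row lands last within a location only when every employee name sorts before 'Total'. A small sketch of that behaviour, using a made-up employee name 'Zed' that is not in the original data:

```python
import pandas as pd

# 'Zed' is a hypothetical employee name chosen because it sorts after 'Total'.
idx = pd.MultiIndex.from_tuples(
    [('Town', 'Test 1'), ('Town', 'Zed'), ('Town', 'Total')],
    names=['Location', 'Employee'],
)
frame = pd.DataFrame({'Amount': [1, 2, 3]}, index=idx)

# After sorting, 'Total' ends up between 'Test 1' and 'Zed', not at the end.
sorted_frame = frame.sort_index()
print(sorted_frame)
```

With names like 'Test 1' through 'Test 3' this is not a problem, which is why the approach works on the data in the question.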
Answered by lmo
Here is a two-liner that should work. The loc method subsets rows by their index labels. Because the table has a MultiIndex, I pass loc a tuple on the left-hand side to name the row to insert. Using 'Town' on its own, without a tuple, selects every row under that first-level label.
In the second line, I have to exclude the last row of the DataFrame (the 'Town Total' row that was just added) from the sum, and I do this using its shape attribute.
In[1]:
table.loc[('Town Total', ''),:] = table.loc['Town'].sum()
table.loc[('Grand Total', ''),:] = table.iloc[:(table.shape[0]-1), :].sum()
In[2]:
table
Out[2]:
                    Amount
Account              Basic         Net
Currency               GBP   USD   GBP   USD
Location    Employee
Airport     2            0  3000     0  2000
Town        1            0  4000     0  3000
            3         5000     0  4000     0
Town Total            5000  4000  4000  3000
Grand Total           5000  7000  4000  5000
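The difference between passing loc a bare label and passing it a tuple, which this answer relies on, can be checked in isolation. This is a minimal sketch that rebuilds the pivot table from the question's data; the variable names town_rows and one_row are chosen only for illustration, and aggfunc='sum' stands in for np.sum.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'Employee': ['Test 2', 'Test 2', 'Test 1', 'Test 1', 'Test 3', 'Test 3'],
    'Account':  ['Basic', 'Net', 'Basic', 'Net', 'Basic', 'Net'],
    'Currency': ['USD', 'USD', 'USD', 'USD', 'GBP', 'GBP'],
    'Amount':   [3000, 2000, 4000, 3000, 5000, 4000],
    'Location': ['Airport', 'Airport', 'Town', 'Town', 'Town', 'Town'],
})

table = pd.pivot_table(df, values=['Amount'],
                       index=['Location', 'Employee'],
                       columns=['Account', 'Currency'],
                       fill_value=0, aggfunc='sum')

# A bare first-level label selects every row under it (a DataFrame indexed by Employee).
town_rows = table.loc['Town']

# A full tuple selects exactly one row (a Series indexed by the column labels).
one_row = table.loc[('Town', 'Test 1'), :]

print(town_rows.sum())
print(one_row)
```

Summing town_rows gives the values that the answer assigns to the 'Town Total' row.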