Original question: http://stackoverflow.com/questions/46220167/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow.
Add columns to pivot table with pandas
Asked by Franklin Januário
I have the following table:
import pandas as pd
import numpy as np
#simple table
fazenda = [6010,6010,6010,6010]
quadra = [1,1,2,2]
talhao = [1,2,3,4]
arTotal = [32.12,33.13,34.14,35.15]
arCarr = [i/2 for i in arTotal]
arProd = [i/2 for i in arTotal]
varCan = ['RB1','RB2','RB3','RB4']
data = list(zip(fazenda,quadra,talhao,arTotal,arCarr,arProd,varCan))
#Pandas DataFrame
df = pd.DataFrame(data=data,columns=['Fazenda','Quadra','Talhao','ArTotal','ArCarr','ArProd','Variedade'])
#Pivot Table
table = pd.pivot_table(df, values=['ArTotal','ArCarr','ArProd'],index=['Quadra','Talhao'], fill_value=0)
print(table)
which results in:
               ArCarr  ArProd  ArTotal
Quadra Talhao
1      1       16.060  16.060    32.12
       2       16.565  16.565    33.13
2      3       17.070  17.070    34.14
       4       17.575  17.575    35.15
I need two additional steps:
- Add the Subtotal and Grand Total for the 'ArTotal', 'ArCarr' and 'ArProd' fields
- Add the 'Variedade' field to the table
I tried to add the column, but the result was incorrect. I also followed some links about totals and grand totals, but did not get a satisfactory result.
I'm having a hard time understanding pandas, so I'm asking more experienced colleagues for help.
Accepted answer by Zero
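Get the pivot right first. Including Variedade in the index keeps it next to each row, and reset_index(level=-1) then turns it back into a column: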
In [404]: values = ['ArTotal','ArCarr','ArProd']
In [405]: table = pd.pivot_table(df, values=values, index=['Quadra','Talhao','Variedade'],
fill_value=0).reset_index(level=-1)
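To see what this step does, here is a minimal, self-contained sketch using the question's data. The frame is rebuilt from a dict purely for brevity, and the intermediate name `indexed` is introduced only for this illustration. It prints the index levels before and after the reset, plus the resulting columns:

```python
import pandas as pd

# Sample data from the question, rebuilt as a dict for brevity
df = pd.DataFrame({
    'Fazenda': [6010, 6010, 6010, 6010],
    'Quadra': [1, 1, 2, 2],
    'Talhao': [1, 2, 3, 4],
    'ArTotal': [32.12, 33.13, 34.14, 35.15],
    'Variedade': ['RB1', 'RB2', 'RB3', 'RB4'],
})
df['ArCarr'] = df['ArTotal'] / 2
df['ArProd'] = df['ArTotal'] / 2

values = ['ArTotal', 'ArCarr', 'ArProd']

# Variedade is text, so it goes into the index rather than into `values`
indexed = pd.pivot_table(df, values=values,
                         index=['Quadra', 'Talhao', 'Variedade'],
                         fill_value=0)

# level=-1 is the last index level (Variedade); it becomes a regular column
table = indexed.reset_index(level=-1)

print(list(indexed.index.names))
print(list(table.index.names))
print(list(table.columns))
```

After the reset, the index is back to Quadra and Talhao only, and Variedade sits next to the numeric columns.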
Get the grand totals
In [406]: Gt = table[values].sum()
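If only the grand total row is needed, pivot_table can also produce it directly through its margins option. The sketch below is an aside and not part of the answer's approach: it uses aggfunc='sum' so that the margin row is a sum rather than a mean, and it does not produce the per-Quadra subtotals. The name `with_total` is made up for this example.

```python
import pandas as pd

# Sample data from the question, rebuilt as a dict for brevity
df = pd.DataFrame({
    'Quadra': [1, 1, 2, 2],
    'Talhao': [1, 2, 3, 4],
    'ArTotal': [32.12, 33.13, 34.14, 35.15],
})
df['ArCarr'] = df['ArTotal'] / 2
df['ArProd'] = df['ArTotal'] / 2

values = ['ArTotal', 'ArCarr', 'ArProd']

# margins=True appends one extra row holding the aggregate over all rows;
# margins_name sets its label (the default label is 'All')
with_total = pd.pivot_table(df, values=values, index=['Quadra', 'Talhao'],
                            aggfunc='sum', margins=True,
                            margins_name='Grand Total', fill_value=0)

print(with_total)
```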
Get the Quadra-level totals (the level argument of sum was removed in pandas 2.0, where groupby(level=...).sum() takes its place):
In [407]: St = table.sum(level='Quadra')
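On pandas 2.0 and later, sum no longer accepts a level argument, so the call above fails there. A minimal sketch of the equivalent groupby form follows. The table is rebuilt by hand from the values shown in this answer so that the example runs on its own:

```python
import pandas as pd

# Pivoted table with the same shape and values as in this answer
index = pd.MultiIndex.from_tuples([(1, 1), (1, 2), (2, 3), (2, 4)],
                                  names=['Quadra', 'Talhao'])
table = pd.DataFrame({
    'Variedade': ['RB1', 'RB2', 'RB3', 'RB4'],
    'ArCarr': [16.060, 16.565, 17.070, 17.575],
    'ArProd': [16.060, 16.565, 17.070, 17.575],
    'ArTotal': [32.12, 33.13, 34.14, 35.15],
}, index=index)

# Group on the outer index level; numeric_only=True skips the text column
St = table.groupby(level='Quadra').sum(numeric_only=True)

print(St)
```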
Use append to reshape the table (DataFrame.append was removed in pandas 2.0, where pd.concat takes its place):
In [408]: (table.append(
St.assign(Talhao='Total').set_index('Talhao', append=True)
).sort_index()
.append(pd.DataFrame([Gt.values], columns=Gt.index,
index=pd.MultiIndex.from_tuples([('Grand Total', '')],
names=['Quadra', 'Talhao']))
).fillna(''))
Out[408]:
                    ArCarr  ArProd ArTotal Variedade
Quadra      Talhao
1           1       16.060  16.060   32.12       RB1
            2       16.565  16.565   33.13       RB2
            Total   32.625  32.625   65.25
2           3       17.070  17.070   34.14       RB3
            4       17.575  17.575   35.15       RB4
            Total   34.645  34.645   69.29
Grand Total         67.270  67.270  134.54
Details
In [409]: table
Out[409]:
              Variedade  ArCarr  ArProd  ArTotal
Quadra Talhao
1      1            RB1  16.060  16.060    32.12
       2            RB2  16.565  16.565    33.13
2      3            RB3  17.070  17.070    34.14
       4            RB4  17.575  17.575    35.15
In [410]: Gt
Out[410]:
ArTotal 134.54
ArCarr 67.27
ArProd 67.27
dtype: float64
In [411]: St
Out[411]:
        ArCarr  ArProd  ArTotal
Quadra
1       32.625  32.625    65.25
2       34.645  34.645    69.29
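For readers on pandas 2.0 or later, where DataFrame.append and sum(level=...) are gone, the same reshaping can be sketched with pd.concat. This is an adaptation, not the answer's original code: it works on plain columns and uses a stable sort to place each 'Total' row after its group, rather than sort_index on a mixed int/string index. The names `rows`, `grand` and `result` are introduced only for this example, and the column order of the result may differ from Out[408].

```python
import pandas as pd

values = ['ArTotal', 'ArCarr', 'ArProd']

# The pivoted table from Out[409], rebuilt by hand so the example runs alone
index = pd.MultiIndex.from_tuples([(1, 1), (1, 2), (2, 3), (2, 4)],
                                  names=['Quadra', 'Talhao'])
table = pd.DataFrame({
    'Variedade': ['RB1', 'RB2', 'RB3', 'RB4'],
    'ArCarr': [16.060, 16.565, 17.070, 17.575],
    'ArProd': [16.060, 16.565, 17.070, 17.575],
    'ArTotal': [32.12, 33.13, 34.14, 35.15],
}, index=index)

# Subtotals per Quadra and grand totals
St = table.groupby(level='Quadra')[values].sum()
Gt = table[values].sum()

# Stack detail rows and subtotal rows as plain columns, then sort on Quadra
# with a stable sort so each 'Total' row lands right after its own group
rows = pd.concat([table.reset_index(),
                  St.reset_index().assign(Talhao='Total')],
                 ignore_index=True)
rows = rows.sort_values('Quadra', kind='stable')

# One-row frame for the grand total, labelled as in the answer
grand = Gt.to_frame().T.assign(Quadra='Grand Total', Talhao='')

result = (pd.concat([rows, grand], ignore_index=True)
            .set_index(['Quadra', 'Talhao'])
            .fillna(''))

print(result)
```

Sorting on the integer Quadra column before the grand total row is added avoids comparing numbers with strings, which is the main reason for going through plain columns here.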
Answer by Bharath
I think John's solution beats mine, but based on your current output you can't do this with a pivot table alone. Instead, you can take a series of steps: use a list comprehension over the grouped data to attach each group's sums, then add the grand total, i.e.
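cols = ['Fazenda','Variedade','Quadra','Talhao']
# DataFrame.append and positional axis arguments were removed in pandas 2.0,
# so each group's subtotal row is attached with pd.concat instead
ndf = pd.concat([pd.concat([i, i.drop(columns=cols).sum().to_frame().T], ignore_index=True, sort=True)
                 for _, i in df.groupby('Quadra')])
ndf['Talhao'] = ndf['Talhao'].fillna('Total')
ndf['Quadra'] = ndf['Quadra'].ffill()
new = ndf.set_index(['Quadra','Talhao']).drop(columns=['Fazenda'])
gt = df.drop(columns=cols).sum().to_frame().T
gt.index = pd.MultiIndex.from_tuples([('Grand Total', '')], names=['Quadra','Talhao'])
new = pd.concat([new, gt]).fillna('')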
Output:
                    ArCarr  ArProd ArTotal Variedade
Quadra      Talhao
1.0         1.0     16.060  16.060   32.12       RB1
            2.0     16.565  16.565   33.13       RB2
            Total   32.625  32.625   65.25
2.0         3.0     17.070  17.070   34.14       RB3
            4.0     17.575  17.575   35.15       RB4
            Total   34.645  34.645   69.29
Grand Total         67.270  67.270  134.54