Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors (not me). Original question on Stack Overflow: http://stackoverflow.com/questions/46220167/

Add columns to pivot table with pandas

Tags: python, pandas, numpy, pivot-table

Asked by Franklin Januário

I have the following table:

import pandas as pd
import numpy as np

#simple table
fazenda = [6010,6010,6010,6010]
quadra = [1,1,2,2]
talhao = [1,2,3,4]
arTotal = [32.12,33.13,34.14,35.15]
arCarr = [i/2 for i in arTotal]
arProd = [i/2 for i in arTotal]
varCan = ['RB1','RB2','RB3','RB4']
data = list(zip(fazenda,quadra,talhao,arTotal,arCarr,arProd,varCan))

#Pandas DataFrame
df = pd.DataFrame(data=data,columns=['Fazenda','Quadra','Talhao','ArTotal','ArCarr','ArProd','Variedade'])

#Pivot Table
table = pd.pivot_table(df, values=['ArTotal','ArCarr','ArProd'],index=['Quadra','Talhao'], fill_value=0)

print(table)

This produces:

               ArCarr  ArProd  ArTotal
Quadra Talhao                         
1      1       16.060  16.060    32.12
       2       16.565  16.565    33.13
2      3       17.070  17.070    34.14
       4       17.575  17.575    35.15

I need two additional steps:

  1. Add the Subtotal and Grand Total for the 'ArTotal', 'ArCarr' and 'ArProd' fields
  2. Add the 'Variedade' field to the table

[Image: wanted result]

I tried to add the column, but the result was incorrect. I also followed some links about totals and grand totals, but did not get a satisfactory result.

I'm having a hard time understanding pandas, so I am asking more experienced colleagues for help.

Accepted answer by Zero

Get the pivot right first.

In [404]: values = ['ArTotal','ArCarr','ArProd']

In [405]: table = pd.pivot_table(df, values=values, index=['Quadra','Talhao','Variedade'], 
                                 fill_value=0).reset_index(level=-1)

Get the grand totals

In [406]: Gt = table[values].sum()
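As a side note, `pivot_table` can produce the grand-total row by itself through its `margins` option, though as far as I know it does not add per-Quadra subtotals, so the steps that follow are still needed for those. The sketch below is not part of the original answer. It rebuilds the question's sample data, and it assumes `aggfunc='sum'` so that the margin row holds sums rather than the default means. The name `with_total` is only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    'Fazenda': [6010, 6010, 6010, 6010],
    'Quadra': [1, 1, 2, 2],
    'Talhao': [1, 2, 3, 4],
    'ArTotal': [32.12, 33.13, 34.14, 35.15],
    'Variedade': ['RB1', 'RB2', 'RB3', 'RB4'],
})
df['ArCarr'] = df['ArTotal'] / 2
df['ArProd'] = df['ArTotal'] / 2

values = ['ArTotal', 'ArCarr', 'ArProd']

# margins=True appends one overall row; margins_name sets its label.
# aggfunc='sum' is an assumption made here so the margin is a total.
with_total = pd.pivot_table(
    df,
    values=values,
    index=['Quadra', 'Talhao'],
    aggfunc='sum',
    margins=True,
    margins_name='Grand Total',
)

print(with_total)
```

Because each (Quadra, Talhao) pair occurs once in this data, summing instead of averaging leaves the individual rows unchanged.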

Get the Quadra-level totals

In [407]: St = table.sum(level='Quadra')

Using append, reshape the table

In [408]: (table.append(
                 St.assign(Talhao='Total').set_index('Talhao', append=True)
                ).sort_index()
                .append(pd.DataFrame([Gt.values], columns=Gt.index,
                                     index=pd.MultiIndex.from_tuples([('Grand Total', '')],
                                     names=['Quadra', 'Talhao']))
                ).fillna(''))
Out[408]:
                    ArCarr  ArProd  ArTotal Variedade
Quadra      Talhao
1           1       16.060  16.060    32.12       RB1
            2       16.565  16.565    33.13       RB2
            Total   32.625  32.625    65.25
2           3       17.070  17.070    34.14       RB3
            4       17.575  17.575    35.15       RB4
            Total   34.645  34.645    69.29
Grand Total         67.270  67.270   134.54
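The session above relies on `DataFrame.append` and `sum(level=...)`, both of which were removed in pandas 2.0. The following is a minimal sketch of the same idea for newer versions, not the answerer's own code: it uses `pd.concat` and `groupby(level=...)` instead, and it interleaves each subtotal row right after its group so that no sort over a mixed int/str index is needed. The helper `total_row` and the variable names `parts` and `result` are hypothetical.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    'Fazenda': [6010, 6010, 6010, 6010],
    'Quadra': [1, 1, 2, 2],
    'Talhao': [1, 2, 3, 4],
    'ArTotal': [32.12, 33.13, 34.14, 35.15],
    'Variedade': ['RB1', 'RB2', 'RB3', 'RB4'],
})
df['ArCarr'] = df['ArTotal'] / 2
df['ArProd'] = df['ArTotal'] / 2

values = ['ArTotal', 'ArCarr', 'ArProd']

# Same pivot as in the answer: Variedade is pivoted as an index level,
# then moved back to an ordinary column.
table = pd.pivot_table(df, values=values,
                       index=['Quadra', 'Talhao', 'Variedade'],
                       fill_value=0).reset_index(level=-1)


def total_row(frame, label):
    """Return a one-row frame with the sums of `values`, indexed by `label`."""
    sums = frame[values].sum()
    index = pd.MultiIndex.from_tuples([label], names=['Quadra', 'Talhao'])
    return pd.DataFrame([sums.values], columns=sums.index, index=index)


# Each Quadra block is followed by its own 'Total' row.
parts = []
for quadra, block in table.groupby(level='Quadra'):
    parts += [block, total_row(block, (quadra, 'Total'))]

# The grand total goes last.
parts.append(total_row(table, ('Grand Total', '')))

result = pd.concat(parts)
# Total rows have no Variedade; show them as blank, as in the answer.
result['Variedade'] = result['Variedade'].fillna('')

print(result)
```

Filling only the `Variedade` column, rather than calling `fillna('')` on the whole frame, keeps the numeric columns as floats.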

Details

In [409]: table
Out[409]:
              Variedade  ArCarr  ArProd  ArTotal
Quadra Talhao
1      1            RB1  16.060  16.060    32.12
       2            RB2  16.565  16.565    33.13
2      3            RB3  17.070  17.070    34.14
       4            RB4  17.575  17.575    35.15

In [410]: Gt
Out[410]:
ArTotal    134.54
ArCarr      67.27
ArProd      67.27
dtype: float64

In [411]: St
Out[411]:
        ArCarr  ArProd  ArTotal
Quadra
1       32.625  32.625    65.25
2       34.645  34.645    69.29

Answer by Bharath

I think John's solution beats mine, but based on your current output you can't do that with a pivot table alone. Instead, you can use a series of steps: a list comprehension over the grouped data that appends the sums to each group, followed by a final append for the grand total, i.e.

cols = ['Fazenda','Variedade','Quadra','Talhao']
ndf = pd.concat([i.append(i.drop(cols,1).sum(),1) for _,i in df.groupby('Quadra')])

ndf['Talhao'] = ndf[['Talhao']].fillna('Total')
ndf['Quadra'] = ndf['Quadra'].ffill()

new = ndf.set_index(['Quadra','Talhao']).drop(['Fazenda'],1)

new = new.append(pd.DataFrame(df.sum()).T.drop(cols,1).set_index(pd.MultiIndex.from_tuples([('Grand Total', '')]))).fillna('')

Output:

                    ArCarr  ArProd  ArTotal Variedade
Quadra      Talhao                                   
1.0         1.0     16.060  16.060    32.12       RB1
            2.0     16.565  16.565    33.13       RB2
            Total   32.625  32.625    65.25          
2.0         3.0     17.070  17.070    34.14       RB3
            4.0     17.575  17.575    35.15       RB4
            Total   34.645  34.645    69.29          
Grand Total         67.270  67.270   134.54
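The code in this answer uses `DataFrame.append` and the positional axis argument of `drop`, neither of which is available from pandas 2.0 onward. Here is a rough equivalent of the same group-then-append approach for newer versions. It is a sketch under that assumption, not the answerer's code, and the names `keep`, `pieces`, `subtotal` and `grand` are illustrative only.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    'Fazenda': [6010, 6010, 6010, 6010],
    'Quadra': [1, 1, 2, 2],
    'Talhao': [1, 2, 3, 4],
    'ArTotal': [32.12, 33.13, 34.14, 35.15],
    'Variedade': ['RB1', 'RB2', 'RB3', 'RB4'],
})
df['ArCarr'] = df['ArTotal'] / 2
df['ArProd'] = df['ArTotal'] / 2

values = ['ArTotal', 'ArCarr', 'ArProd']
keep = ['Quadra', 'Talhao', 'Variedade'] + values  # Fazenda is dropped

pieces = []
for quadra, group in df.groupby('Quadra'):
    # One-row frame with the sums for this Quadra
    subtotal = group[values].sum().to_frame().T
    subtotal['Quadra'] = quadra
    subtotal['Talhao'] = 'Total'
    pieces += [group[keep], subtotal]

# Grand total over the whole frame
grand = df[values].sum().to_frame().T
grand['Quadra'] = 'Grand Total'
grand['Talhao'] = ''
pieces.append(grand)

new = pd.concat(pieces, ignore_index=True)
new['Variedade'] = new['Variedade'].fillna('')  # blank for total rows
new = new.set_index(['Quadra', 'Talhao'])

print(new)
```

Setting `Quadra` explicitly on each subtotal row avoids the `ffill` step, so the index keeps its integer labels instead of turning into floats such as `1.0`.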