Adding columns to a pivot table with Pandas

Question

Asked by Franklin Januário

I have the following table:

import pandas as pd
import numpy as np

#simple table
fazenda = [6010,6010,6010,6010]
quadra = [1,1,2,2]
talhao = [1,2,3,4]
arTotal = [32.12,33.13,34.14,35.15]
arCarr = [i/2 for i in arTotal]
arProd = [i/2 for i in arTotal]
varCan = ['RB1','RB2','RB3','RB4']
data = list(zip(fazenda,quadra,talhao,arTotal,arCarr,arProd,varCan))

#Pandas DataFrame
df = pd.DataFrame(data=data,columns=['Fazenda','Quadra','Talhao','ArTotal','ArCarr','ArProd','Variedade'])

#Pivot Table
table = pd.pivot_table(df, values=['ArTotal','ArCarr','ArProd'],index=['Quadra','Talhao'], fill_value=0)

print(table)

This produces:

               ArCarr  ArProd  ArTotal
Quadra Talhao                         
1      1       16.060  16.060    32.12
       2       16.565  16.565    33.13
2      3       17.070  17.070    34.14
       4       17.575  17.575    35.15

I need two additional steps:

Add the Subtotal and Grand Total for the 'ArTotal', 'ArCarr' and 'ArProd' fields
Add the 'Variedade' field to the table

I tried to add the column, but the result was incorrect. I also followed some links about subtotals and grand totals, but did not get a satisfactory result.

I'm having a hard time understanding pandas, so I'm asking more experienced colleagues for help.

Answer 1

Accepted answer by Zero

Get the pivot right first.

In [404]: values = ['ArTotal','ArCarr','ArProd']

In [405]: table = pd.pivot_table(df, values=values, index=['Quadra','Talhao','Variedade'], 
                                 fill_value=0).reset_index(level=-1)

Get Grand totals

In [406]: Gt = table[values].sum()

Get Quadra-level totals

In [407]: St = table.sum(level='Quadra')

Use append to reshape the table

In [408]: (table.append(
                 St.assign(Talhao='Total').set_index('Talhao', append=True)
                ).sort_index()
                .append(pd.DataFrame([Gt.values], columns=Gt.index,
                                     index=pd.MultiIndex.from_tuples([('Grand Total', '')],
                                     names=['Quadra', 'Talhao']))
                ).fillna(''))
Out[408]:
                    ArCarr  ArProd  ArTotal Variedade
Quadra      Talhao
1           1       16.060  16.060    32.12       RB1
            2       16.565  16.565    33.13       RB2
            Total   32.625  32.625    65.25
2           3       17.070  17.070    34.14       RB3
            4       17.575  17.575    35.15       RB4
            Total   34.645  34.645    69.29
Grand Total         67.270  67.270   134.54

Details

In [409]: table
Out[409]:
              Variedade  ArCarr  ArProd  ArTotal
Quadra Talhao
1      1            RB1  16.060  16.060    32.12
       2            RB2  16.565  16.565    33.13
2      3            RB3  17.070  17.070    34.14
       4            RB4  17.575  17.575    35.15

In [410]: Gt
Out[410]:
ArTotal    134.54
ArCarr      67.27
ArProd      67.27
dtype: float64

In [411]: St
Out[411]:
        ArCarr  ArProd  ArTotal
Quadra
1       32.625  32.625    65.25
2       34.645  34.645    69.29
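
The steps above rely on `DataFrame.append` and on `sum(level=...)`, both of which were removed in pandas 2.0. The following is a minimal sketch of the same idea for newer versions, using `groupby(level=...)` and `pd.concat` instead. It is an adaptation, not the answer's original code: the helper name `total_row` is hypothetical, and the rows are assembled group by group so that no sort over the mixed integer and string index labels is needed.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'Fazenda': [6010, 6010, 6010, 6010],
    'Quadra': [1, 1, 2, 2],
    'Talhao': [1, 2, 3, 4],
    'ArTotal': [32.12, 33.13, 34.14, 35.15],
    'Variedade': ['RB1', 'RB2', 'RB3', 'RB4'],
})
df['ArCarr'] = df['ArTotal'] / 2
df['ArProd'] = df['ArTotal'] / 2

values = ['ArTotal', 'ArCarr', 'ArProd']
names = ['Quadra', 'Talhao']

# Pivot with 'Variedade' in the index, then move it back to a column.
table = pd.pivot_table(
    df, values=values, index=['Quadra', 'Talhao', 'Variedade'], fill_value=0
).reset_index(level='Variedade')


def total_row(frame, label):
    """Hypothetical helper: one-row frame holding the column sums of `frame`."""
    row = frame[values].sum().to_frame().T
    row.index = pd.MultiIndex.from_tuples([label], names=names)
    return row


# Each Quadra block is followed by its subtotal; the grand total goes last.
pieces = []
for quadra, block in table.groupby(level='Quadra'):
    pieces += [block, total_row(block, (quadra, 'Total'))]
pieces.append(total_row(table, ('Grand Total', '')))

result = pd.concat(pieces)
# Total rows have no 'Variedade'; show them as blank instead of NaN.
result['Variedade'] = result['Variedade'].fillna('')
print(result)
```

Building the list of pieces in display order keeps the subtotal directly under its group without relying on how pandas orders an index that mixes numbers and strings.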

Answer 2

Answer by Bharath

I think John's solution beats mine, but based on your current output you can't do that with a pivot table alone. Instead, you can use a list comprehension over the grouped data and then append the sums, i.e.

cols = ['Fazenda','Variedade','Quadra','Talhao']
ndf = pd.concat([i.append(i.drop(cols,1).sum(),1) for _,i in df.groupby('Quadra')])

ndf['Talhao'] = ndf[['Talhao']].fillna('Total')
ndf['Quadra'] = ndf['Quadra'].ffill()

new = ndf.set_index(['Quadra','Talhao']).drop(['Fazenda'],1)

new = new.append(pd.DataFrame(df.sum()).T.drop(cols,1).set_index(pd.MultiIndex.from_tuples([('Grand Total', '')]))).fillna('')

Output:

                    ArCarr  ArProd  ArTotal Variedade
Quadra      Talhao                                   
1.0         1.0     16.060  16.060    32.12       RB1
            2.0     16.565  16.565    33.13       RB2
            Total   32.625  32.625    65.25          
2.0         3.0     17.070  17.070    34.14       RB3
            4.0     17.575  17.575    35.15       RB4
            Total   34.645  34.645    69.29          
Grand Total         67.270  67.270   134.54
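
The snippet above passes the axis to `drop` positionally and calls `DataFrame.append`, neither of which works on pandas 2.0 and later. Below is a hedged sketch of a pivot-free variant for newer versions. It is not the answer's code: it swaps the per-group append for a `groupby` sum plus a stable sort, and the intermediate names (`detail`, `sub`, `body`, `grand`) are illustrative only. A side effect is that the index labels stay integers rather than turning into `1.0`, `2.0`.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'Fazenda': [6010, 6010, 6010, 6010],
    'Quadra': [1, 1, 2, 2],
    'Talhao': [1, 2, 3, 4],
    'ArTotal': [32.12, 33.13, 34.14, 35.15],
    'Variedade': ['RB1', 'RB2', 'RB3', 'RB4'],
})
df['ArCarr'] = df['ArTotal'] / 2
df['ArProd'] = df['ArTotal'] / 2

values = ['ArTotal', 'ArCarr', 'ArProd']

# Detail rows, without the 'Fazenda' column.
detail = df[['Quadra', 'Talhao', 'Variedade'] + values]

# One subtotal row per Quadra, labelled 'Total' in the Talhao column.
sub = (df.groupby('Quadra', as_index=False)[values].sum()
         .assign(Talhao='Total', Variedade=''))

# Subtotals are appended after the details; a stable sort on 'Quadra'
# keeps that relative order, so each subtotal lands under its group.
body = (pd.concat([detail, sub], ignore_index=True)
          .sort_values('Quadra', kind='stable'))

# Single grand-total row over all detail rows.
grand = (df[values].sum().to_frame().T
           .assign(Quadra='Grand Total', Talhao='', Variedade=''))

new = pd.concat([body, grand], ignore_index=True).set_index(['Quadra', 'Talhao'])
print(new)
```

The sort only ever compares the integer `Quadra` values, because the `'Grand Total'` label is added after sorting.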
