Python Pandas dataframe total row

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/21752399/

Date: 2020-08-18 23:32:21  Source: igfitidea

Pandas dataframe total row

python pandas

Asked by Daniel

I have a dataframe, something like:

     foo  bar  qux
0    a    1    3.14
1    b    3    2.72
2    c    2    1.62
3    d    9    1.41
4    e    3    0.58

and I would like to add a 'total' row to the end of the dataframe:

     foo  bar  qux
0    a    1    3.14
1    b    3    2.72
2    c    2    1.62
3    d    9    1.41
4    e    3    0.58
5    tot  18   9.47

I've tried to use the sum command, but I end up with a Series which, although I can convert it back to a DataFrame, doesn't maintain the data types:

tot_row = pd.DataFrame(df.sum()).T
tot_row['foo'] = 'tot'
tot_row.dtypes:
     foo    object
     bar    object
     qux    object

I would like to maintain the data types from the original data frame as I need to apply other operations to the total row, something like:

baz = 2*tot_row['qux'] + 3*tot_row['bar']

Accepted answer by jmz

Append a totals row with

df.append(df.sum(numeric_only=True), ignore_index=True)

The conversion is necessary only if you have a column of strings or objects.

It's a somewhat fragile solution, though, so I'd recommend sticking to operations on the dataframe, e.g.:

baz = 2*df['qux'].sum() + 3*df['bar'].sum()
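
DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the one-liner above fails on current releases. A minimal sketch of the same idea with pd.concat, assuming the question's example data (the names totals and with_total are illustrative, not from the answer):

```python
import pandas as pd

df = pd.DataFrame(
    {"foo": list("abcde"), "bar": [1, 3, 2, 9, 3], "qux": [3.14, 2.72, 1.62, 1.41, 0.58]}
)

# Sum only the numeric columns; the result is a Series indexed by column name.
totals = df.sum(numeric_only=True)

# Turn the Series into a one-row DataFrame and stack it under the original.
with_total = pd.concat([df, totals.to_frame().T], ignore_index=True)
print(with_total)
```

Because foo is excluded from the sum, the total row holds a missing value in that column.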

Answer by rmschne

Use DataFrame.pivot_table with margins=True:

import pandas as pd
data = [('a',1,3.14),('b',3,2.72),('c',2,1.62),('d',9,1.41),('e',3,.58)]
df = pd.DataFrame(data, columns=('foo', 'bar', 'qux'))

Original df:

  foo  bar   qux
0   a    1  3.14
1   b    3  2.72
2   c    2  1.62
3   d    9  1.41
4   e    3  0.58

Since pivot_table requires some sort of grouping (without the index argument, it raises ValueError: No group keys passed!), and your original index is vacuous, we'll use the foo column:

df.pivot_table(index='foo',
               margins=True,
               margins_name='total',  # defaults to 'All'
               aggfunc='sum')         # string name of the aggregation

Voilà!

       bar   qux
foo             
a        1  3.14
b        3  2.72
c        2  1.62
d        9  1.41
e        3  0.58
total   18  9.47

Answer by frishrash

Alternative way (verified on Pandas 0.18.1):

import numpy as np
total = df.apply(np.sum)
total['foo'] = 'tot'
df.append(pd.DataFrame(total.values, index=total.keys()).T, ignore_index=True)

Result:

   foo   bar   qux
0    a     1  3.14
1    b     3  2.72
2    c     2  1.62
3    d     9  1.41
4    e     3  0.58
5  tot    18  9.47
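
On pandas 2.0 and later, where DataFrame.append is no longer available, a comparable result can be sketched by building the total row column by column, which also keeps the numeric dtypes the question asks about. This assumes the question's example data; the names total_row and result are illustrative:

```python
import pandas as pd

df = pd.DataFrame(
    {"foo": list("abcde"), "bar": [1, 3, 2, 9, 3], "qux": [3.14, 2.72, 1.62, 1.41, 0.58]}
)

# One-row frame: a label for the text column, a sum for each numeric column.
total_row = pd.DataFrame(
    {"foo": ["tot"], "bar": [df["bar"].sum()], "qux": [df["qux"].sum()]}
)

result = pd.concat([df, total_row], ignore_index=True)
print(result.dtypes)

# The total row can be used in further arithmetic, as in the question.
baz = 2 * result.loc[5, "qux"] + 3 * result.loc[5, "bar"]
```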

Answer by ihightower

The following helped me add a column total and a row total to a dataframe.

Assume dft1 is your original dataframe. Now add a column total and a row total with the following steps.

from io import StringIO
import pandas as pd

#create dataframe string
dfstr = StringIO(u"""
a;b;c
1;1;1
2;2;2
3;3;3
4;4;4
5;5;5
""")

#create dataframe dft1 from string
dft1 = pd.read_csv(dfstr, sep=";")

## add a column total to dft1
dft1['Total'] = dft1.sum(axis=1)

## add a row total to dft1 with the following steps

sum_row = dft1.sum(axis=0) #get sum_row first
dft1_sum=pd.DataFrame(data=sum_row).T #change it to a dataframe

dft1_sum=dft1_sum.reindex(columns=dft1.columns) #line up the col index to dft1
dft1_sum.index = ['row_total'] #change row index to row_total

dft1.append(dft1_sum) # append the row to dft1
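
The final append step relies on DataFrame.append, which was removed in pandas 2.0. A shorter sketch of the same steps using pd.concat, assuming the same all-integer data built directly rather than parsed from a string (row_total and result are illustrative names):

```python
import pandas as pd

dft1 = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": [1, 2, 3, 4, 5], "c": [1, 2, 3, 4, 5]})

# Column total: sum across each row.
dft1["Total"] = dft1.sum(axis=1)

# Row total: sum down each column, reshaped into a one-row frame labelled 'row_total'.
row_total = dft1.sum(axis=0).to_frame(name="row_total").T

# Stack the total row under the data; dft1 itself is left unchanged.
result = pd.concat([dft1, row_total])
print(result)
```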

Answer by Matthias Kauer

df.loc["Total"] = df.sum()

works for me and I find it easier to remember. Am I missing something? Probably wasn't possible in earlier versions.

I'd actually like to add the total row only temporarily though. Adding it permanently is good for display but makes it a hassle in further calculations.

I just found:

df.append(df.sum().rename('Total'))

This prints what I want in a Jupyter notebook and appears to leave the df itself untouched.
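
Another way to keep the total row out of later calculations is to add it to a copy that is used only for display. A minimal sketch, assuming the question's example data; display_df is a hypothetical name:

```python
import pandas as pd

df = pd.DataFrame(
    {"foo": list("abcde"), "bar": [1, 3, 2, 9, 3], "qux": [3.14, 2.72, 1.62, 1.41, 0.58]}
)

# Work on a copy so the original frame keeps only the data rows.
display_df = df.copy()

# Only the numeric columns are summed; 'foo' gets a missing value in the new row.
display_df.loc["Total"] = df.sum(numeric_only=True)
print(display_df)
```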

Answer by SammyRod

This is the way that I do it, by transposing and using the assign method in combination with a lambda function. It makes it simple for me.

df.T.assign(GrandTotal = lambda x: x.sum(axis=1)).T
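
Transposing a frame whose columns have different dtypes generally converts the values to a common dtype, so this approach is easiest to reason about on numeric columns. A minimal sketch, assuming the question's example data restricted to bar and qux (the names numeric and result are illustrative):

```python
import pandas as pd

df = pd.DataFrame(
    {"foo": list("abcde"), "bar": [1, 3, 2, 9, 3], "qux": [3.14, 2.72, 1.62, 1.41, 0.58]}
)

# Keep only the numeric columns before transposing.
numeric = df[["bar", "qux"]]

# Transpose, add a 'GrandTotal' column of per-row sums, then transpose back
# so that 'GrandTotal' becomes the last row.
result = numeric.T.assign(GrandTotal=lambda x: x.sum(axis=1)).T
print(result)
```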

Answer by Pedro Moisés Camacho Ureña

Building on jmz's answer:

df.append(df.sum(numeric_only=True), ignore_index=True)

If you want to continue using your current index, you can name the sum Series using .rename() as follows:

df.append(df.sum().rename('Total'))

This will add a row at the bottom of the table.

Answer by Sarah

Building on the answer from Matthias Kauer.

To add a row total:

df.loc["Row_Total"] = df.sum()

To add a column total:

df.loc[:,"Column_Total"] = df.sum(axis=1)

Answer by Poudel

This gives totals for both rows and columns:

import numpy as np
import pandas as pd


df = pd.DataFrame({'a': [10,20],'b':[100,200],'c': ['a','b']})

df.loc['Column_Total']= df.sum(numeric_only=True, axis=0)
df.loc[:,'Row_Total'] = df.sum(numeric_only=True, axis=1)

print(df)


                 a      b    c  Row_Total
0             10.0  100.0    a      110.0
1             20.0  200.0    b      220.0
Column_Total  30.0  300.0  NaN      330.0