Python Pandas DataFrame total row

Question

Asked by Daniel

I have a dataframe, something like:

     foo  bar  qux
0    a    1    3.14
1    b    3    2.72
2    c    2    1.62
3    d    9    1.41
4    e    3    0.58

and I would like to add a 'total' row to the end of the dataframe:

     foo  bar  qux
0    a    1    3.14
1    b    3    2.72
2    c    2    1.62
3    d    9    1.41
4    e    3    0.58
5    tot  18   9.47

I've tried to use the sum command, but I end up with a Series. Although I can convert it back to a DataFrame, the result doesn't keep the original data types:

tot_row = pd.DataFrame(df.sum()).T
tot_row['foo'] = 'tot'
tot_row.dtypes:
     foo    object
     bar    object
     qux    object

I would like to maintain the data types from the original data frame as I need to apply other operations to the total row, something like:

baz = 2*tot_row['qux'] + 3*tot_row['bar']

Answer 1

Accepted answer by jmz

Append a totals row with

df.append(df.sum(numeric_only=True), ignore_index=True)

The numeric_only=True restriction is necessary only if you have a column of strings or objects.

It's a somewhat fragile solution, though, so I'd recommend sticking to operations on the dataframe itself, e.g.:

baz = 2*df['qux'].sum() + 3*df['bar'].sum()
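
DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the one-liner above fails on current releases. A minimal sketch of the same idea with pd.concat follows; the sample frame is rebuilt from the question, and the names totals and with_total are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [("a", 1, 3.14), ("b", 3, 2.72), ("c", 2, 1.62), ("d", 9, 1.41), ("e", 3, 0.58)],
    columns=["foo", "bar", "qux"],
)

# Sum only the numeric columns, then turn the resulting Series into a one-row frame.
totals = df.sum(numeric_only=True).to_frame().T

# pd.concat stands in for the removed DataFrame.append.
with_total = pd.concat([df, totals], ignore_index=True)
print(with_total)
```

As with the original answer, the total row has no value for foo (it shows up as NaN), and bar will most likely be upcast to float, which is part of why this approach is fragile.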

Answer 2

Answered by rmschne

Use DataFrame.pivot_table with margins=True:

import pandas as pd
data = [('a',1,3.14),('b',3,2.72),('c',2,1.62),('d',9,1.41),('e',3,.58)]
df = pd.DataFrame(data, columns=('foo', 'bar', 'qux'))

Original df:

  foo  bar   qux
0   a    1  3.14
1   b    3  2.72
2   c    2  1.62
3   d    9  1.41
4   e    3  0.58

Since pivot_table requires some sort of grouping (without the index argument, it raises ValueError: No group keys passed!), and your original index carries no information, we'll group on the foo column:

df.pivot_table(index='foo',
               margins=True,
               margins_name='total',  # defaults to 'All'
               aggfunc=sum)

Voilà!

       bar   qux
foo             
a        1  3.14
b        3  2.72
c        2  1.62
d        9  1.41
e        3  0.58
total   18  9.47

Answer 3

Answered by frishrash

Alternative way (verified on Pandas 0.18.1):

import numpy as np
total = df.apply(np.sum)
total['foo'] = 'tot'
df.append(pd.DataFrame(total.values, index=total.keys()).T, ignore_index=True)

Result:

   foo   bar   qux
0    a     1  3.14
1    b     3  2.72
2    c     2  1.62
3    d     9  1.41
4    e     3  0.58
5  tot    18  9.47
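
The question asks for the total row to keep the original data types, which building a row from total.values does not guarantee (an object array is likely to make every column object). One possible way to get there, sketched below under the assumption of a recent pandas where pd.concat replaces the removed DataFrame.append, is to cast the one-row frame back to the original dtypes before concatenating. The names row, tot_row and result are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    [("a", 1, 3.14), ("b", 3, 2.72), ("c", 2, 1.62), ("d", 9, 1.41), ("e", 3, 0.58)],
    columns=["foo", "bar", "qux"],
)

# Build the total row as a dict: numeric sums plus a label for the string column.
row = df.sum(numeric_only=True).to_dict()
row["foo"] = "tot"

# Cast the one-row frame back to the original dtypes before concatenating,
# so that 'bar' stays an integer column and 'qux' stays float.
tot_row = pd.DataFrame([row], columns=df.columns).astype(df.dtypes.to_dict())
result = pd.concat([df, tot_row], ignore_index=True)

# The follow-up arithmetic from the question now runs on numeric columns.
baz = 2 * tot_row["qux"] + 3 * tot_row["bar"]
print(result.dtypes)
print(baz)
```

The cast works here because the integer sum is a whole number; a column whose total cannot be represented in its original dtype would need separate handling.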

Answer 4

Answered by ihightower

The following helped me add a column total and a row total to a dataframe.

Assume dft1 is your original dataframe. Now add a column total and a row total with the following steps.

from io import StringIO
import pandas as pd

#create dataframe string
dfstr = StringIO(u"""
a;b;c
1;1;1
2;2;2
3;3;3
4;4;4
5;5;5
""")

#create dataframe dft1 from string
dft1 = pd.read_csv(dfstr, sep=";")

## add a column total to dft1
dft1['Total'] = dft1.sum(axis=1)

## add a row total to dft1 with the following steps

sum_row = dft1.sum(axis=0) #get sum_row first
dft1_sum=pd.DataFrame(data=sum_row).T #change it to a dataframe

dft1_sum=dft1_sum.reindex(columns=dft1.columns) #line up the col index to dft1
dft1_sum.index = ['row_total'] #change row index to row_total

dft1.append(dft1_sum) # append the row to dft1

Answer 5

Answered by Matthias Kauer

df.loc["Total"] = df.sum()

works for me and I find it easier to remember. Am I missing something? Probably wasn't possible in earlier versions.

I'd actually like to add the total row only temporarily, though. Adding it permanently is good for display but makes further calculations a hassle.

I just found:

df.append(df.sum().rename('Total'))

This prints what I want in a Jupyter notebook and appears to leave the df itself untouched.
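
Since DataFrame.append no longer exists in pandas 2.0 and later, the same temporary, non-mutating total can be sketched with pd.concat. This is only an approximation of the line above: the sample frame comes from the question, numeric_only=True is an added assumption to skip the string column, and display_df is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame(
    [("a", 1, 3.14), ("b", 3, 2.72), ("c", 2, 1.62), ("d", 9, 1.41), ("e", 3, 0.58)],
    columns=["foo", "bar", "qux"],
)

# Name the summed Series 'Total' so that it becomes the row label after transposing.
total = df.sum(numeric_only=True).rename("Total").to_frame().T

# concat returns a new frame; df itself is left unchanged.
display_df = pd.concat([df, total])
print(display_df)
```

Because the result is a separate object, later calculations can keep using df without having to filter out the total row.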

Answer 6

Answered by SammyRod

This is the way that I do it, by transposing and using the assign method in combination with a lambda function. It makes it simple for me.

df.T.assign(GrandTotal = lambda x: x.sum(axis=1)).T

Answer 7

Answered by Pedro Moisés Camacho Ureña

Building on jmz's answer:

df.append(df.sum(numeric_only=True), ignore_index=True)

If you want to keep using your current index, you can name the sum Series with .rename() as follows:

df.append(df.sum().rename('Total'))

This will add a row at the bottom of the table.

Answer 8

Answered by Sarah

Building on the answer from Matthias Kauer.

To add a row total:

df.loc["Row_Total"] = df.sum()

To add a column total:

df.loc[:,"Column_Total"] = df.sum(axis=1)
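
To see how the two assignments interact, here is a small sketch on a hypothetical all-numeric frame (columns a and b are made up for illustration). Because the total row is added first, the column total is also computed for that row, which yields the grand total in the bottom-right corner.

```python
import pandas as pd

# Hypothetical all-numeric frame, chosen so that both sums are well defined.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# New row: the sum of each column.
df.loc["Row_Total"] = df.sum()

# New column: the sum across each row, including the row added above.
df.loc[:, "Column_Total"] = df.sum(axis=1)

print(df)
```

With a string column in the frame, passing numeric_only=True to both sum calls is probably the safer choice.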

Answer 9

Answered by Poudel

This gives totals for both rows and columns:

import numpy as np
import pandas as pd


df = pd.DataFrame({'a': [10,20],'b':[100,200],'c': ['a','b']})

df.loc['Column_Total']= df.sum(numeric_only=True, axis=0)
df.loc[:,'Row_Total'] = df.sum(numeric_only=True, axis=1)

print(df)


                 a      b    c  Row_Total
0             10.0  100.0    a      110.0
1             20.0  200.0    b      220.0
Column_Total  30.0  300.0  NaN      330.0
