Python Pandas: Sum DataFrame rows for given columns

Question

Asked by Colonel Beauvel

I have the following DataFrame:

In [1]:

import pandas as pd
df = pd.DataFrame({'a': [1,2,3], 'b': [2,3,4], 'c':['dd','ee','ff'], 'd':[5,9,1]})
df
Out [1]:
   a  b   c  d
0  1  2  dd  5
1  2  3  ee  9
2  3  4  ff  1

I would like to add a column 'e' which is the sum of columns 'a', 'b' and 'd'.

Going across forums, I thought something like this would work:

df['e'] = df[['a','b','d']].map(sum)

But it didn't.

I would like to know the appropriate operation that takes the list of columns ['a','b','d'] and df as inputs.

Answer 1

Accepted answer by EdChum

You can just call sum with the parameter axis=1 to sum across each row. Older pandas versions silently skipped non-numeric columns; since pandas 2.0 you need to pass numeric_only=True to get that behaviour:

In [91]:

df = pd.DataFrame({'a': [1,2,3], 'b': [2,3,4], 'c':['dd','ee','ff'], 'd':[5,9,1]})
df['e'] = df.sum(axis=1, numeric_only=True)
df
Out[91]:
   a  b   c  d   e
0  1  2  dd  5   8
1  2  3  ee  9  14
2  3  4  ff  1   8

If you want to just sum specific columns then you can create a list of the columns and remove the ones you are not interested in:

In [98]:

col_list = list(df)
col_list.remove('d')
col_list
Out[98]:
['a', 'b', 'c']
In [99]:

df['e'] = df[col_list].sum(axis=1, numeric_only=True)
df
Out[99]:
   a  b   c  d  e
0  1  2  dd  5  3
1  2  3  ee  9  5
2  3  4  ff  1  7
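
If the goal is the sum of 'a', 'b' and 'd' as in the question, an alternative to removing names from the full column list is to select the wanted columns directly. A minimal sketch reusing the question's data (the variable name cols is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 3, 4],
                   'c': ['dd', 'ee', 'ff'], 'd': [5, 9, 1]})

cols = ['a', 'b', 'd']            # the columns to add up
df['e'] = df[cols].sum(axis=1)    # row-wise sum over only those columns
print(df)
```

Because only numeric columns are selected, this does not depend on how the installed pandas version treats the string column 'c'.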

Answer 2

Answered by Alex Riley

If you have just a few columns to sum, you can write:

df['e'] = df['a'] + df['b'] + df['d']

This creates a new column e with the values:

   a  b   c  d   e
0  1  2  dd  5   8
1  2  3  ee  9  14
2  3  4  ff  1   8

For longer lists of columns, EdChum's answer is preferred.

Answer 3

Answered by smartse

This is a simpler way using iloc to select which columns to sum:

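df['f'] = df.iloc[:, 0:2].sum(axis=1)     # positions 0 and 1 (a + b)
df['g'] = df.iloc[:, [0, 1]].sum(axis=1)  # the same columns, given as a list
df['h'] = df.iloc[:, [0, 3]].sum(axis=1)  # positions 0 and 3 (a + d)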

Produces:

   a  b   c  d   e  f  g   h
0  1  2  dd  5   8  3  3   6
1  2  3  ee  9  14  5  5  11
2  3  4  ff  1   8  7  7   4

I can't find a way to combine a range with specific columns. Neither of the following attempts works, for example:

df['i']=df.iloc[:,[[0:2],3]].sum(axis=1)
df['i']=df.iloc[:,[0:2,3]].sum(axis=1)
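
One possible way to combine a range with individual positions, not part of the original answer, is NumPy's np.r_ helper, which concatenates slices and scalars into a single index array. A minimal sketch on the question's data (the name positions is illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 3, 4],
                   'c': ['dd', 'ee', 'ff'], 'd': [5, 9, 1]})

# np.r_ expands the slice 0:2 and appends the single position 3
positions = np.r_[0:2, 3]
df['i'] = df.iloc[:, positions].sum(axis=1)   # a + b + d
print(df)
```

Here positions selects columns a, b and d by position, so the string column c is never part of the sum.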

Answer 4

Answered by Bibin Johny

Create a list of column names you want to add up.

df['total'] = df.loc[:, list_name].sum(axis=1)

If you want the sum for only certain rows, replace the ':' row selector in loc with the row labels or a label slice.
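
As a sketch of both cases, assuming the question's DataFrame and taking list_name to be the list of columns to add up (the name partial is illustrative):

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 3, 4],
                   'c': ['dd', 'ee', 'ff'], 'd': [5, 9, 1]})

list_name = ['a', 'b', 'd']

# ':' selects every row, so each row gets a total
df['total'] = df.loc[:, list_name].sum(axis=1)

# Restrict the sum to the rows labelled 0 through 1
# (label slices in loc include both end points)
partial = df.loc[0:1, list_name].sum(axis=1)
print(partial)
```

The result for the restricted case is a Series indexed by the selected row labels only.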

Answer 5

Answered by Cybernetic

You can simply pass your dataframe into the following function:

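def sum_frame_by_column(frame, new_col_name, list_of_cols_to_sum):
    # Cast the selected columns to float, sum across each row,
    # and store the result in a new column of the same frame
    frame[new_col_name] = frame[list_of_cols_to_sum].astype(float).sum(axis=1)
    return frame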

Example:

I have a dataframe (awards_frame) with the columns award_1, award_2 and award_3 (the original illustration is not available), and I want to create a new column that shows the sum of awards for each row.

Usage:

I simply pass my awards_frame into the function, also specifying the name of the new column and a list of the column names that are to be summed:

sum_frame_by_column(awards_frame, 'award_sum', ['award_1','award_2','award_3'])

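Result: awards_frame gains a new award_sum column holding the row-wise total as floats (because of the astype(float) cast).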
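
Since the original illustration is missing, here is a small runnable sketch of the call. The data below is made up purely for illustration and is not the original awards_frame; only the column names award_1, award_2 and award_3 come from the answer.

```python
import pandas as pd

def sum_frame_by_column(frame, new_col_name, list_of_cols_to_sum):
    # Same helper as in the answer: row-wise sum of the given columns as floats
    frame[new_col_name] = frame[list_of_cols_to_sum].astype(float).sum(axis=1)
    return frame

# Hypothetical data, invented for this example
awards_frame = pd.DataFrame({
    'name': ['x', 'y'],
    'award_1': [1, 0],
    'award_2': [2, 3],
    'award_3': [0, 4],
})

result = sum_frame_by_column(awards_frame, 'award_sum',
                             ['award_1', 'award_2', 'award_3'])
print(result)
```

Note that the function modifies the frame in place and returns that same object, so result and awards_frame refer to one DataFrame.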

Answer 6

Answered by makarand kulkarni

The following syntax helped me when the columns to sum are next to each other:

awards_frame.values[:, 1:4].sum(axis=1)
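
The expression above works on the underlying NumPy array, so it returns a plain array without the DataFrame's index. A sketch comparing it with the iloc form, which returns a Series that keeps the index (the data is made up for illustration, and the names as_array and as_series are hypothetical):

```python
import pandas as pd

# Hypothetical data, invented for this example
awards_frame = pd.DataFrame({
    'name': ['x', 'y'],
    'award_1': [1, 0],
    'award_2': [2, 3],
    'award_3': [0, 4],
})

# NumPy route: positions 1 to 3, result is an ndarray
as_array = awards_frame.values[:, 1:4].sum(axis=1)

# pandas route: same positions, result is a Series with the row index
as_series = awards_frame.iloc[:, 1:4].sum(axis=1)

print(as_array)
print(as_series)
```

Either result can be assigned to a new column; the Series form is the safer choice when the frame has a non-default index, because assignment then aligns on the index labels.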
