Python Pandas: sum multiple columns into one column, excluding the last column

Disclaimer: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Stack Overflow original: http://stackoverflow.com/questions/42063716/

Time: 2020-08-19 21:11:16  Source: igfitidea

Pandas: sum up multiple columns into one column without last column

Tags: python, pandas, sum

Asked by Tuutsrednas

If I have a dataframe similar to this one

Apples   Bananas   Grapes   Kiwis
2        3         nan      1
1        3         7        nan
nan      nan       2        3

I would like to add a column like this

Apples   Bananas   Grapes   Kiwis   Fruit Total
2        3         nan      1        6
1        3         7        nan      11
nan      nan       2        3        5

I guess you could use df['Apples'] + df['Bananas'] and so on, but my actual dataframe is much larger than this. I was hoping a formula like df['Fruit Total'] = df[-4:-1].sum could do the trick in one line of code. That didn't work, however. Is there any way to do it without explicitly summing up all columns?
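
For context on the attempt above, here is a minimal sketch of what the two kinds of slicing select. It assumes the question's frame is rebuilt with numpy NaNs; `row_slice` and `col_slice` are illustrative names, not from the question.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question; NaN marks the missing values.
df = pd.DataFrame({
    "Apples": [2, 1, np.nan],
    "Bananas": [3, 3, np.nan],
    "Grapes": [np.nan, 7, 2],
    "Kiwis": [1, np.nan, 3],
})

# Plain [] with a slice selects rows, so this is rows 0 and 1 with all columns.
row_slice = df[-4:-1]
print(row_slice.shape)  # (2, 4)

# .iloc[:, a:b] slices columns by position instead.
col_slice = df.iloc[:, -4:-1]
print(list(col_slice.columns))  # ['Apples', 'Bananas', 'Grapes']
```

Note also that `.sum` without parentheses refers to the method itself and does not call it.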

Answered by jezrael

You can first select the columns with iloc and then sum:

df['Fruit Total'] = df.iloc[:, -4:-1].sum(axis=1)
print(df)
   Apples  Bananas  Grapes  Kiwis  Fruit Total
0     2.0      3.0     NaN    1.0          5.0
1     1.0      3.0     7.0    NaN         11.0
2     NaN      NaN     2.0    3.0          2.0

To sum all columns, use:

df['Fruit Total'] = df.sum(axis=1)
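
A self-contained sketch of the two calls above, assuming the question's frame is rebuilt with numpy NaNs. The names `without_last` and `all_columns` are illustrative and not part of the answer.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question; NaN marks the missing values.
df = pd.DataFrame({
    "Apples": [2, 1, np.nan],
    "Bananas": [3, 3, np.nan],
    "Grapes": [np.nan, 7, 2],
    "Kiwis": [1, np.nan, 3],
})

# Positional slice -4:-1 picks Apples, Bananas, Grapes and skips Kiwis.
without_last = df.iloc[:, -4:-1].sum(axis=1)

# Summing every column includes Kiwis; NaN is skipped by default (skipna=True).
all_columns = df.sum(axis=1)

print(without_last.tolist())  # [5.0, 11.0, 2.0]
print(all_columns.tolist())   # [6.0, 11.0, 5.0]
```

The results are kept in separate Series here so that neither sum is affected by a 'Fruit Total' column already added to the frame.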

Answered by Francisco Dura

Using df['Fruit Total'] = df.iloc[:, -4:-1].sum(axis=1) on your original df won't include the last column ('Kiwis'). Use df.iloc[:, -4:] instead to select all columns:

print(df)
   Apples  Bananas  Grapes  Kiwis
0     2.0      3.0     NaN    1.0
1     1.0      3.0     7.0    NaN
2     NaN      NaN     2.0    3.0

df['Fruit Total'] = df.iloc[:, -4:].sum(axis=1)

print(df)
   Apples  Bananas  Grapes  Kiwis  Fruit Total
0     2.0      3.0     NaN    1.0          6.0
1     1.0      3.0     7.0    NaN         11.0
2     NaN      NaN     2.0    3.0          5.0
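
One thing to watch with negative positions: once 'Fruit Total' exists it becomes the last column, so re-running the same positional slice would select different columns. A minimal sketch, assuming the frame from the question, that captures the column names first (`fruit_cols` is a hypothetical name):

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question; NaN marks the missing values.
df = pd.DataFrame({
    "Apples": [2, 1, np.nan],
    "Bananas": [3, 3, np.nan],
    "Grapes": [np.nan, 7, 2],
    "Kiwis": [1, np.nan, 3],
})

# Capture the fruit columns by name before adding the total.
fruit_cols = list(df.columns[-4:])

df["Fruit Total"] = df[fruit_cols].sum(axis=1)

# Running the assignment again gives the same result, because the
# selection no longer depends on column positions.
df["Fruit Total"] = df[fruit_cols].sum(axis=1)

print(df["Fruit Total"].tolist())  # [6.0, 11.0, 5.0]
```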

Answered by Ramon

It is possible to do it without knowing the number of columns and even without iloc:

print(df)
   Apples  Bananas  Grapes  Kiwis
0     2.0      3.0     NaN    1.0
1     1.0      3.0     7.0    NaN
2     NaN      NaN     2.0    3.0

cols_to_sum = df.columns[: df.shape[1] - 1]

df['Fruit Total'] = df[cols_to_sum].sum(axis=1)

print(df)
   Apples  Bananas  Grapes  Kiwis  Fruit Total
0     2.0      3.0     NaN    1.0          5.0
1     1.0      3.0     7.0    NaN         11.0
2     NaN      NaN     2.0    3.0          2.0
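
The same selection can be written more compactly. A sketch, assuming the question's frame; `df.columns[:-1]` and `df.drop(columns=...)` are swapped in here as alternatives and are not part of the answer above, and `total_by_name` / `total_by_position` are illustrative names.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question; NaN marks the missing values.
df = pd.DataFrame({
    "Apples": [2, 1, np.nan],
    "Bananas": [3, 3, np.nan],
    "Grapes": [np.nan, 7, 2],
    "Kiwis": [1, np.nan, 3],
})

# Selection from the answer: every column label except the last.
cols_to_sum = df.columns[: df.shape[1] - 1]

# Shorthand for the same labels, using a negative slice on the column index.
shorthand = df.columns[:-1]
print(list(cols_to_sum) == list(shorthand))  # True

# Alternative: drop the excluded column by name, then sum what is left.
total_by_name = df.drop(columns="Kiwis").sum(axis=1)
total_by_position = df[cols_to_sum].sum(axis=1)

print(total_by_name.equals(total_by_position))  # True
```

Dropping by name keeps working if the column order changes, while the positional forms always exclude whichever column happens to be last.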