Pandas: sum multiple columns and get the results in multiple columns

Question

Asked by Akio Omi

I have a "sample.txt" like this.

idx A   B   C   D   cat
J   1   2   3   1   x
K   4   5   6   2   x
L   7   8   9   3   y
M   1   2   3   4   y
N   4   5   6   5   z
O   7   8   9   6   z

With this dataset, I want to compute sums both down the rows and across the columns. Summing down the rows is not a big deal; I got the result below.

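### MY CODE ###
import pandas as pd

# Read the tab-separated file and use the 'idx' column as the row index
df = pd.read_csv('sample.txt', sep="\t", index_col='idx')
df.info()

# Sum columns A-D within each category
df2 = df.groupby('cat').sum()
print(df2)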

The result is like this.

      A   B   C   D
cat                
x     5   7   9   3
y     8  10  12   7
z    11  13  15  11

But I don't know how to write code that also adds the values across columns, that is, column A plus column B, and column C plus column D.

Could anybody help me write this code?

By the way, I would rather not do it like this. (It looks too clumsy, but if it is the only way, I'll accept it.)

df2 = df['A'] + df['B']
df3 = df['C'] + df['D']
df = pd.DataFrame([df2,df3],index=['AB','CD']).transpose()
print( df )

Answer 1

Answered by piRSquared

When you pass a dictionary or callable to groupby, it gets applied to the labels of an axis. I specified axis 1, which is the columns.

d = dict(A='AB', B='AB', C='CD', D='CD')
df.groupby(d, axis=1).sum()
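
Note that pandas 2.1 and later emit a deprecation warning for axis=1 in groupby. A minimal sketch of one way to get the same grouping without it is to transpose, group the rows, and transpose back. The sample data is rebuilt inline here instead of being read from sample.txt, and the name `grouped` is only illustrative; neither is part of the original answer.

```python
import pandas as pd

# Assumption: the question's sample data, rebuilt inline instead of read from sample.txt
df = pd.DataFrame(
    {
        "A": [1, 4, 7, 1, 4, 7],
        "B": [2, 5, 8, 2, 5, 8],
        "C": [3, 6, 9, 3, 6, 9],
        "D": [1, 2, 3, 4, 5, 6],
        "cat": ["x", "x", "y", "y", "z", "z"],
    },
    index=pd.Index(["J", "K", "L", "M", "N", "O"], name="idx"),
)

# Same column-to-group mapping as in the answer
d = dict(A="AB", B="AB", C="CD", D="CD")

# Keep only the mapped columns, transpose so they become row labels,
# group those labels with the mapping, sum, then transpose back
grouped = df[list(d)].T.groupby(d).sum().T
print(grouped)
```

Selecting `df[list(d)]` first leaves out the non-numeric 'cat' column before the transpose, so the summed values stay integers.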

Answer 2

Answered by jezrael

Use concat with sum:

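# Skip this line if 'idx' is already the index (e.g. read with index_col='idx')
df = df.set_index('idx')
# Sum each pair of columns across axis 1, then join the two results side by side
df = pd.concat([df[['A', 'B']].sum(axis=1), df[['C', 'D']].sum(axis=1)], axis=1, keys=['AB','CD'])
print(df)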
     AB  CD
idx        
J     3   4
K     9   8
L    15  12
M     3   7
N     9  11
O    15  15
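
The expected output is not shown in the question, so it is unclear whether the pairwise sums are wanted per row (as above) or per category. If per category is what is needed, one possible sketch is to group the concat result by the 'cat' column. The inline sample data and the names `pairs` and `per_cat` are assumptions made for illustration.

```python
import pandas as pd

# Assumption: the question's sample data, rebuilt inline instead of read from sample.txt
df = pd.DataFrame(
    {
        "A": [1, 4, 7, 1, 4, 7],
        "B": [2, 5, 8, 2, 5, 8],
        "C": [3, 6, 9, 3, 6, 9],
        "D": [1, 2, 3, 4, 5, 6],
        "cat": ["x", "x", "y", "y", "z", "z"],
    },
    index=pd.Index(["J", "K", "L", "M", "N", "O"], name="idx"),
)

# Per-row sums of A+B and C+D, as in the answer above
pairs = pd.concat(
    [df[["A", "B"]].sum(axis=1), df[["C", "D"]].sum(axis=1)],
    axis=1,
    keys=["AB", "CD"],
)

# Group the per-row sums by category; the 'cat' Series is aligned on the shared index
per_cat = pairs.groupby(df["cat"]).sum()
print(per_cat)
```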

Answer 3

Answered by Alex S

Does this do what you need? By using axis=1 with DataFrame.apply, you can use the data that you want in a row to construct a new column. Then you can drop the columns that you don't want anymore.

In [1]: import pandas as pd
In [5]: df = pd.DataFrame(columns=['A', 'B', 'C', 'D'], data=[[1, 2, 3, 4], [1, 2, 3, 4]])

In [6]: df
Out[6]:
   A  B  C  D
0  1  2  3  4
1  1  2  3  4

In [7]: df['CD'] = df.apply(lambda x: x['C'] + x['D'], axis=1)

In [8]: df
Out[8]:
   A  B  C  D  CD
0  1  2  3  4   7
1  1  2  3  4   7

In [13]: df.drop(['C', 'D'], axis=1)
Out[13]:
   A  B  CD
0  1  2   7
1  1  2   7
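
The same steps can also be sketched without apply, since adding two columns directly is vectorized and avoids calling a Python function once per row. This is an alternative to the session above rather than part of the original answer, and the name `result` is illustrative.

```python
import pandas as pd

# Same toy frame as in the session above
df = pd.DataFrame(columns=["A", "B", "C", "D"], data=[[1, 2, 3, 4], [1, 2, 3, 4]])

# Add the new column with plain column arithmetic, then drop the source columns
result = df.assign(CD=df["C"] + df["D"]).drop(columns=["C", "D"])
print(result)
```

Both assign and drop return new DataFrames, so `df` itself is left unchanged.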
