Group by the value of the column sums with Pandas

Question

Asked by mazieres

I got lost in the Pandas docs and features while trying to figure out a way to group a DataFrame by the values of the sums of its columns.

For instance, let's say I have the following data:

In [1]: import pandas as pd

In [2]: dat = {'a':[1,0,0], 'b':[0,1,0], 'c':[1,0,0], 'd':[2,3,4]}

In [3]: df = pd.DataFrame(dat)

In [4]: df
Out[4]: 
   a  b  c  d
0  1  0  1  2
1  0  1  0  3
2  0  0  0  4

I would like columns a, b and c to be grouped, since each of them sums to 1. The resulting DataFrame would have one column per distinct sum, labelled with that sum: here, a column 1 combining a, b and c, and a column 9 for d.

Any ideas to point me in the right direction? Thanks in advance!

Answer 1

Answered by TomAugspurger

Here you go:

In [57]: df.groupby(df.sum(), axis=1).sum()
Out[57]: 
   1  9
0  2  2
1  1  3
2  0  4

[3 rows x 2 columns]

df.sum() is your grouper. It sums over axis 0 (the index), giving you the two groups: 1 (columns a, b, and c) and 9 (column d). You want to group the columns (axis=1) and take the sum of each group.
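
To see what the grouper contains, it can help to inspect df.sum() on its own and list which columns land in each group. This is a minimal sketch using the sample data from the question; the names col_sums and members are illustrative and not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 0, 0], 'b': [0, 1, 0], 'c': [1, 0, 0], 'd': [2, 3, 4]})

# Column sums: a Series indexed by column name (a=1, b=1, c=1, d=9).
col_sums = df.sum()

# Group the column names by their sum to see which columns share a group.
members = df.columns.to_series().groupby(col_sums).apply(list)

print(col_sums.to_dict())
print(members.to_dict())
```

Each key of members is a distinct column sum, and the value is the list of columns that will be added together under that label.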

Answer 2

Answered by LondonRob

Because pandas is designed with database concepts in mind, it really expects related information to be stored together in rows, not in columns. Because of this, it's usually more elegant to do things row-wise. Here's how to solve your problem row-wise:

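import pandas as pd

dat = {'a':[1,0,0], 'b':[0,1,0], 'c':[1,0,0], 'd':[2,3,4]}
df = pd.DataFrame(dat)

# Transpose so that each original column becomes a row
df = df.transpose()
# Row totals of the transposed frame are the column sums of the original
df['totals'] = df.sum(axis=1)

# Group rows by their total, sum each group, then transpose back
print(df.groupby('totals').sum().transpose())
#totals  1  9
#0       2  2
#1       1  3
#2       0  4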
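
The transpose idea also combines with the df.sum() grouper from the first answer, which avoids both the helper totals column and the axis=1 argument (recent pandas releases deprecate the axis argument of groupby). A minimal sketch under those assumptions, reusing the same sample data; the variable names are illustrative:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 0, 0], 'b': [0, 1, 0], 'c': [1, 0, 0], 'd': [2, 3, 4]})

# Column sums of the original frame; indexed by column name.
col_sums = df.sum()

# Transpose so columns become rows, group those rows by the column sums
# (aligned on the index a, b, c, d), sum each group, and transpose back.
result = df.T.groupby(col_sums).sum().T

print(result)
# columns are 1 and 9; the rows are [2, 2], [1, 3], [0, 4]
```

Because the grouper is passed as a Series rather than stored in the frame, the transposed data is never modified.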
