Original question: http://stackoverflow.com/questions/21296945/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
How to count all positive and negative values in a pandas groupby?
Asked by Stanpol
Let's assume we have a table:
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
                   'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
                   'C': np.random.randn(8), 'D': np.random.randn(8)})
Output:
     A      B         C         D
0  foo    one -1.304026  0.237045
1  bar    one  0.030488 -0.672931
2  foo    two  0.530976 -0.669559
3  bar  three -0.004624 -1.604039
4  foo    two -0.247809 -1.571291
5  bar    two -0.570580  1.454514
6  foo    one  1.441081  0.096880
7  foo  three  0.296377  1.575791
I want to count how many positive and negative numbers in column C belong to each group in column A, and in what proportion. There are many more groups in A than foo and bar, so group names shouldn't appear in the code.
I tried to group by A and then filter, but couldn't find the right way. I also tried to aggregate with a clever lambda, but didn't succeed.
Answered by Andy Hayden
You could do this as a one-line apply (the first column counts negative values, the second counts non-negative ones):
In [11]: df.groupby('A').C.apply(lambda x: pd.Series([(x < 0).sum(), (x >= 0).sum()])).unstack()
Out[11]:
     0  1
A
bar  2  1
foo  2  3

[2 rows x 2 columns]
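For readers who want to reproduce the counts above, here is a minimal sketch that hard-codes the C values from the question's sample output instead of drawing random numbers. It swaps the apply/unstack one-liner for named aggregation, which is not what the answer uses and assumes a pandas version that supports it; the column names `negative` and `non_negative` are illustrative choices.

```python
import pandas as pd

# Column C copied from the sample output in the question (fixed, not random).
df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
    'C': [-1.304026, 0.030488, 0.530976, -0.004624,
          -0.247809, -0.570580, 1.441081, 0.296377],
})

# Named aggregation: one labelled column per condition.
# Like the one-liner above, zeros are counted as non-negative.
counts = df.groupby('A')['C'].agg(
    negative=lambda s: (s < 0).sum(),
    non_negative=lambda s: (s >= 0).sum(),
)
print(counts)
```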
However, I think a neater way is to use a dummy column and use value_counts:
In [21]: df['C_sign'] = np.sign(df.C)

In [22]: df.groupby('A').C_sign.value_counts()
Out[22]:
A
bar  -1    2
      1    1
foo   1    3
     -1    2
dtype: int64

In [23]: df.groupby('A').C_sign.value_counts().unstack()
Out[23]:
     -1   1
A
bar   2   1
foo   2   3

[2 rows x 2 columns]
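The question also asks for proportions, which the counts above do not show directly. One possible way to get them, sketched here as an extension rather than as part of the original answer, is to divide each row of the unstacked counts by its row total. The C values are copied from the question's sample output, and the names `sign`, `counts`, and `proportions` are illustrative.

```python
import numpy as np
import pandas as pd

# Column C copied from the sample output in the question (fixed, not random).
df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
    'C': [-1.304026, 0.030488, 0.530976, -0.004624,
          -0.247809, -0.570580, 1.441081, 0.296377],
})

# Sign of each value: -1.0 for negative, 1.0 for positive, 0.0 for exact zeros.
sign = np.sign(df['C']).rename('C_sign')

# Count signs per group, one column per sign; missing combinations become 0.
counts = sign.groupby(df['A']).value_counts().unstack(fill_value=0)

# Divide each row by its total to get the share of each sign within the group.
proportions = counts.div(counts.sum(axis=1), axis=0)

print(counts)
print(proportions)
```

Note that np.sign puts exact zeros in their own 0 column, whereas the apply version counts them as non-negative; with continuous random data this rarely matters.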

