Source: http://stackoverflow.com/questions/31303417/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Python pandas dataframe group by based on a condition
Asked by ahajib
My question is simple: I have a dataframe, and I group it by a column and get the size of each group like this:
df.groupby('column').size()
Now the problem is that I only want the groups whose size is greater than X. Can I do this with a lambda function or something similar? I have already tried this:
df.groupby('column').size() > X
but it only prints True and False values.
Accepted answer by Ami Tavory
The result of groupby(...).size() is a regular pandas Series, so just filter it with a boolean mask as usual:
>>> import pandas as pd
>>> df = pd.DataFrame({'a': ['a', 'b', 'a', 'a', 'b', 'c', 'd']})
>>> after = df.groupby('a').size()
>>> after
a
a    3
b    2
c    1
d    1
dtype: int64
>>> after[after > 2]
a
a    3
dtype: int64
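Since the question asks for a lambda-based version, here is a minimal sketch of the same filter written both ways. The sample data is made up for illustration; 'column' and X are the placeholder names from the question, and the threshold value is an assumption.

```python
import pandas as pd

# Hypothetical sample data; 'column' and X mirror the names used in the question.
df = pd.DataFrame({'column': ['a', 'b', 'a', 'a', 'b', 'c', 'd']})
X = 1  # assumed threshold, chosen only for this example

sizes = df.groupby('column').size()

# Boolean-mask form: keep only the entries where the mask is True.
by_mask = sizes[sizes > X]

# Lambda form: .loc accepts a callable that receives the Series,
# so the filter can be chained without an intermediate variable.
by_lambda = df.groupby('column').size().loc[lambda s: s > X]

print(by_lambda)
```

Both forms should give the same Series of group sizes; the lambda form is mainly convenient inside a longer method chain.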
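Answer by Jianxun Li
Try this code. It keeps the rows of every group that has more than X rows (note that it returns the original rows, not the group sizes):
df.groupby('column').filter(lambda group: len(group) > X)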
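To see what the filter approach returns, here is a small sketch on made-up data. The 'value' column and the threshold are assumptions added for illustration. It counts rows with len(group), because DataFrame.size is the number of cells (rows times columns) rather than the number of rows.

```python
import pandas as pd

# Hypothetical sample data; the 'value' column exists only to show
# that whole rows of the original DataFrame are returned.
df = pd.DataFrame({
    'column': ['a', 'b', 'a', 'a', 'b', 'c', 'd'],
    'value': range(7),
})
X = 2  # assumed threshold, chosen only for this example

# Keep every row that belongs to a group with more than X rows.
kept = df.groupby('column').filter(lambda group: len(group) > X)

print(kept)
```

Unlike filtering the result of size(), this gives back rows of df with their original index, which is useful when the per-group data is needed and not only the counts.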