Pandas groupby object filtering

Note: this page is adapted from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/39457130/

Date: 2020-09-14 01:59:54  Source: igfitidea


Tags: python, pandas, indexing, group-by, conditional-statements

Asked by chattrat423

I have a pandas DataFrame:

df.columns
Index([u'car_id', u'color', u'make', u'year'])

I would like to create a new filterable object that holds the count of each group (color, make, year):

grp = df[['color','make','year']].groupby(['color','make','year']).size()

which returns something like this:

color   make    year   count
black   honda   2011   416

I would like to be able to filter it. However, when I try this:

grp.filter(lambda x: x['color'] == 'black')

I receive this error:

TypeError: 'function' object is not iterable


How do I leverage a 'groupby' object in order to filter the rows out?


Answered by jezrael

I think you need to add reset_index, so that the output is a DataFrame. Then use boolean indexing:

# Parentheses let the method chain span several lines
df = (df[['color','make','year']].groupby(['color','make','year'])
                                 .size()
                                 .reset_index(name='count'))


df1 = df[df.color == 'black']
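
A minimal, self-contained sketch of this approach may help. The sample rows below are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    'car_id': [1, 2, 3, 4, 5],
    'color': ['black', 'black', 'red', 'black', 'red'],
    'make': ['honda', 'honda', 'ford', 'ford', 'ford'],
    'year': [2011, 2011, 2012, 2011, 2012],
})

cols = ['color', 'make', 'year']

# size() returns a Series indexed by (color, make, year);
# reset_index(name='count') flattens it into a regular DataFrame.
counts = df.groupby(cols).size().reset_index(name='count')

# Boolean indexing works because 'color' is now an ordinary column.
black = counts[counts['color'] == 'black']
print(black)
```

Once the group keys are ordinary columns, any further condition (for example on 'count') can be combined with the same boolean indexing.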

Answered by piRSquared

Option 1
Filter ahead of time


cols = ['color','make','year']
# .loc accepts a row mask together with a list of columns
df.loc[df.color == 'black', cols].groupby(cols).size()
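
As a rough illustration of filtering before grouping, here is a small runnable sketch. The sample data is invented, and selecting rows and columns through .loc is one reasonable way to write the selection.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    'car_id': [1, 2, 3, 4, 5],
    'color': ['black', 'black', 'red', 'black', 'red'],
    'make': ['honda', 'honda', 'ford', 'ford', 'ford'],
    'year': [2011, 2011, 2012, 2011, 2012],
})

cols = ['color', 'make', 'year']

# Keep only the black cars first, then count each (color, make, year) group.
black_counts = df.loc[df['color'] == 'black', cols].groupby(cols).size()
print(black_counts)
```

Filtering first means only the rows of interest are grouped, which can matter on large frames.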

Option 2
Use xs for index cross sections

cols = ['color','make','year']
grp = df[cols].groupby(cols).size()

grp.xs('black', level='color', drop_level=False)

or


grp.xs('honda', level='make', drop_level=False)

or


grp.xs(2011, level='year', drop_level=False)
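
The cross-section idea can be sketched end to end as follows. The sample data is invented, and xs is called on the grouped Series grp, since that is the object that carries the (color, make, year) MultiIndex.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    'car_id': [1, 2, 3, 4, 5],
    'color': ['black', 'black', 'red', 'black', 'red'],
    'make': ['honda', 'honda', 'ford', 'ford', 'ford'],
    'year': [2011, 2011, 2012, 2011, 2012],
})

cols = ['color', 'make', 'year']

# grp is a Series whose MultiIndex levels are named color, make and year.
grp = df[cols].groupby(cols).size()

# drop_level=False keeps the selected level in the result's index.
black = grp.xs('black', level='color', drop_level=False)
honda = grp.xs('honda', level='make', drop_level=False)
y2011 = grp.xs(2011, level='year', drop_level=False)

print(black)
print(honda)
print(y2011)
```

This keeps the result as a Series of counts, so no reset_index is needed when the filter is a single value on one index level.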