Python Pandas groupby multiple columns

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/42642001/

Date: 2020-09-14 03:08:10  Source: igfitidea

Tags: python, pandas, data-analysis

Asked by Kelvin Ng

Thank you for your help.

I have data that looks like this:

city,  room_type
A, X
A, Y
A, Z
B, X
B, Y
B, Y

I want my end result to look like this:

city, count(X), count(Y), count(Z)
A,  1, 1, 1
B,  1, 2, 0

I am grouping by city and I want to show the count of each room_type in each city.

Is there a way to do this with Python pandas? Thank you.

I learned SQL years ago and think this would have been possible there. I'm sure Python can do the same. Thanks!

Answered by jezrael

You can use crosstab together with rename on the columns:

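import pandas as pd

# Cross-tabulate city against room_type, then relabel each column as count(<room_type>)
df = pd.crosstab(df.city, df.room_type).rename(columns=lambda x: 'count({})'.format(x))
print(df)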
room_type  count(X)  count(Y)  count(Z)
city                                   
A                 1         1         1
B                 1         2         0
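
For anyone who wants to run the crosstab approach end to end, here is a minimal sketch. The DataFrame construction is an assumption rebuilt from the sample rows in the question, and the names `sample` and `counts` are illustrative, not part of the original answer.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed layout: two string columns)
sample = pd.DataFrame({
    'city': ['A', 'A', 'A', 'B', 'B', 'B'],
    'room_type': ['X', 'Y', 'Z', 'X', 'Y', 'Y'],
})

# Count occurrences of each room_type per city, then relabel the columns.
# A bound str.format method is used here as the renaming callable.
counts = pd.crosstab(sample['city'], sample['room_type']).rename(
    columns='count({})'.format
)
print(counts)
```

Assigning the result to a new name instead of overwriting the input keeps the raw rows available for trying the other approaches.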

Other solutions use groupby with size or value_counts, and unstack for reshaping:

# Parentheses allow the method chain to continue on the next line
df = (df.groupby(['city', 'room_type']).size().unstack(fill_value=0)
        .rename(columns=lambda x: 'count({})'.format(x)))
print(df)
room_type  count(X)  count(Y)  count(Z)
city                                   
A                 1         1         1
B                 1         2         0


# Parentheses allow the method chain to continue on the next line
df = (df.groupby('city')['room_type'].value_counts().unstack(fill_value=0)
        .rename(columns=lambda x: 'count({})'.format(x)))
print(df)
room_type  count(X)  count(Y)  count(Z)
city                                   
A                 1         1         1
B                 1         2         0
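
A quick way to convince yourself that the groupby variants and crosstab agree is to build all three tables from the same input and compare them. This is only a sketch: the `sample` frame is an assumption rebuilt from the question's rows, and the variable names are illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed layout: two string columns)
sample = pd.DataFrame({
    'city': ['A', 'A', 'A', 'B', 'B', 'B'],
    'room_type': ['X', 'Y', 'Z', 'X', 'Y', 'Y'],
})

by_size = sample.groupby(['city', 'room_type']).size().unstack(fill_value=0)
by_value_counts = (
    sample.groupby('city')['room_type'].value_counts().unstack(fill_value=0)
)
by_crosstab = pd.crosstab(sample['city'], sample['room_type'])

# Align rows and columns to one reference table before comparing,
# so label ordering cannot affect the result
aligned_vc = by_value_counts.reindex(index=by_size.index, columns=by_size.columns)
aligned_ct = by_crosstab.reindex(index=by_size.index, columns=by_size.columns)

same = bool(
    (by_size.values == aligned_vc.values).all()
    and (by_size.values == aligned_ct.values).all()
)
print(same)
```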

Answered by piRSquared

A solution jezrael didn't give ;-)

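# The top-level pd.value_counts is deprecated in recent pandas; Series.value_counts is the equivalent call
s = pd.Series([tuple(i) for i in df.values.tolist()]).value_counts()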
s.index = pd.MultiIndex.from_tuples(s.index.values, names=['city', None])
s.unstack(fill_value=0).rename(columns='count({})'.format).reset_index()

  city  count(X)  count(Y)  count(Z)
0    A         1         1         1
1    B         1         2         0


More involved

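import numpy as np

cities = pd.unique(df.city)
room_types = pd.unique(df.room_type)
# Start from an all-zero table: one row per city, one column per room_type
d1 = pd.DataFrame(
    np.zeros((len(cities), len(room_types)), dtype=int),
    cities,
    room_types
)
# get_value/set_value were removed in pandas 1.0; .at is the scalar accessor that replaces them
for r, c in df.values:
    d1.at[r, c] += 1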

d1.rename(columns='count({})'.format).rename_axis('city').reset_index()


Variation of first solution

from collections import Counter

pd.Series(
    Counter(map(tuple, df.values.tolist()))
).unstack(fill_value=0).rename(
    columns='count({})'.format
).rename_axis('city').reset_index()
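
To try the Counter variation on its own, here is a self-contained sketch. The `sample` frame is an assumption rebuilt from the question's rows, and `pair_counts` and `result` are illustrative names.

```python
from collections import Counter

import pandas as pd

# Sample data rebuilt from the question (assumed layout: two string columns)
sample = pd.DataFrame({
    'city': ['A', 'A', 'A', 'B', 'B', 'B'],
    'room_type': ['X', 'Y', 'Z', 'X', 'Y', 'Y'],
})

# Count each (city, room_type) pair in plain Python
pair_counts = Counter(map(tuple, sample.values.tolist()))

# The tuple keys become a two-level MultiIndex when the Counter is turned
# into a Series, which is what lets unstack pivot room_type into columns
result = (
    pd.Series(pair_counts)
    .unstack(fill_value=0)
    .rename(columns='count({})'.format)
    .rename_axis('city')
    .reset_index()
)
print(result)
```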