
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original address, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/23470450/

Time: 2020-09-13 22:00:30  Source: igfitidea

Create a dictionary from a groupby object, Python

Tags: python, dictionary, pandas, group-by

Asked by Hypothetical Ninja

Suppose I have a DataFrame:

import pandas as pd
df = pd.DataFrame({'Type': ['Pokemon', 'Pokemon', 'Bird', 'Pokemon', 'Bird', 'Pokemon', 'Pokemon', 'Bird'], 'Name': ['Jerry', 'Jerry', 'Flappy Bird', 'Mudkip', 'Pigeon', 'Mudkip', 'Jerry', 'Pigeon']})

and I group it by type and name:

print(df.groupby(['Type', 'Name'])['Type'].agg(Frequency='count'))

                     Frequency
Type    Name                  
Bird    Flappy Bird          1
        Pigeon               2
Pokemon Jerry                3
        Mudkip               2

Could I create a dictionary from the above group? The key "Bird" should have as its value a list containing ['Pigeon', 'Flappy Bird']. Note that the name with the higher frequency should appear first in the value list.

Expected output:

dict1 = {'Bird': ['Pigeon', 'Flappy Bird'], 'Pokemon': ['Jerry', 'Mudkip']}

Answered by Ffisegydd

You can create the dictionary using a dictionary comprehension, as below:

import pandas as pd

df = pd.DataFrame({'Type': ['Pokemon', 'Pokemon', 'Bird', 'Pokemon', 'Bird', 'Pokemon', 'Pokemon', 'Bird'], 'Name': ['Jerry', 'Jerry', 'Flappy Bird', 'Mudkip', 'Pigeon', 'Mudkip', 'Jerry', 'Pigeon']})
# Count rows per (Type, Name) pair and name the result column 'Frequency'
f = df.groupby(['Type', 'Name'])['Type'].agg(Frequency='count')
# Put the most frequent names first
f = f.sort_values('Frequency', ascending=False)

# For each outer label, collect the inner (Name) labels in their sorted order
d = {k: list(f.loc[k].index) for k in f.index.levels[0]}
print(d)
# {'Bird': ['Pigeon', 'Flappy Bird'], 'Pokemon': ['Jerry', 'Mudkip']}

The dictionary comprehension iterates through the outer index level ('Bird', 'Pokemon') and, for each key, uses the matching inner index labels as that key's list of values.

It is necessary to first sort the MultiIndex-indexed frame by the Frequency column to get the ordering you wish.
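To see why the sort matters, here is a minimal sketch that builds the dictionary twice, once without sorting and once after sorting by count. It assumes a reasonably recent pandas and uses size() rather than the Frequency column from the answer; the names counts, unsorted, ranked and by_frequency are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Type': ['Pokemon', 'Pokemon', 'Bird', 'Pokemon', 'Bird', 'Pokemon', 'Pokemon', 'Bird'],
    'Name': ['Jerry', 'Jerry', 'Flappy Bird', 'Mudkip', 'Pigeon', 'Mudkip', 'Jerry', 'Pigeon'],
})

# Count each (Type, Name) pair; the result is a Series with a two-level MultiIndex.
counts = df.groupby(['Type', 'Name']).size()

# Without sorting, the names come out in index order (alphabetical here).
unsorted = {t: list(counts.loc[t].index) for t in counts.index.levels[0]}

# Sorting by count first puts the most frequent name at the front of each list.
ranked = counts.sort_values(ascending=False)
by_frequency = {t: list(ranked.loc[t].index) for t in ranked.index.levels[0]}

print(unsorted)
print(by_frequency)
```

In the unsorted version 'Flappy Bird' comes before 'Pigeon' even though 'Pigeon' is more frequent, which is exactly what the question wants to avoid.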

Answered by DanDy

Here's a one-liner.

df.groupby(['Type'])['Name'].apply(lambda grp: list(grp.value_counts().index)).to_dict()

# output
# {'Bird': ['Pigeon', 'Flappy Bird'], 'Pokemon': ['Jerry', 'Mudkip']}

The value_counts function implicitly groups the Name field by value, counts each one, and returns the counts in descending order by default.
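As a small illustration of that behaviour, the sketch below calls value_counts on the names of a single group. The Series is typed in by hand to mirror the 'Bird' rows of the question, and the variable names are only for this example.

```python
import pandas as pd

# The Name values of the 'Bird' rows, in their original order.
bird_names = pd.Series(['Flappy Bird', 'Pigeon', 'Pigeon'], name='Name')

# value_counts counts each distinct value and sorts by count, largest first.
vc = bird_names.value_counts()

print(vc)
print(list(vc.index))
```

The index of the result is what the one-liner turns into the list for each key, so the most frequent name ends up first.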

Bonus: if you want to include the counts, you can do the following.

df.groupby(['Type']).apply(lambda grp: grp.groupby('Name')['Type'].count().to_dict()).to_dict()

# {'Bird': {'Flappy Bird': 1, 'Pigeon': 2}, 'Pokemon': {'Jerry': 3, 'Mudkip': 2}}
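If apply on the grouped DataFrame gives you trouble (newer pandas releases have been changing how it treats the grouping column), the same nested dictionary can also be built by iterating over the groups directly. This is only a sketch of one alternative, not part of the original answer; counts_by_type and type_name are names introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Type': ['Pokemon', 'Pokemon', 'Bird', 'Pokemon', 'Bird', 'Pokemon', 'Pokemon', 'Bird'],
    'Name': ['Jerry', 'Jerry', 'Flappy Bird', 'Mudkip', 'Pigeon', 'Mudkip', 'Jerry', 'Pigeon'],
})

# Iterating over a groupby yields (key, sub-frame) pairs, one per Type.
# Inside each sub-frame, count the rows per Name and convert to a plain dict.
counts_by_type = {
    type_name: grp.groupby('Name')['Type'].count().to_dict()
    for type_name, grp in df.groupby('Type')
}

print(counts_by_type)
```

Here the inner dictionaries are keyed by name in alphabetical order, as in the answer's output; swap in grp['Name'].value_counts().to_dict() if you would rather have the most frequent name first.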