Original question: http://stackoverflow.com/questions/23470450/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Create a dictionary from a groupby object (Python, pandas)
Asked by Hypothetical Ninja
Suppose I have a DataFrame:
import pandas as pd

df = pd.DataFrame({
    'Type': ['Pokemon', 'Pokemon', 'Bird', 'Pokemon', 'Bird', 'Pokemon', 'Pokemon', 'Bird'],
    'Name': ['Jerry', 'Jerry', 'Flappy Bird', 'Mudkip', 'Pigeon', 'Mudkip', 'Jerry', 'Pigeon'],
})
and I group it by Type and Name, counting how often each pair occurs:
# Named aggregation; the dict form agg({'Frequency': 'count'}) was removed in pandas 1.0
print(df.groupby(['Type', 'Name'])['Type'].agg(Frequency='count'))
                     Frequency
Type    Name
Bird    Flappy Bird          1
        Pigeon               2
Pokemon Jerry                3
        Mudkip               2
Could I create a dictionary from the above group? The key "Bird" should map to a list containing ['Pigeon', 'Flappy Bird']. Note that names with a higher frequency should appear first in the value list.
Expected output:
dict1 = { 'Bird':['Pigeon','Flappy Bird'] , 'Pokemon':['Jerry','Mudkip'] }
Answered by Ffisegydd
You can create the dictionary with a dictionary comprehension, as below:
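import pandas as pd

df = pd.DataFrame({
    'Type': ['Pokemon', 'Pokemon', 'Bird', 'Pokemon', 'Bird', 'Pokemon', 'Pokemon', 'Bird'],
    'Name': ['Jerry', 'Jerry', 'Flappy Bird', 'Mudkip', 'Pigeon', 'Mudkip', 'Jerry', 'Pigeon'],
})
# Named aggregation replaces the removed dict form agg({'Frequency': 'count'})
f = df.groupby(['Type', 'Name'])['Type'].agg(Frequency='count')
# DataFrame.sort no longer exists in pandas; sort_values is its replacement
f = f.sort_values('Frequency', ascending=False)
# .ix no longer exists in pandas; .loc selects by label on the outer index level
d = {k: list(f.loc[k].index) for k in f.index.levels[0]}
print(d)
# {'Bird': ['Pigeon', 'Flappy Bird'], 'Pokemon': ['Jerry', 'Mudkip']}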
The dictionary comprehension iterates over the outer index level ('Bird', 'Pokemon') and, for each key, uses the labels of the inner index level (the names) as the value.
You must first sort the MultiIndex-ed frame by the Frequency column to get the ordering you want.
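To see why the sort matters, here is a minimal sketch comparing the dictionary built with and without sorting by Frequency. It assumes a recent pandas where `sort_values` and `.loc` are available; the helper `names_by_type` is a hypothetical name introduced only for this illustration, and `size()` is swapped in for the count aggregation. Because groupby orders its keys alphabetically by default, the unsorted version lists names alphabetically rather than by count.

```python
import pandas as pd

df = pd.DataFrame({
    'Type': ['Pokemon', 'Pokemon', 'Bird', 'Pokemon', 'Bird', 'Pokemon', 'Pokemon', 'Bird'],
    'Name': ['Jerry', 'Jerry', 'Flappy Bird', 'Mudkip', 'Pigeon', 'Mudkip', 'Jerry', 'Pigeon'],
})

# Count rows per (Type, Name) pair; size() returns a Series with a MultiIndex,
# which to_frame turns into a one-column DataFrame.
freq = df.groupby(['Type', 'Name']).size().to_frame('Frequency')


def names_by_type(frame):
    # Hypothetical helper: map each outer-level key to its inner-level labels,
    # in whatever row order `frame` currently has.
    return {k: list(frame.loc[k].index) for k in frame.index.levels[0]}


unsorted_result = names_by_type(freq)
sorted_result = names_by_type(freq.sort_values('Frequency', ascending=False))

print(unsorted_result)  # names in alphabetical order within each type
print(sorted_result)    # names in descending order of frequency within each type
```

For 'Bird', the unsorted result puts 'Flappy Bird' first even though 'Pigeon' occurs more often; only the sorted version matches the expected output.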
Answered by DanDy
Here's a one-liner.
df.groupby(['Type'])['Name'].apply(lambda grp: list(grp.value_counts().index)).to_dict()
# output
#{'Bird': ['Pigeon', 'Flappy Bird'], 'Pokemon': ['Jerry', 'Mudkip']}
The value_counts function counts the occurrences of each distinct Name value within a group and, by default, returns them in descending order of count.
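As a small illustration of that behaviour, the sketch below applies `value_counts` to the names of a single group. The variable names `bird_names`, `counts`, and `ordered_names` are made up for this example and do not appear in the answer.

```python
import pandas as pd

df = pd.DataFrame({
    'Type': ['Pokemon', 'Pokemon', 'Bird', 'Pokemon', 'Bird', 'Pokemon', 'Pokemon', 'Bird'],
    'Name': ['Jerry', 'Jerry', 'Flappy Bird', 'Mudkip', 'Pigeon', 'Mudkip', 'Jerry', 'Pigeon'],
})

# Select the Name column for the rows whose Type is 'Bird'.
bird_names = df.loc[df['Type'] == 'Bird', 'Name']

# value_counts returns a Series indexed by the distinct names,
# sorted by count in descending order by default.
counts = bird_names.value_counts()
print(counts)

# The index of that Series is the list of names, most frequent first.
ordered_names = list(counts.index)
print(ordered_names)
```

This is the same step the lambda in the one-liner performs for every group before `to_dict()` collects the results.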
Bonus: if you want to include counts, you can do the following.
df.groupby(['Type']).apply(lambda grp: grp.groupby('Name')['Type'].count().to_dict()).to_dict()
# {'Bird': {'Flappy Bird': 1, 'Pigeon': 2}, 'Pokemon': {'Jerry': 3, 'Mudkip': 2}}
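If the counts should also be ordered by frequency inside each inner dictionary, one possible variation is sketched below. This is not part of the original answer: it iterates over the groups directly and reuses `value_counts`, which also sidesteps `groupby.apply`, whose treatment of the grouping column has been changing in recent pandas releases. The name `counts_by_type` is an assumption made for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'Type': ['Pokemon', 'Pokemon', 'Bird', 'Pokemon', 'Bird', 'Pokemon', 'Pokemon', 'Bird'],
    'Name': ['Jerry', 'Jerry', 'Flappy Bird', 'Mudkip', 'Pigeon', 'Mudkip', 'Jerry', 'Pigeon'],
})

# Iterating over a groupby on a single column name yields (key, sub-DataFrame) pairs.
# value_counts sorts by count in descending order, and to_dict keeps that order
# because Python dicts preserve insertion order.
counts_by_type = {
    type_name: group['Name'].value_counts().to_dict()
    for type_name, group in df.groupby('Type')
}

print(counts_by_type)
```

Here each inner dictionary lists the most frequent name first, so 'Pigeon' precedes 'Flappy Bird' under 'Bird'.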

