pandas: Using groupby group names in a function
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/37037564/
Asked by Jeremy
I have data that looks something like this:
import numpy as np
import pandas as pd

df = pd.DataFrame({'user': np.random.choice(['a', 'b', 'c'], size=100, replace=True),
                   'value1': np.random.randint(10, size=100),
                   'value2': np.random.randint(20, size=100)})
I'm using it to produce some results, e.g.:
grouped = df.groupby('user')
results = pd.DataFrame()
results['value2_sum'] = grouped['value2'].sum()
For one of the columns of this result DataFrame, I'd like to pass the user names to a different function, which uses data outside of the DataFrame.
I tried something like:
results['user_result'] = grouped.apply(lambda x: my_func(x.index))
But I couldn't figure out a syntax that worked.
Answered by EdChum
You want the .name attribute to access a group's index value:
In [6]:
grouped = df.groupby('user')
results = pd.DataFrame()
results['value2_sum'] = grouped['value2'].sum()
results['user_result'] = grouped.apply(lambda x: x.name)
results
Out[6]:
      value2_sum user_result
user
a            342           a
b            333           b
c            308           c
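To connect this back to the question, here is a minimal sketch of passing each group's name to a function that reads data outside the DataFrame. The `user_scores` dictionary and the body of `my_func` are hypothetical stand-ins, since the original post does not show them, and a small fixed DataFrame replaces the random data so the result is reproducible. Selecting a single column before `apply` is a choice made here so that each group is a Series whose `.name` is the group key.

```python
import pandas as pd

# Hypothetical external data keyed by user name (not from the original post).
user_scores = {"a": 1.0, "b": 2.0, "c": 3.0}

def my_func(user):
    """Hypothetical stand-in for a function that uses data outside the DataFrame."""
    return user_scores[user]

# Small fixed data set instead of the random one, so the output is reproducible.
df = pd.DataFrame({
    "user": ["a", "b", "a", "c", "b", "a"],
    "value1": [5, 3, 8, 1, 9, 2],
    "value2": [1, 2, 3, 4, 5, 6],
})

grouped = df.groupby("user")
results = pd.DataFrame()
results["value2_sum"] = grouped["value2"].sum()

# Each group passed to the lambda is a Series; its .name is the group key
# ('a', 'b' or 'c'), which is handed to my_func.
results["user_result"] = grouped["value2"].apply(lambda s: my_func(s.name))

print(results)
```

The point of the sketch is only that `.name`, not `.index`, carries the group key inside `apply`; inside the lambda, `.index` refers to the row labels of that group.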
Answered by Alexander
results['user_result'] = results.index.values
To pass the index value to your function, you can use a list comprehension.
def my_func(val):
    return val + "_" + val

results['my_func'] = [my_func(idx) for idx in results.index]
>>> results
      value2_sum user_result my_func
user
a            417           a     a_a
b            306           b     b_b
c            331           c     c_c
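As a possible alternative to the list comprehension, `Index.map` applies a function to each index label. The sketch below uses a small hand-built `results` frame (the numbers are made up for illustration) and the same `my_func` as above.

```python
import pandas as pd

def my_func(val):
    return val + "_" + val

# Hand-built stand-in for the grouped results; the values are illustrative only.
results = pd.DataFrame(
    {"value2_sum": [10, 7, 4]},
    index=pd.Index(["a", "b", "c"], name="user"),
)

# Index.map calls my_func once per index label and returns a new Index,
# which is assigned position by position as a column.
results["my_func"] = results.index.map(my_func)

print(results)
```

Both forms call `my_func` once per group label, so the choice between them is mostly a matter of style.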