Reshaping Pandas groupby data row values into column headers
Original question: http://stackoverflow.com/questions/31975139/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by MrGraeme
I am trying to extract grouped row data from a pandas groupby object so that the primary group data ('course' in the example below) act as a row index, the secondary grouped row values act as column headers ('student') and the aggregate values as the corresponding row data ('score').
So, for example, I would like to transform:
import pandas as pd
import numpy as np
data = {'course_id': [101, 101, 101, 101, 102, 102, 102, 102],
        'student_id': [1, 1, 2, 2, 1, 1, 2, 2],
        'score': [80, 85, 70, 60, 90, 65, 95, 80]}
df = pd.DataFrame(data, columns=['course_id', 'student_id', 'score'])
Which I have grouped by course_id and student_id:
group = df.groupby(['course_id', 'student_id']).aggregate(np.mean)
g = pd.DataFrame(group)
Into something like this:
data = {'course':[101,102],'1':[82.5,77.5],'2':[65.0,87.5]}
g3 = pd.DataFrame(data, columns=['course', '1', '2'])
I have spent some time looking through the groupby documentation and I have trawled Stack Overflow and the like, but I'm still not sure how to approach the problem. I would be very grateful if anyone could suggest a sensible way of achieving this for a largish dataset.
Many thanks!
- Edited: to fix g3 example typo
Accepted answer by BrenBarn
>>> g.reset_index().pivot(index='course_id', columns='student_id', values='score')
student_id     1     2
course_id
101         82.5  65.0
102         77.5  87.5
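For anyone who wants to run the reshape from start to finish, here is a minimal sketch using the sample data from the question. It applies the pivot from the answer and then, as alternatives that are not part of the original answer, two other routes that should give the same table: unstacking the grouped result, and `pivot_table`, which aggregates and reshapes in one call. The variable names `via_pivot`, `via_unstack` and `via_pivot_table` are made up for illustration.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'course_id': [101, 101, 101, 101, 102, 102, 102, 102],
    'student_id': [1, 1, 2, 2, 1, 1, 2, 2],
    'score': [80, 85, 70, 60, 90, 65, 95, 80],
})

# Route 1 (the answer above): aggregate, flatten the MultiIndex, then pivot.
g = df.groupby(['course_id', 'student_id']).mean()
via_pivot = g.reset_index().pivot(
    index='course_id', columns='student_id', values='score'
)

# Route 2 (alternative): move the inner index level into the columns.
via_unstack = (
    df.groupby(['course_id', 'student_id'])['score'].mean().unstack('student_id')
)

# Route 3 (alternative): let pivot_table group, average and reshape at once.
via_pivot_table = df.pivot_table(
    index='course_id', columns='student_id', values='score', aggfunc='mean'
)

print(via_pivot)
```

If `course_id` should end up as an ordinary column, as in the `g3` example from the question, calling `.reset_index()` on any of the three results should do it. Which route is fastest on a largish dataset is not something this sketch measures.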

