Reshaping Pandas groupby row values into column headers

Question

Asked by MrGraeme

I am trying to extract grouped row data from a pandas groupby object so that the primary group data ('course' in the example below) act as a row index, the secondary grouped row values act as column headers ('student') and the aggregate values as the corresponding row data ('score').

So, for example, I would like to transform:

import pandas as pd
import numpy as np

data = {'course_id':[101,101,101,101,102,102,102,102] ,
    'student_id':[1,1,2,2,1,1,2,2],
    'score':[80,85,70,60,90,65,95,80]}

df = pd.DataFrame(data, columns=['course_id', 'student_id','score'])

Which I have grouped by course_id and student_id:

group = df.groupby(['course_id', 'student_id']).aggregate(np.mean)
g = pd.DataFrame(group)

Into something like this:

data = {'course':[101,102],'1':[82.5,77.5],'2':[65.0,87.5]}
g3 = pd.DataFrame(data, columns=['course', '1', '2'])

I have spent some time looking through the groupby documentation and I have trawled Stack Overflow and the like, but I'm still not sure how to approach the problem. I would be very grateful if anyone could suggest a sensible way of achieving this for a largish dataset.

Many thanks!

Edited: to fix g3 example typo

Answer 1

Accepted answer by BrenBarn

>>> g.reset_index().pivot(index='course_id', columns='student_id', values='score')
student_id     1     2
course_id             
101         82.5  65.0
102         77.5  87.5
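
For anyone who wants to skip the intermediate grouped frame, here is a minimal sketch of two other routes that should give the same table from the question's df: unstack on the grouped result, and pivot_table, which groups and reshapes in one call. The names wide_unstack, wide_pivot and flat are just illustrative, not from the original post, and the last step (flattening back to a plain 'course_id' column) is an assumption about what the asker's g3 layout was after.

```python
import pandas as pd

# Same sample data as in the question.
data = {'course_id': [101, 101, 101, 101, 102, 102, 102, 102],
        'student_id': [1, 1, 2, 2, 1, 1, 2, 2],
        'score': [80, 85, 70, 60, 90, 65, 95, 80]}
df = pd.DataFrame(data)

# Route 1: aggregate per (course, student), then move the inner
# index level (student_id) out into the column headers.
wide_unstack = (df.groupby(['course_id', 'student_id'])['score']
                  .mean()
                  .unstack('student_id'))

# Route 2: pivot_table performs the grouping and the reshape together.
wide_pivot = df.pivot_table(index='course_id', columns='student_id',
                            values='score', aggfunc='mean')

# Optional (assumed goal): turn course_id back into an ordinary column
# and drop the 'student_id' label that sits above the column headers.
flat = wide_unstack.reset_index()
flat.columns.name = None

print(wide_unstack)
print(flat)
```

Unlike plain pivot, pivot_table does not raise on duplicate (course_id, student_id) pairs because it aggregates them, which is why it can be applied to the raw df directly.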
