pandas: group a pandas dataframe by two (or more) columns?

Question

Asked by waitingkuo

I have the following dataframe:

import pandas

mydf = pandas.DataFrame({
    "cat": ["first", "first", "first", "second", "second", "third"],
    "class": ["A", "A", "A", "B", "B", "C"],
    "name": ["a1", "a2", "a3", "b1", "b2", "c1"],
    "val": [1, 5, 1, 1, 2, 10],
})

I want to create a dataframe of summary statistics for the val column, computed over items that share the same class id. For this I use groupby as follows:

mydf.groupby("class").val.sum()

That's the correct behavior, but I'd like to retain the cat column information in the resulting df. Can that be done? Do I have to merge/join that info in later? I tried:

mydf.groupby(["cat", "class"]).val.sum()

But this uses hierarchical indexing. I'd like to get back a plain dataframe that just has the cat value for each group, where the grouping is by class. The output should be a dataframe (not a series) with the values of cat and class, where the val entries are summed over all rows that have the same class:

cat     class    val
first   A         7
second  B         3
third   C        10

Is this possible?

Answer 1

Answered by waitingkuo

Use reset_index:

In [9]: mydf.groupby(['cat', "class"]).val.sum().reset_index()
Out[9]: 
      cat class  val
0   first     A    7
1  second     B    3
2   third     C   10
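
For anyone running this as a script instead of in an IPython session, here is a minimal sketch of the same step. It rebuilds the dataframe from the question; the variable names grouped and flat are mine, not from the original post. It shows that the grouped result is a Series with a two-level index, and that reset_index() moves both levels back into ordinary columns.

```python
import pandas as pd

# Dataframe from the question.
mydf = pd.DataFrame({
    "cat": ["first", "first", "first", "second", "second", "third"],
    "class": ["A", "A", "A", "B", "B", "C"],
    "name": ["a1", "a2", "a3", "b1", "b2", "c1"],
    "val": [1, 5, 1, 1, 2, 10],
})

# Grouping by two columns gives a Series indexed by a (cat, class) MultiIndex.
grouped = mydf.groupby(["cat", "class"])["val"].sum()

# reset_index() turns both index levels into regular columns,
# so the result is a plain DataFrame with a default integer index.
flat = grouped.reset_index()
print(flat)
```

Note that "class" is a Python keyword, so that column has to be selected with brackets (mydf["class"]) and not with attribute access.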

EDIT

Pass level=1 to reset only the class level and keep cat as the index:

In [10]: mydf.groupby(['cat', "class"]).val.sum().reset_index(level=1)
Out[10]: 
       class  val
cat              
first      A    7
second     B    3
third      C   10
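
The level argument also accepts the level's name, which can be easier to read than a position. A small sketch, again rebuilding the question's dataframe (by_name and by_position are names I made up for illustration):

```python
import pandas as pd

mydf = pd.DataFrame({
    "cat": ["first", "first", "first", "second", "second", "third"],
    "class": ["A", "A", "A", "B", "B", "C"],
    "name": ["a1", "a2", "a3", "b1", "b2", "c1"],
    "val": [1, 5, 1, 1, 2, 10],
})

grouped = mydf.groupby(["cat", "class"])["val"].sum()

# Index levels are named after the grouping columns, so level="class"
# refers to the same level as level=1 here.
by_name = grouped.reset_index(level="class")
by_position = grouped.reset_index(level=1)

# Only "class" became a column; "cat" is still the index.
print(by_name)
```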

You can also pass as_index=False to groupby to get the same flat output:

In [29]: mydf.groupby(['cat', "class"], as_index=False).val.sum()
Out[29]: 
      cat class  val
0   first     A    7
1  second     B    3
2   third     C   10
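
As a quick check, the sketch below compares the two approaches on the question's data. It also shows one possible way to get several summary statistics at once by passing a list of function names to agg; that part is my addition and not from the original answer, and the names via_flag, via_reset and stats are hypothetical.

```python
import pandas as pd

mydf = pd.DataFrame({
    "cat": ["first", "first", "first", "second", "second", "third"],
    "class": ["A", "A", "A", "B", "B", "C"],
    "name": ["a1", "a2", "a3", "b1", "b2", "c1"],
    "val": [1, 5, 1, 1, 2, 10],
})

keys = ["cat", "class"]

# Two ways to get a flat result: keep the keys as columns from the start,
# or group normally and reset the index afterwards.
via_flag = mydf.groupby(keys, as_index=False)["val"].sum()
via_reset = mydf.groupby(keys)["val"].sum().reset_index()
print(via_flag.equals(via_reset))

# Several statistics of val per group, flattened the same way.
stats = mydf.groupby(keys)["val"].agg(["sum", "mean", "count"]).reset_index()
print(stats)
```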
