Python: pandas reset_index after groupby.value_counts()

Question

Asked by muon

I am trying to group by one column and compute value counts on another column.

import pandas as pd
dftest = pd.DataFrame({'A':[1,1,1,1,1,1,1,1,1,2,2,2,2,2], 
               'Amt':[20,20,20,30,30,30,30,40, 40,10, 10, 40,40,40]})

print(dftest)

dftest has 14 rows and two columns: A (the grouping key) and Amt (the values to count).

Perform the grouping:

grouper = dftest.groupby('A')
df_grouped = grouper['Amt'].value_counts()

which gives:

A  Amt
1  30     4
   20     3
   40     2
2  40     3
   10     2
Name: Amt, dtype: int64

What I want is to keep the top two rows of each group.

Also, I was perplexed by an error when I tried to call reset_index:

df_grouped.reset_index()

which gives the following error:

ValueError: cannot insert Amt, already exists

Answer 1

Answered by jezrael

You need the name parameter in reset_index, because the Series name is the same as the name of one of the levels of the MultiIndex:

df_grouped.reset_index(name='count')
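
To make the cause of the error concrete, here is a minimal sketch that reproduces the name clash and then avoids it. One assumption to flag: recent pandas releases may name the value_counts result count rather than Amt, in which case the original error would not appear at all, so the sketch renames the Series to Amt explicitly to force the clash on any version. The variable names counts, clashing, and fixed are illustrative only.

```python
import pandas as pd

dftest = pd.DataFrame({
    'A':   [1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    'Amt': [20, 20, 20, 30, 30, 30, 30, 40, 40, 10, 10, 40, 40, 40],
})

counts = dftest.groupby('A')['Amt'].value_counts()

# Assumption: force the Series name to 'Amt' so the clash with the 'Amt'
# index level is reproduced regardless of the installed pandas version.
clashing = counts.rename('Amt')

try:
    clashing.reset_index()
    collided = False
except ValueError as exc:
    # The index level 'Amt' cannot become a column because the values
    # column already carries that name.
    collided = True
    print(f"reset_index failed: {exc}")

# Giving the values column a distinct name avoids the clash.
fixed = counts.reset_index(name='count')
print(fixed.columns.tolist())
```

The same idea applies to any Series whose name matches one of its index level names: reset_index has to turn both into columns, and two columns cannot share a label by default.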

Another solution is to rename the Series:

print (df_grouped.rename('count').reset_index())

   A  Amt  count
0  1   30      4
1  1   20      3
2  1   40      2
3  2   40      3
4  2   10      2

A more common solution, instead of value_counts, is to aggregate with size:

df_grouped1 =  dftest.groupby(['A','Amt']).size().reset_index(name='count')

print (df_grouped1)
   A  Amt  count
0  1   20      3
1  1   30      4
2  1   40      2
3  2   10      2
4  2   40      3
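
The question also asked how to keep the top two rows of each group, which the snippets above do not cover. One possible approach, offered as a sketch rather than as the answerer's method, is to sort the counted frame by count within each group and take the first two rows per group with groupby(...).head(2). The names counts and top2 are hypothetical, and "top" is assumed to mean "highest count".

```python
import pandas as pd

dftest = pd.DataFrame({
    'A':   [1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    'Amt': [20, 20, 20, 30, 30, 30, 30, 40, 40, 10, 10, 40, 40, 40],
})

# Count occurrences of each (A, Amt) pair; size() orders by the group keys,
# not by count, so an explicit sort is needed afterwards.
counts = dftest.groupby(['A', 'Amt']).size().reset_index(name='count')

# Sort by group, then by descending count, and keep the first two rows
# of each group. head(2) preserves the sorted row order.
top2 = (
    counts.sort_values(['A', 'count'], ascending=[True, False])
          .groupby('A')
          .head(2)
          .reset_index(drop=True)
)

print(top2)
```

If two Amt values in a group have the same count, which one is kept depends on the sort order of the tied rows, so add Amt to the sort keys if a deterministic tie-break matters.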
