Python Pandas: setting groupby() group labels as the index of a new DataFrame

Question

Asked by Okechukwu Ossai

I am a Python beginner trying to figure out how the group labels from a groupby operation can be used as the index of a new DataFrame. For example:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Country': ['USA', 'USA', 'UK', 'China', 'Canada', 'Australia', 'UK', 'China', 'USA'],
                   'Year': [1979, 1983, 1987, 1991, 1995, 1999, 2003, 2007, 2011],
                   'Medals': [52, 30, 25, 41, 19, 17, 9, 14, 12]})

df:
         Country  Medals  Year
    0        USA      52  1979
    1        USA      30  1983
    2         UK      25  1987
    3      China      41  1991
    4     Canada      19  1995
    5  Australia      17  1999
    6         UK       9  2003
    7      China      14  2007
    8        USA      12  2011

c1 = df.groupby(df['Country'], as_index=True, sort=False, group_keys=True).size()

c1:
Country
USA          3
UK           2
China        2
Canada       1
Australia    1

I want to create a new DataFrame that holds the c1 results above in exactly that format, but I have not been able to. Below is what I get:

d1 = pd.DataFrame(np.array(c1), columns=['Frequency'])
d1:
   Frequency
0          3
1          2
2          2
3          1
4          1

I want the group labels as index and not the default 0, 1, 2, 3 and 4. This is exactly what I want:

Desired Output:
            Frequency
USA             3
UK              2
China           2
Canada          1
Australia       1

How can I achieve this? I guess that if I create a list of the country labels and assign it as the index, it might work. However, the original data I'm practising with has so many rows that building such a list by hand would be impractical. Any ideas will be highly appreciated.

Answer 1

Accepted answer by Josh Rumbut

Edit: let's see how you like this one!

c1 = pd.DataFrame(c1.values, index=c1.index.values, columns=['Frequency'])
print(c1)

    Frequency
USA         3
UK          2
China       2
Canada      1
Australia   1

c1.values is roughly equivalent (for our purposes) to np.array(c1) but avoids needing to import numpy.
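
As a possible alternative to rebuilding the frame from c1.values, Series.to_frame converts the Series directly and keeps the group labels as the index. This is only a minimal sketch, assuming the df from the question; the name freq is illustrative, and the index keeps its name (Country) unless it is cleared separately.

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['USA', 'USA', 'UK', 'China', 'Canada', 'Australia', 'UK', 'China', 'USA'],
    'Year': [1979, 1983, 1987, 1991, 1995, 1999, 2003, 2007, 2011],
    'Medals': [52, 30, 25, 41, 19, 17, 9, 14, 12],
})

# size() returns a Series whose index holds the group labels.
c1 = df.groupby('Country', sort=False).size()

# to_frame() turns the Series into a one-column DataFrame and keeps its index.
freq = c1.to_frame(name='Frequency')
print(freq)
```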

Original response (doesn't quite work, left for posterity): You are likely looking for the set_index method.

It should work something like this:

c1 = df.groupby(df['Country'], as_index=True, sort=False, group_keys=True).size()

c2 = c1.set_index(['Country'])

Let me know if this works for you!
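
The set_index attempt fails because c1 is a Series, and set_index is a DataFrame method. One way to make the idea work, sketched here with only the Country column of the question's data, is to move the labels into a column first and then set that column as the index. Treat this as an illustration rather than the answer's intended fix.

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['USA', 'USA', 'UK', 'China', 'Canada', 'Australia', 'UK', 'China', 'USA'],
})

c1 = df.groupby('Country', sort=False).size()

# A Series has no set_index method, so c1.set_index(...) raises AttributeError.
print(hasattr(c1, 'set_index'))  # False

# reset_index(name=...) produces a DataFrame with a 'Country' column and a
# 'Frequency' column; set_index then promotes 'Country' back to the index.
c2 = c1.reset_index(name='Frequency').set_index('Country')
print(c2)
```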

Answer 2

Answered by Okechukwu Ossai

Finally, I figured out what seems to be a working solution. I realized that c1 is a Series, not a DataFrame, and that its index is accessible via c1.index. So I improved the code by passing the index explicitly:

d1 = pd.DataFrame(np.array(c1), index=c1.index, columns=['Frequency'])

d1:

           Frequency
Country             
USA                3
UK                 2
China              2
Canada             1
Australia          1

I don't know if this is the best solution. Better ideas are still welcome.
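
One small difference from the desired output: passing index=c1.index carries over the index name, which is why a Country line appears above the labels. The sketch below, assuming the same data as the question, clears that name with rename_axis; d2 is a hypothetical name used only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Country': ['USA', 'USA', 'UK', 'China', 'Canada', 'Australia', 'UK', 'China', 'USA'],
})

c1 = df.groupby('Country', sort=False).size()

# Same construction as above: the index (and its name) comes from c1.
d1 = pd.DataFrame(np.array(c1), index=c1.index, columns=['Frequency'])

# rename_axis(None) returns a copy whose index has no name,
# so the extra 'Country' header line is not printed.
d2 = d1.rename_axis(None)
print(d2)
```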
