Difference between "as_index=False" and "reset_index()" in pandas groupby

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/51866908/

Date: 2020-09-14 05:56:50  Source: igfitidea

python, pandas, pandas-groupby

Asked by Rohith

I would like to know the difference between what these two do.

Data:

import pandas as pd
df = pd.DataFrame({"ID":["A","B","A","C","A","A","C","B"], "value":[1,2,4,3,6,7,3,4]})

reset_index():

df_group1 = df.groupby("ID").sum().reset_index()

as_index=False:

df_group2 = df.groupby("ID", as_index=False).sum()

Both give exactly the same output:

  ID  value
0  A     18
1  B      6
2  C      6

Can anyone tell me what the difference is, ideally with an example that illustrates it?

Answered by qmeeus

When you use as_index=False, you tell groupby() that you don't want the column ID to become the index (duh!). When both implementations yield the same result, use as_index=False because it saves you some typing and an unnecessary pandas operation ;)
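To make the difference concrete, here is a minimal sketch using the data from the question. The variable names indexed, via_reset and via_flag are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"ID": ["A", "B", "A", "C", "A", "A", "C", "B"],
                   "value": [1, 2, 4, 3, 6, 7, 3, 4]})

# Default behaviour (as_index=True): the grouping column becomes the index
indexed = df.groupby("ID").sum()
print(indexed.index.name)      # the index is named after the grouping column
print(list(indexed.columns))   # only the aggregated column is left

# Route 1: move the index back into a regular column afterwards
via_reset = indexed.reset_index()

# Route 2: never turn the grouping column into the index in the first place
via_flag = df.groupby("ID", as_index=False).sum()

# Compare the two results
print(via_reset.equals(via_flag))
```

For a plain aggregation like this one, the only difference is the intermediate step in which ID is the index.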

However, sometimes you want to apply more complicated operations to your groups. On those occasions, you may find that one approach is better suited than the other.

Example 1: You want to sum the values of three variables (i.e. columns) within each group, along both axes.

Using as_index=True allows you to apply a sum over axis=1 without specifying the column names, since the grouping column sits in the index rather than among the data columns, and then sum the values over axis 0. When the operation is finished, you can use reset_index(drop=True/False) to get the dataframe into the right form.
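A rough sketch of Example 1, under some assumptions: the three columns x, y, z and the output column name total are invented for illustration, and the sum over axis 0 (within each group) is done before the sum over axis 1, which gives the same totals because addition does not depend on the order.

```python
import pandas as pd

# Hypothetical data: one grouping column and three numeric variables
df = pd.DataFrame({
    "ID": ["A", "B", "A", "C"],
    "x": [1, 2, 3, 4],
    "y": [10, 20, 30, 40],
    "z": [100, 200, 300, 400],
})

# as_index=True (the default): ID goes to the index, so every remaining
# column is numeric and can be summed without naming x, y and z
per_group = df.groupby("ID").sum()   # sum over axis 0 within each group
total = per_group.sum(axis=1)        # sum over axis 1 across the variables

# Bring ID back as a regular column once the arithmetic is done
result = total.reset_index(name="total")
print(result)
```

With as_index=False, ID would stay among the data columns, so the axis=1 sum would need an explicit column selection to leave it out.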

Example 2: You need to set a value for the group based on the columns in the groupby().

Setting as_index=False allows you to check the condition on a regular column rather than on an index, which is often much easier.
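One way Example 2 might look, using the data from the question. The flag column and the particular conditions are hypothetical and only there to show the pattern.

```python
import pandas as pd

df = pd.DataFrame({"ID": ["A", "B", "A", "C", "A", "A", "C", "B"],
                   "value": [1, 2, 4, 3, 6, 7, 3, 4]})

# ID stays a regular column, so it can be used in boolean conditions
grouped = df.groupby("ID", as_index=False).sum()

# Hypothetical rule 1: flag the groups whose ID is in a given list
grouped["flag"] = grouped["ID"].isin(["A", "C"])

# Hypothetical rule 2: overwrite the aggregated value for one group
grouped.loc[grouped["ID"] == "B", "value"] = 0

print(grouped)
```

The same conditions can be written against the index when ID is the index, but they then have to go through grouped.index instead of an ordinary column lookup.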

At some point, you might come across a KeyError when applying operations on groups. In that case, it is often because you are trying to use a column in your aggregate function that is currently an index of your GroupBy object.
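The simplest version of that KeyError can be reproduced as follows. This is only a sketch of the most basic case (looking up the grouping column by name after a default groupby), and the names summed, raised and fixed are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"ID": ["A", "B", "A", "C", "A", "A", "C", "B"],
                   "value": [1, 2, 4, 3, 6, 7, 3, 4]})

# With the default as_index=True, ID is the index and no longer a column
summed = df.groupby("ID").sum()

try:
    summed["ID"]          # column lookup fails: ID lives in the index
    raised = False
except KeyError:
    raised = True
print(raised)

# Either fix makes ID available as a column again
fixed = summed.reset_index()
also_fixed = df.groupby("ID", as_index=False).sum()
print(fixed["ID"].tolist())
```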