Python: How to reset a DataFrame's indexes for all groups in one step?

Disclaimer: This page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/22407798/

Time: 2020-08-19 00:54:02  Source: igfitidea

How to reset a DataFrame's indexes for all groups in one step?

python group-by pandas

Asked by Meloun

I've tried to split my DataFrame into groups:

import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                         'foo', 'bar', 'foo', 'foo'],
                   'B': ['1', '2', '3', '4',
                         '5', '6', '7', '8']})

grouped = df.groupby('A')

I get two groups:

     A  B
0  foo  1
2  foo  3
4  foo  5
6  foo  7
7  foo  8

     A  B
1  bar  2
3  bar  4
5  bar  6

Now I want to reset the index of each group separately:

# drop=True discards the old index instead of adding it as an 'index' column
print(grouped.get_group('foo').reset_index(drop=True))
print(grouped.get_group('bar').reset_index(drop=True))

Finally, I get this result:

     A  B
0  foo  1
1  foo  3
2  foo  5
3  foo  7
4  foo  8

     A  B
0  bar  2
1  bar  4
2  bar  6

Is there a better way to do this? (For example, automatically calling some method on each group.)
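
To make the goal concrete, here is a minimal sketch of one way to do the per-group reset in a single expression, by iterating over the groupby object. The name `groups` is only illustrative and is not part of the original question.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
    'B': ['1', '2', '3', '4', '5', '6', '7', '8'],
})

# Iterating over a groupby yields (group key, sub-DataFrame) pairs.
# drop=True discards the old index instead of keeping it as a column.
groups = {key: sub.reset_index(drop=True) for key, sub in df.groupby('A')}

print(groups['foo'])
print(groups['bar'])
```

Each value in the dictionary is an independent DataFrame whose index starts at 0, which matches the two tables shown above.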

Answered by Greg

Something like this would work:

# Python 3: dict.items() replaces the Python 2 iteritems()
for group, index in grouped.indices.items():
    grouped.indices[group] = range(0, len(index))

You could probably make it less verbose if you wanted to.
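
For context, here is a small sketch that prints what `grouped.indices` holds before any loop like the one above modifies it. As far as I can tell, the values are integer row positions in the original frame rather than index labels, so it may be worth checking what `get_group` returns after overwriting them.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
    'B': ['1', '2', '3', '4', '5', '6', '7', '8'],
})
grouped = df.groupby('A')

# indices maps each group key to the integer positions of its rows in df.
for key, positions in grouped.indices.items():
    print(key, list(positions))
```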

Answered by Andy Hayden

Pass as_index=False to the groupby; then you don't need to call reset_index to turn the grouped-by columns back into regular columns:

In [11]: grouped = df.groupby('A', as_index=False)

In [12]: grouped.get_group('foo')
Out[12]:
     A  B
0  foo  1
2  foo  3
4  foo  5
6  foo  7
7  foo  8

Note: As pointed out (and seen in the above example), the index above is not [0, 1, 2, ...]. I claim that this will never matter in practice. If it does, you're going to have to jump through some strange hoops, and the result will be more verbose, less readable and less efficient.

Answered by BAC83

Isn't this just grouped = grouped.apply(lambda x: x.reset_index())?
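
A runnable sketch of that idea, with two adjustments that are my own assumptions rather than part of the answer: `drop=True` so the old index is not kept as a column, and selecting `[['B']]` before `apply` so the grouping column is handled the same way across pandas versions.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
    'B': ['1', '2', '3', '4', '5', '6', '7', '8'],
})

result = (
    df.groupby('A')[['B']]
      # Renumber the rows of each group from 0; the group key becomes
      # the outer level of the resulting MultiIndex.
      .apply(lambda g: g.reset_index(drop=True))
      # Move the group key back into an ordinary column named 'A'.
      .reset_index(level='A')
)

print(result)
```

The result is a single DataFrame, sorted by group, whose index restarts at 0 for each value of A.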

Answered by Songhua Hu

df = df.groupby('A').apply(lambda x: x.reset_index(drop=True)).drop('A', axis=1).reset_index()

Answered by yogitha jaya reddy gari

df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
                       'foo', 'bar', 'foo', 'foo'],
                   'B' : ['1', '2', '3', '4',
                       '5', '6', '7', '8'],
                   })
grouped = df.groupby('A',as_index = False)

We get two groups.

grouped_index = grouped.apply(lambda x: x.reset_index(drop = True)).reset_index()

This adds two new columns, level_0 and level_1, and resets the index:


   level_0  level_1    A  B
0        0        0  bar  2
1        0        1  bar  4
2        0        2  bar  6
3        1        0  foo  1
4        1        1  foo  3
5        1        2  foo  5
6        1        3  foo  7
7        1        4  foo  8

result = grouped_index.drop('level_0', axis=1).set_index('level_1')

This creates an index that restarts within each group of "A":

          A     B
level_1     
0        bar    2
1        bar    4
2        bar    6
0        foo    1
1        foo    3
2        foo    5
3        foo    7
4        foo    8
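
As an alternative sketch (not taken from any of the answers above), `GroupBy.cumcount()` produces the same per-group counter directly, without going through `apply`. The names `position` and `result` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
    'B': ['1', '2', '3', '4', '5', '6', '7', '8'],
})

# cumcount numbers the rows of each group 0, 1, 2, ... in their original order.
position = df.groupby('A').cumcount()

# Use the counter as the index, then sort by group; a stable sort
# preserves the original row order within each group.
result = df.set_index(position).sort_values('A', kind='stable')

print(result)
```

This should give the same rows and index values as the table above, except that the index has no level_1 name.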