Python: Update index after sorting data-frame

Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/33165734/

Time: 2020-08-19 12:55:05  Source: igfitidea

Update index after sorting data-frame

python pandas

Asked by Lemming

Take the following data-frame:

import numpy as np
import pandas as pd
x = np.tile(np.arange(3),3)
y = np.repeat(np.arange(3),3)
df = pd.DataFrame({"x": x, "y": y})
   x  y
0  0  0
1  1  0
2  2  0
3  0  1
4  1  1
5  2  1
6  0  2
7  1  2
8  2  2

I need to sort it by x first, and only then by y:

df2 = df.sort(["x", "y"])
   x  y
0  0  0
3  0  1
6  0  2
1  1  0
4  1  1
7  1  2
2  2  0
5  2  1
8  2  2

How can I change the index so that it is ascending again? That is, how do I get this:

   x  y
0  0  0
1  0  1
2  0  2
3  1  0
4  1  1
5  1  2
6  2  0
7  2  1
8  2  2

I have tried the following. Unfortunately, it doesn't change the index at all:

df2.reindex(np.arange(len(df2.index)))

Accepted answer by joris

You can reset the index using reset_index to get back a default index of 0, 1, 2, ..., n-1 (and use drop=True to indicate that you want to drop the existing index instead of adding it as an additional column to your dataframe):

In [19]: df2 = df2.reset_index(drop=True)

In [20]: df2
Out[20]:
   x  y
0  0  0
1  0  1
2  0  2
3  1  0
4  1  1
5  1  2
6  2  0
7  2  1
8  2  2
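
To make the effect of drop=True concrete, and to show one likely reason the reindex attempt from the question appeared to do nothing, here is a minimal sketch. It assumes a current pandas version, so the sort uses sort_values rather than the question's df.sort; the names kept, fresh, and relabeled are illustrative only.

```python
import numpy as np
import pandas as pd

# Rebuild the data frame from the question and sort it by x, then y.
x = np.tile(np.arange(3), 3)
y = np.repeat(np.arange(3), 3)
df = pd.DataFrame({"x": x, "y": y})
df2 = df.sort_values(["x", "y"])

# Without drop=True, the old index survives as an extra column named "index".
kept = df2.reset_index()

# With drop=True, the old index is discarded and replaced by 0..n-1.
fresh = df2.reset_index(drop=True)

# reindex does not relabel rows: it selects rows by their existing labels
# and returns a new object. Asking for labels 0..8 therefore returns the
# rows in their original, unsorted order, and df2 itself is left untouched.
relabeled = df2.reindex(np.arange(len(df2.index)))

print(kept.columns.tolist())
print(fresh.index.tolist())
print(relabeled["x"].tolist())
```

In short, reset_index replaces the labels while keeping the row order, whereas reindex keeps the labels and rearranges the rows to match them.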

Answer by Gregg

You can set new indices by using set_index:

df2.set_index(np.arange(len(df2.index)))

Output:

   x  y
0  0  0
1  0  1
2  0  2
3  1  0
4  1  1
5  1  2
6  2  0
7  2  1
8  2  2
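
Note that set_index returns a new data frame by default, so the result has to be assigned to a name for the change to persist. The sketch below illustrates this, and also shows assigning to the index attribute directly as a possible alternative. It assumes a current pandas version (hence sort_values), and the names renumbered and direct are illustrative only.

```python
import numpy as np
import pandas as pd

# Rebuild the sorted data frame from the question.
df = pd.DataFrame({"x": np.tile(np.arange(3), 3),
                   "y": np.repeat(np.arange(3), 3)})
df2 = df.sort_values(["x", "y"])

# set_index returns a new data frame; df2 keeps its old labels
# unless the result is assigned back.
renumbered = df2.set_index(np.arange(len(df2.index)))

# Possible alternative: overwrite the index attribute on a copy.
direct = df2.copy()
direct.index = pd.RangeIndex(len(direct))

print(df2.index.tolist())
print(renumbered.index.tolist())
print(direct.index.tolist())
```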

Answer by aaronpenne

df.sort() is deprecated; use df.sort_values(...) instead: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.sort_values.html

Then follow joris' answer by doing df.reset_index(drop=True)
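
Put together, the two steps can be chained into one expression. This is a minimal sketch using the data frame from the question:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": np.tile(np.arange(3), 3),
                   "y": np.repeat(np.arange(3), 3)})

# Sort by x, then y, and replace the shuffled labels with 0..n-1.
df2 = df.sort_values(["x", "y"]).reset_index(drop=True)
print(df2)
```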

Answer by David

Since pandas 1.0.0, df.sort_values has a new parameter ignore_index which does exactly what you need:

In [1]: df2 = df.sort_values(by=['x','y'],ignore_index=True)

In [2]: df2
Out[2]:
   x  y
0  0  0
1  0  1
2  0  2
3  1  0
4  1  1
5  1  2
6  2  0
7  2  1
8  2  2
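
As a quick sanity check, the sketch below compares ignore_index=True with the two-step sort_values plus reset_index(drop=True) approach on the question's data. It assumes pandas 1.0.0 or later; the names one_step, two_step, and same are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": np.tile(np.arange(3), 3),
                   "y": np.repeat(np.arange(3), 3)})

# One step: let sort_values renumber the result (pandas >= 1.0.0).
one_step = df.sort_values(by=["x", "y"], ignore_index=True)

# Two steps: sort first, then drop the old index.
two_step = df.sort_values(by=["x", "y"]).reset_index(drop=True)

# Both approaches should give the same values and the same 0..n-1 index.
same = one_step.equals(two_step)
print(same)
```

On older pandas versions, where ignore_index is not available, the two-step form remains the fallback.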