Source: Stack Overflow, http://stackoverflow.com/questions/19144618/
Licensed under CC BY-SA 4.0. You are free to use and share this content, but you must attribute it to the original authors on Stack Overflow and keep the same license.

Time: 2020-09-13 21:12:56  Source: igfitidea

Sort a pandas Series by the index

Tags: python, pandas, sorting

Asked by marillion

I have a pandas DataFrame called df, and I count the occurrences of each unique value in one of its columns using the following command:

b = df.groupby('Region').size()

b is a pandas Series object and looks like this:

In [48]: b
Out[48]: 
Region
0          8
1         25
11         1
2         41
3         23
4         15
5         35
6         24
7         27
8         50
9         55
N         10
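
The ordering above can be reproduced with a minimal sketch. The data here is hypothetical and assumes the Region labels are stored as strings, which is what the 11 landing between 1 and 2 suggests:

```python
import pandas as pd

# Hypothetical data: region labels stored as strings, as the question's output suggests.
df = pd.DataFrame({"Region": ["1", "11", "2", "2", "N", "1", "0"]})

# size() counts the rows in each group; the group labels become the index.
b = df.groupby("Region").size()

# groupby sorts the group labels, and string labels sort lexicographically,
# so "11" lands between "1" and "2".
print(list(b.index))
```

Because the labels are strings, any plain sort of the index compares them character by character rather than by numeric value.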

I am trying to plot a bar chart of this Series, but I would first like to sort it by the first column (note the 11 between 1 and 2), which will supply the x-axis labels. I tried calling sort, but it sorts the Series by the values in the second column:

b.sort()  # old pandas API: sorts in place by value (later replaced by sort_values)

In [54]: b
Out[54]: 
Region
11         1
0          8
N         10
4         15
3         23
6         24
1         25
7         27
5         35
2         41
8         50
9         55

Well, is there a way to sort this series based on the first column?

Accepted answer by Phillip Cloud

Your index currently holds strings, so it sorts lexicographically rather than numerically. Convert the numeric labels to numbers, which leaves you with an object index:

In [97]: s = pd.read_clipboard(header=None)  # assumes `import pandas as pd`

In [98]: news = s.rename(columns=lambda x: ['Region', 'data'][x])

In [99]: news
Out[99]:
   Region  data
0       0     8
1       1    25
2      11     1
3       2    41
4       3    23
5       4    15
6       5    35
7       6    24
8       7    27
9       8    50
10      9    55
11      N    10

In [100]: news_converted = news.assign(Region=pd.to_numeric(news['Region'], errors='coerce'))  # convert_objects was removed in pandas 1.0

In [101]: news_converted
Out[101]:
    Region  data
0        0     8
1        1    25
2       11     1
3        2    41
4        3    23
5        4    15
6        5    35
7        6    24
8        7    27
9        8    50
10       9    55
11     NaN    10

In [102]: news_converted.loc[11, 'Region'] = 'N'

In [103]: news_converted_with_index = news_converted.set_index('Region')

In [104]: news_converted_with_index
Out[104]:
        data
Region
0.0        8
1.0       25
11.0       1
2.0       41
3.0       23
4.0       15
5.0       35
6.0       24
7.0       27
8.0       50
9.0       55
N         10

In [105]: news_converted_with_index.sort_index()
Out[105]:
        data
Region
0.0        8
1.0       25
2.0       41
3.0       23
4.0       15
5.0       35
6.0       24
7.0       27
8.0       50
9.0       55
11.0       1
N         10

There's most likely a better way to create your Series so that it doesn't mix index types.
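
On Python 3, sorting an index that mixes floats and strings, as in the session above, generally fails with a TypeError because the two types cannot be compared. One alternative, sketched here under the assumption that pandas 1.1 or later is available, is to leave the string labels alone and pass a key function to sort_index. The Series below is a hypothetical reconstruction of the data in the question, and numeric_key is a helper name chosen for illustration:

```python
import pandas as pd

# Hypothetical reconstruction of the Series from the question (string labels).
b = pd.Series(
    [8, 25, 1, 41, 23, 15, 35, 24, 27, 50, 55, 10],
    index=["0", "1", "11", "2", "3", "4", "5", "6", "7", "8", "9", "N"],
)
b.index.name = "Region"


def numeric_key(index):
    # Labels that are not numbers (here "N") become NaN, which sort_index
    # places last by default (na_position="last").
    return pd.to_numeric(index, errors="coerce")


# The key is used only for ordering; the original labels are kept.
b_sorted = b.sort_index(key=numeric_key)
print(b_sorted)
```

The labels stay as strings, so the index is not mixed, and "N" ends up after "11" because its key is NaN.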

Answer by bdiamante

You are looking for sort_index:

In [80]: b.sort_values()
Out[80]: 
6     1
11    2
9     2
1     4
10    4
2     5
3     6
4     7
8     8
5     9
dtype: int64

In [81]: b.sort_index()
Out[81]: 
1     4
2     5
3     6
4     7
5     9
6     1
8     8
9     2
10    4
11    2
dtype: int64
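
Unlike the in-place sort attempted in the question, sort_index returns a new Series by default. A small sketch with made-up values shows both forms:

```python
import pandas as pd

# Hypothetical small Series with an unordered integer index.
b = pd.Series([4, 1, 9], index=[10, 6, 5])

by_index = b.sort_index()      # returns a new, sorted Series
print(list(b.index))           # b itself still has its original order
print(list(by_index.index))

b.sort_index(inplace=True)     # sorts b itself and returns None
print(list(b.index))
```

Assigning the result back (b = b.sort_index()) is usually the clearer of the two.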

Answer by Jeff

There is only one 'column' of values. The first 'column' is the index. See the pandas documentation for Series.sort_index.

In [8]: s = pd.Series([3,2,1],index=[1,3,2])

In [9]: s
Out[9]: 
1    3
3    2
2    1
dtype: int64

Sort by the index

In [10]: s.sort_index()
Out[10]: 
1    3
2    1
3    2
dtype: int64

Sort by values

In [11]: s.sort_values()
Out[11]: 
2    1
3    2
1    3
dtype: int64
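
To make the index/values distinction concrete, the following sketch inspects both parts of the same Series and then turns the index into a real column. The column names "label" and "value" are chosen here for illustration only:

```python
import pandas as pd

s = pd.Series([3, 2, 1], index=[1, 3, 2])

print(s.index.tolist())   # the labels: [1, 3, 2]
print(s.tolist())         # the values: [3, 2, 1]

# reset_index() moves the labels into an ordinary column of a DataFrame,
# after which sorting by that column is a regular sort_values call.
frame = s.rename("value").rename_axis("label").reset_index()
by_label = frame.sort_values("label")
print(by_label)
```

For a plain Series this detour is not needed, since sort_index does the same job directly.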