Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/37693600/

How to sort a DataFrame based on a particular (string) column using Python pandas?

python, python-2.7, sorting, pandas, dataframe

Asked by Sai Rajesh

My Pandas data frame contains the following data:

product,values
 a1,     10
 a5,     20
 a10,    15
 a2,     45
 a3,     12
 a6,     67

I have to sort this data frame based on the product column. Thus, I would like to get the following output:

product,values
 a10,     15
 a6,      67
 a5,      20
 a3,      12
 a2,      45
 a1,      10

Unfortunately, I'm facing the following error:

ErrorDuringImport(path, sys.exc_info())

ErrorDuringImport: problem in views - type 'exceptions.Indentation

Answered by jezrael

You can first extract the digits with str.extract and cast them to int with astype. Then call sort_values on the helper column sort, and finally drop that column:

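# Helper column: the numeric part of each product label, as an integer
df['sort'] = df['product'].str.extract(r'(\d+)', expand=False).astype(int)
# Sort by the numeric key, largest first
df.sort_values('sort', inplace=True, ascending=False)
# Remove the helper column again
df = df.drop('sort', axis=1)
print(df)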
  product  values
2     a10      15
5      a6      67
1      a5      20
4      a3      12
3      a2      45
0      a1      10

This is necessary because sort_values on its own compares the labels as strings, character by character, so a10 ends up between a2 and a1:

df.sort_values('product',inplace=True, ascending=False)
print (df)
  product  values
5      a6      67
1      a5      20
4      a3      12
3      a2      45
2     a10      15
0      a1      10
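
As a variation on the helper-column approach, pandas 1.1 and later accept a key callable in sort_values, which allows sorting by the extracted number without adding and dropping a column. This is a minimal sketch, assuming pandas >= 1.1 and the sample data from the question; the name numeric_part is made up for illustration and is not part of the original answer:

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'product': ['a1', 'a5', 'a10', 'a2', 'a3', 'a6'],
    'values': [10, 20, 15, 45, 12, 67],
})

def numeric_part(col):
    # Hypothetical helper: pull the digits out of each label and cast to int
    return col.str.extract(r'(\d+)', expand=False).astype(int)

# key receives the whole column as a Series and must return
# a Series of the same length, which is then used for ordering
result = df.sort_values('product', key=numeric_part, ascending=False)
print(result)
```

The original df is left untouched here because inplace is not used; the sorted frame is returned as result.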

Another idea is to use the natsort library:

from natsort import index_natsorted, order_by_index

df = df.reindex(index=order_by_index(df.index, index_natsorted(df['product'], reverse=True)))
print (df)
  product  values
2     a10      15
5      a6      67
1      a5      20
4      a3      12
3      a2      45
0      a1      10
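
If installing natsort is not an option, the idea behind natural sorting can be approximated with the standard library: split each label into text and digit chunks and compare the digit chunks as integers. This is a rough sketch of the idea, not natsort's actual implementation; natural_key is a hypothetical helper, and it assumes every label follows the same text-then-number layout as in the question:

```python
import re
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'product': ['a1', 'a5', 'a10', 'a2', 'a3', 'a6'],
    'values': [10, 20, 15, 45, 12, 67],
})

def natural_key(label):
    # Hypothetical helper: 'a10' -> ['a', 10], so digit runs compare as numbers.
    # Mixing layouts (e.g. '10a' and 'a10') would compare int with str and fail.
    return [int(part) if part.isdigit() else part
            for part in re.split(r'(\d+)', label.strip()) if part]

# Order the index labels by the natural key of their product, largest first,
# then select the rows in that order (same pattern as the reindex call above)
order = sorted(df.index, key=lambda i: natural_key(df.loc[i, 'product']), reverse=True)
result = df.loc[order]
print(result)
```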