Original question: http://stackoverflow.com/questions/37693600/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
Stack Overflow
How to sort a DataFrame based on a particular (string) column using Python pandas?
Asked by Sai Rajesh
My Pandas data frame contains the following data:
product,values
a1, 10
a5, 20
a10, 15
a2, 45
a3, 12
a6, 67
I have to sort this data frame based on the product column. Thus, I would like to get the following output:
product,values
a10, 15
a6, 67
a5, 20
a3, 12
a2, 45
a1, 10
Unfortunately, I'm facing the following error:
ErrorDuringImport(path, sys.exc_info())
ErrorDuringImport: problem in views - type 'exceptions.Indentation
Answered by jezrael
You can first extract the digits with str.extract and cast them to int with astype. Then call sort_values on the helper column sort, and finally drop this column:
df['sort'] = df['product'].str.extract(r'(\d+)', expand=False).astype(int)
df.sort_values('sort', inplace=True, ascending=False)
df = df.drop('sort', axis=1)
print(df)
product values
2 a10 15
5 a6 67
1 a5 20
4 a3 12
3 a2 45
0 a1 10
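For reference, here is a minimal sketch of the same idea without the temporary sort column. It assumes pandas 1.1 or later, where sort_values accepts a key callable. The helper name numeric_suffix is made up for this illustration and is not part of the original answer; it also assumes every product value contains at least one run of digits.

```python
import pandas as pd

# Sample data mirroring the question.
df = pd.DataFrame({
    "product": ["a1", "a5", "a10", "a2", "a3", "a6"],
    "values": [10, 20, 15, 45, 12, 67],
})


def numeric_suffix(col: pd.Series) -> pd.Series:
    """Hypothetical helper: map each label to the first run of digits it contains, as int."""
    return col.str.extract(r"(\d+)", expand=False).astype(int)


# The key callable receives the whole column and must return a Series of the same length.
result = df.sort_values("product", key=numeric_suffix, ascending=False)
print(result)
```

The original index labels are preserved, so the rows can still be traced back to their positions in the unsorted frame.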
Extracting the numbers is necessary because sort_values on the product column alone compares the strings lexicographically, so a10 ends up between a2 and a1:
df.sort_values('product',inplace=True, ascending=False)
print(df)
product values
5 a6 67
1 a5 20
4 a3 12
3 a2 45
2 a10 15
0 a1 10
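The ordering above can be reproduced in plain Python, because strings are compared character by character: "a10" and "a2" first differ at the second character, and "1" sorts before "2". The sketch below assumes every product has a single-letter prefix followed only by digits, as in the question's data.

```python
products = ["a1", "a5", "a10", "a2", "a3", "a6"]

# Plain string comparison: "a10" < "a2" because "1" < "2" at the second character.
lexicographic_desc = sorted(products, reverse=True)

# Comparing the numeric part instead (assumes a one-character prefix).
numeric_desc = sorted(products, key=lambda s: int(s[1:]), reverse=True)

print(lexicographic_desc)
print(numeric_desc)
```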
Another idea is to use the natsort library:
from natsort import index_natsorted, order_by_index
df = df.reindex(index=order_by_index(df.index, index_natsorted(df['product'], reverse=True)))
print(df)
product values
2 a10 15
5 a6 67
1 a5 20
4 a3 12
3 a2 45
0 a1 10
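natsort is a third-party package. Where installing it is not an option, a rough approximation of natural ordering can be written with the standard library alone. The function natural_key below is a hypothetical helper for illustration only; it does not reproduce everything natsort handles (signs, floats, locale rules), it only splits each string into alternating text and digit runs.

```python
import re


def natural_key(text):
    """Hypothetical helper: split text into alternating non-digit / digit runs.

    re.split with a capturing group keeps the digit runs, and they always
    land at odd positions, so those are converted to int for numeric comparison.
    """
    parts = re.split(r"(\d+)", text)
    return [int(part) if i % 2 else part for i, part in enumerate(parts)]


products = ["a1", "a5", "a10", "a2", "a3", "a6"]
natural_desc = sorted(products, key=natural_key, reverse=True)
print(natural_desc)
```

Because text and numbers always occupy the same positions in each key, two keys can be compared without mixing str and int.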