pandas: sort a DataFrame by string length
Original question: http://stackoverflow.com/questions/42516616/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Sort dataframe by string length
Asked by AlexG
I want to sort by name length. There doesn't appear to be a key parameter for sort_values, so I'm not sure how to accomplish this. Here is a test df:
import pandas as pd
df = pd.DataFrame({'name': ['Steve', 'Al', 'Markus', 'Greg'], 'score': [2, 4, 2, 3]})
Answered by jezrael
You can take the Series of lengths created by str.len, sort it with sort_values, and pass its index to reindex:
print (df.name.str.len())
0 5
1 2
2 6
3 4
Name: name, dtype: int64
print (df.name.str.len().sort_values())
1 2
3 4
0 5
2 6
Name: name, dtype: int64
s = df.name.str.len().sort_values().index
print (s)
Int64Index([1, 3, 0, 2], dtype='int64')
print (df.reindex(s))
     name  score
1      Al      4
3    Greg      3
0   Steve      2
2  Markus      2
df1 = df.reindex(s)
df1 = df1.reset_index(drop=True)
print (df1)
     name  score
0      Al      4
1    Greg      3
2   Steve      2
3  Markus      2
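The steps above can be wrapped into one small function. This is only a sketch: sort_by_length is a hypothetical helper name (it is not part of pandas), and it assumes the index labels are unique, since reindex does not work on duplicate labels.

```python
import pandas as pd


def sort_by_length(frame, column, ascending=True):
    """Return a copy of `frame` ordered by the string length of `column`.

    Hypothetical helper, not a pandas API; assumes unique index labels.
    """
    lengths = frame[column].str.len()
    # mergesort is a stable sorting algorithm
    order = lengths.sort_values(ascending=ascending, kind="mergesort").index
    # reorder the rows by the sorted labels, then renumber from 0
    return frame.reindex(order).reset_index(drop=True)


df = pd.DataFrame({'name': ['Steve', 'Al', 'Markus', 'Greg'], 'score': [2, 4, 2, 3]})
df_by_length = sort_by_length(df, 'name')
print(df_by_length['name'].tolist())  # ['Al', 'Greg', 'Steve', 'Markus']
```

The original df is not modified; the function returns a new frame.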
Answered by moshfiqur
I found this solution more intuitive, especially if you want to do something that depends on the string length later on.
df['length'] = df['name'].str.len()
df.sort_values('length', ascending=False, inplace=True)
Now your dataframe will have a column named length that holds the string length of each value in the name column, and the whole dataframe will be sorted by that column in descending order.
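If you would rather not modify df in place, the same idea can be written with assign, which returns a new frame. A minimal sketch using the test df from the question; df_desc is just an illustrative variable name.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Steve', 'Al', 'Markus', 'Greg'], 'score': [2, 4, 2, 3]})

# assign() returns a new frame with the extra column, so df itself is unchanged
df_desc = (
    df.assign(length=df['name'].str.len())
      .sort_values('length', ascending=False)
)
print(df_desc['name'].tolist())  # ['Markus', 'Steve', 'Greg', 'Al']
```

The length column stays available on df_desc for later use; call drop(columns='length') on it if you only needed the column for sorting.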
Answered by Thierry G.
@jezrael's answer is great and well explained. Here is the final result:
index_sorted = df.name.str.len().sort_values(ascending=True).index
df_sorted = df.reindex(index_sorted)
df_sorted = df_sorted.reset_index(drop=True)
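For readers on a newer pandas: sort_values gained a key argument in pandas 1.1.0, which was not available when this question was asked. Assuming such a version is installed, the sort can be sketched in one call. ignore_index=True (available since pandas 1.0) renumbers the rows from 0, and df_key_sorted is just an illustrative variable name.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Steve', 'Al', 'Markus', 'Greg'], 'score': [2, 4, 2, 3]})

# key receives the column as a Series and must return a Series of the same length;
# here the rows are ordered by the string length of each name
df_key_sorted = df.sort_values('name', key=lambda s: s.str.len(), ignore_index=True)
print(df_key_sorted['name'].tolist())  # ['Al', 'Greg', 'Steve', 'Markus']
```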