Original URL: http://stackoverflow.com/questions/47914274/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Pandas sort_values does not sort numbers correctly
Asked by Newkid
I'm new to pandas and to working with tabular data in a programming environment. I have sorted a dataframe by a specific column, but the result that pandas spits out is not exactly correct.
Here is the code I have used:
league_dataframe.sort_values('overall_league_position')
In the result that the sort method yields, the values in column 'overall_league_position' are not sorted in ascending order, which is the default for the method.
What am I doing wrong? Thanks for your patience!
Answered by cs95
For whatever reason, you seem to be working with a column of strings, and sort_values is returning you a lexsorted result.
Here's an example.
import pandas as pd
df = pd.DataFrame({"Col": ['1', '2', '3', '10', '20', '19']})
df
  Col
0   1
1   2
2   3
3  10
4  20
5  19
df.sort_values('Col')
  Col
0   1
3  10
5  19
1   2
4  20
2   3
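Before converting anything, it can help to confirm that the column really holds strings. The following is a minimal sketch that reuses the toy frame from the example; the names all_strings and lexsorted are only illustrative.

```python
import pandas as pd

# Same toy data as in the example: numbers stored as strings.
df = pd.DataFrame({"Col": ['1', '2', '3', '10', '20', '19']})

# A numeric dtype check is False when the column holds strings.
print(pd.api.types.is_numeric_dtype(df["Col"]))

# True when every value in the column is a Python str.
all_strings = df["Col"].map(lambda v: isinstance(v, str)).all()
print(all_strings)

# Strings compare character by character, so '10' sorts before '2'.
lexsorted = sorted(df["Col"])
print(lexsorted)
```

If the numeric check prints False and the values turn out to be str objects, the lexicographic ordering seen in the example is the expected behavior rather than a bug in sort_values.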
The remedy is to convert it to numeric, either using .astype or pd.to_numeric.
df.Col = df.Col.astype(float)
Or,
df.Col = pd.to_numeric(df.Col, errors='coerce')
df.sort_values('Col')
   Col
0    1
1    2
2    3
3   10
5   19
4   20
The only difference between astype and pd.to_numeric is that the latter is more robust at handling non-numeric strings (with errors='coerce', they're coerced to NaN), and will attempt to preserve integers if a coercion to float is not necessary (as is seen in this case).
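To make that difference concrete, here is a small sketch. The Series s and its 'n/a' entry are made-up illustration data, not taken from the question.

```python
import pandas as pd

# Hypothetical column containing one value that is not a number.
s = pd.Series(['1', '2', '10', 'n/a'])

# astype(float) raises on the non-numeric entry.
try:
    s.astype(float)
    astype_failed = False
except ValueError:
    astype_failed = True

# pd.to_numeric with errors='coerce' turns the bad entry into NaN instead.
coerced = pd.to_numeric(s, errors='coerce')

# With clean input, to_numeric keeps integers, while astype(float) upcasts.
clean = pd.to_numeric(pd.Series(['1', '2', '10']))
as_float = pd.Series(['1', '2', '10']).astype(float)

print(astype_failed)
print(coerced.isna().tolist())
print(clean.dtype, as_float.dtype)
```

Note that a coerced NaN forces the whole column to float, so integers are only preserved when every value parses cleanly.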