Original source: http://stackoverflow.com/questions/20415414/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
Stack Overflow
python pandas 3 smallest & 3 largest values
Asked by user1802143
How can I find the index of the 3 smallest and 3 largest values in a column in my pandas dataframe? I saw ways to find max and min, but none to get the 3.
Answered by TomAugspurger
What have you tried? You could sort with s.sort() (in current pandas, s = s.sort_values()) and then call s.head(3).index and s.tail(3).index.
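A minimal sketch of the sort-then-head/tail approach described above. The sample values are hypothetical, and the sketch assumes a current pandas, where sort_values() returns a sorted copy instead of sorting in place:

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
s = pd.Series([0.5, -1.2, 3.3, 0.0, 2.1, -0.7])

# sort_values() returns a new Series sorted in ascending order.
sorted_s = s.sort_values()

# Index labels of the 3 smallest values (smallest first).
smallest_idx = sorted_s.head(3).index.tolist()
# Index labels of the 3 largest values (in ascending order of value).
largest_idx = sorted_s.tail(3).index.tolist()

print(smallest_idx)  # [1, 5, 3]
print(largest_idx)   # [0, 4, 2]
```

Sorting the whole Series costs more than strictly necessary, but for small inputs it is simple and easy to read.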
Answered by Phil Cooper
Answered by Andy Hayden
With smaller Series, you're better off just sorting then taking head/tail!
This is a pandas feature request that should land in 0.14 (some fiddly bits with different dtypes still need to be overcome). For larger Series (> 1000 elements), an efficient solution is to use kth_smallest from pandas algos (warning: this function mutates the array it's applied to, so use a copy!):
In [11]: s = pd.Series(np.random.randn(10))
In [12]: s
Out[12]:
0 0.785650
1 0.969103
2 -0.618300
3 -0.770337
4 1.532137
5 1.367863
6 -0.852839
7 0.967317
8 -0.603416
9 -0.889278
dtype: float64
In [13]: n = 3
In [14]: pd.algos.kth_smallest(s.values.astype(float), n - 1)
Out[14]: -0.7703374582084163
In [15]: s[s <= pd.algos.kth_smallest(s.values.astype(float), n - 1)]
Out[15]:
3 -0.770337
6 -0.852839
9 -0.889278
dtype: float64
If you want this in order:
In [16]: s[s <= pd.algos.kth_smallest(s.values.astype(float), n - 1)].order()
Out[16]:
9 -0.889278
6 -0.852839
3 -0.770337
dtype: float64
If you're worried about duplicates (ties for nth place), you can take the head:
In [17]: s[s <= pd.algos.kth_smallest(s.values.astype(float), n - 1)].order().head(n)
Out[17]:
9 -0.889278
6 -0.852839
3 -0.770337
dtype: float64
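The same threshold idea can be sketched without pd.algos, which may not be available in your pandas version. This version swaps in NumPy's np.partition (not the kth_smallest function used above) to find the n-th smallest value, and reuses the values printed in the session above as sample data:

```python
import numpy as np
import pandas as pd

# Values copied from the Series shown in the session above.
s = pd.Series([0.785650, 0.969103, -0.618300, -0.770337, 1.532137,
               1.367863, -0.852839, 0.967317, -0.603416, -0.889278])
n = 3

# np.partition returns a partitioned copy, so s itself is not modified.
# After partitioning, position n - 1 holds the n-th smallest value.
threshold = np.partition(s.to_numpy(), n - 1)[n - 1]

# Keep everything at or below the threshold, sort it, and cap at n rows
# in case of ties for n-th place.
smallest = s[s <= threshold].sort_values().head(n)

print(threshold)                 # -0.770337
print(smallest.index.tolist())   # [9, 6, 3]
```

Partitioning avoids a full sort of the Series; only the few selected values are sorted at the end.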
Answered by Surya
In [55]: import numpy as np
In [56]: import pandas as pd
In [57]: s = pd.Series(np.random.randn(5))
In [58]: s
Out[58]:
0 0.152037
1 0.194204
2 0.296090
3 1.071013
4 -0.324589
dtype: float64
In [59]: s.nsmallest(3) # use s.drop_duplicates().nsmallest(3) if duplicates exist
Out[59]:
4 -0.324589
0 0.152037
1 0.194204
dtype: float64
In [60]: s.nlargest(3) # use s.drop_duplicates().nlargest(3) if duplicates exist
Out[60]:
3 1.071013
2 0.296090
1 0.194204
dtype: float64
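Since the question asks for the index rather than the values, here is a short sketch that takes .index from the nsmallest/nlargest results, using the values printed in the session above. The second Series t is hypothetical and only illustrates ties; keep="all" is assumed to be available (pandas 0.24 or later):

```python
import pandas as pd

# Values copied from the Series shown in the session above.
s = pd.Series([0.152037, 0.194204, 0.296090, 1.071013, -0.324589])

smallest_idx = s.nsmallest(3).index.tolist()
largest_idx = s.nlargest(3).index.tolist()

print(smallest_idx)  # [4, 0, 1]
print(largest_idx)   # [3, 2, 1]

# Hypothetical data with ties for second place.
t = pd.Series([1, 2, 2, 2, 5])

# keep="all" returns every row tied for the last place,
# so the result can contain more than n rows.
ties = t.nsmallest(2, keep="all")
print(len(ties))  # 4
```

Unlike drop_duplicates(), which discards repeated values before selecting, keep="all" retains every tied row and its index label.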
Answered by ramakrishnareddy
import pandas as pd
import numpy as np
np.random.seed(1)
x=np.random.randint(1,100,10)
y=np.random.randint(1000,10000,10)
x
array([38, 13, 73, 10, 76, 6, 80, 65, 17, 2])
y
array([8751, 4462, 6396, 6374, 3962, 3516, 9444, 4562, 5764, 9093])
data = pd.DataFrame({"age": x, "salary": y})
# Take the 5 rows with the largest age, then return them ordered by ascending salary.
data.nlargest(5, "age").nsmallest(5, "salary")
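To answer the original question for a DataFrame column, a minimal sketch using the same age and salary values as above (written out literally so the example does not depend on the random seed):

```python
import pandas as pd

# Same values as the x and y arrays printed above.
data = pd.DataFrame({
    "age": [38, 13, 73, 10, 76, 6, 80, 65, 17, 2],
    "salary": [8751, 4462, 6396, 6374, 3962, 3516, 9444, 4562, 5764, 9093],
})

# Rows with the 3 smallest and 3 largest values in the "age" column.
youngest = data.nsmallest(3, "age")
oldest = data.nlargest(3, "age")

print(youngest.index.tolist())  # [9, 5, 3]
print(oldest.index.tolist())    # [6, 4, 2]
```

The returned frames keep the original index labels, so .index gives the row positions in the source DataFrame.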

