Python: take max N elements from some list
Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/4215472/
Asked by Albert
Is there some function which would return me the N highest elements from some list?
I.e. if max(l) returns the single highest element, something like max(l, count=10) would return a list of the 10 highest numbers (or fewer if l is smaller).
Or what would be an efficient easy way to get these? (Except the obvious canonical implementation; also, no such things which involve sorting the whole list first because that would be inefficient compared to the canonical solution.)
Accepted answer by Gareth Rees
>>> import heapq, random
>>> heapq.nlargest(3, (random.gauss(0, 1) for _ in range(100)))
[1.9730767232998481, 1.9326532289091407, 1.7762926716966254]
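Since the session above draws random numbers, its output differs on every run. A small deterministic sketch of the same call may be easier to check by hand; the lists `scores` and `words` are made-up sample data, not part of the original answer.

```python
import heapq

# Made-up sample data, for illustration only.
scores = [12, 7, 31, 5, 31, 18]

# The three largest values, returned in descending order.
top3 = heapq.nlargest(3, scores)

# Asking for more items than the list holds simply returns all of them.
all_sorted = heapq.nlargest(10, scores)

# An optional key function selects by a derived value, here string length.
words = ["fig", "banana", "kiwi", "clementine"]
longest_two = heapq.nlargest(2, words, key=len)

print(top3)
print(all_sorted)
print(longest_two)
```

This matches the behaviour asked for in the question: the result is shorter than N when the list has fewer than N elements.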
Answer by Gintautas Miliauskas
A fairly efficient solution is a variation of quicksort where recursion is limited to the right part of the pivot until the pivot point position is higher than the number of elements required (with a few extra conditions to deal with border cases of course).
The standard library has heapq.nlargest, as pointed out by others here.
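The quicksort variation described in this answer is commonly known as quickselect. What follows is only a rough sketch of that idea, not code from the answer: it is written iteratively rather than recursively, it picks a random pivot, and the name `n_largest_quickselect` is a hypothetical helper.

```python
import random

def n_largest_quickselect(values, n):
    """Return the n largest items in descending order (hypothetical helper)."""
    items = list(values)  # work on a copy so the caller's list is untouched
    if n <= 0:
        return []
    if n >= len(items):
        return sorted(items, reverse=True)

    # Once partitioning is done, items[target:] holds the n largest values.
    target = len(items) - n
    lo, hi = 0, len(items) - 1
    while lo < hi:
        # Lomuto partition around a randomly chosen pivot.
        p = random.randint(lo, hi)
        items[p], items[hi] = items[hi], items[p]
        pivot = items[hi]
        store = lo
        for i in range(lo, hi):
            if items[i] < pivot:
                items[i], items[store] = items[store], items[i]
                store += 1
        items[store], items[hi] = items[hi], items[store]

        # Continue only in the side that still contains the target index.
        if store == target:
            break
        if store < target:
            lo = store + 1
        else:
            hi = store - 1

    # Only the n selected items are sorted, not the whole list.
    return sorted(items[target:], reverse=True)

print(n_largest_quickselect([5, 1, 9, 3, 7, 9, 2], 3))
```

The expected running time is linear in the length of the list plus the cost of sorting the n selected items. In practice heapq.nlargest is usually the simpler choice.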
Answer by Spacedman
Start with the first 10 from L, call that X. Note the minimum value of X.
Loop over L[i] for each i in the rest of L.
If L[i] is greater than min(X), drop min(X) from X and insert L[i]. You may need to keep X as a sorted linked list and do an insertion. Update min(X).
At the end, you have the 10 largest values in X.
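The steps above could be sketched roughly as follows. This is an illustration under a few assumptions: X is kept as a sorted Python list rather than a linked list, `bisect.insort` does the insertion, and the function name `n_largest_scan` is hypothetical.

```python
import bisect
from itertools import islice

def n_largest_scan(values, n):
    """Return the n largest items in descending order (hypothetical helper)."""
    if n <= 0:
        return []
    it = iter(values)
    # X: the first n items, kept sorted ascending so that top[0] is min(X).
    top = sorted(islice(it, n))
    for value in it:
        if value > top[0]:
            del top[0]                 # drop min(X)
            bisect.insort(top, value)  # insert, keeping X sorted
    return top[::-1]

print(n_largest_scan([4, 8, 1, 9, 3, 7], 3))
```

Both the deletion at the front and the insertion cost O(k) on a Python list, so the whole scan is O(kN) in the worst case.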
I suspect that will be O(kN) (where k is 10 here), since each insertion into the sorted list is linear in k. This might be what gsl uses, so if you can read some C code:
http://www.gnu.org/software/gsl/manual/html_node/Selecting-the-k-smallest-or-largest-elements.html
Probably something in numpy that does this.
Answer by Dave Webb
The function in the standard library that does this is heapq.nlargest.

