Python heapq 库中函数的时间复杂度是多少

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/38806202/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-19 21:26:44  来源:igfitidea点击:

What's the time complexity of functions in heapq library

pythonheap

提问by Jim Mischel

My question is from the solution in leetcode below, I can't understand why it is O(k+(n-k)log(k)).

我的问题来自下面 leetcode 中的解决方案,我不明白为什么会这样O(k+(n-k)log(k))

Supplement: Maybe the complexity isn't that, in fact I don't know the time complexity of heappush()and heappop()

补充:也许复杂不说,其实我不知道的时间复杂度heappush()heappop()

# O(k+(n-k)lgk) time, min-heap
def findKthLargest(self, nums, k):
    heap = []
    for num in nums:
        heapq.heappush(heap, num)
    for _ in xrange(len(nums)-k):
        heapq.heappop(heap)
    return heapq.heappop(heap)

回答by Jim Mischel

heapqis a binary heap, with O(log n) pushand O(log n) pop. See the heapq source code.

heapq是一个二进制堆,具有 O(log n)push和 O(log n) pop。请参阅heapq 源代码

The algorithm you show takes O(n log n) to push all the items onto the heap, and then O((n-k) log n) to find the kth largest element. So the complexity would be O(n log n). It also requires O(n) extra space.

您展示的算法需要 O(n log n) 将所有项目推送到堆上,然后 O((nk) log n) 找到第 k 个最大的元素。所以复杂度是 O(n log n)。它还需要 O(n) 额外空间。

You can do this in O(n log k), using O(k) extra space by modifying the algorithm slightly. I'm not a Python programmer, so you'll have to translate the pseudocode:

您可以在 O(n log k) 中完成此操作,通过稍微修改算法使用 O(k) 额外空间。我不是 Python 程序员,所以你必须翻译伪代码:

create a new min-heap
push the first k nums onto the heap
for the rest of the nums:
    if num > heap.peek()
        heap.pop()
        heap.push(num)

// at this point, the k largest items are on the heap.
// The kth largest is the root:

return heap.pop()

The key here is that the heap contains just the largest items seen so far. If an item is smaller than the kth largest seen so far, it's never put onto the heap. The worst case is O(n log k).

这里的关键是堆只包含迄今为止看到的最大的项目。如果一个项目小于目前看到的第 k 个最大的项目,它永远不会被放入堆中。最坏的情况是 O(n log k)。

Actually, heapqhas a heapreplacemethod, so you could replace this:

实际上,heapq有一个heapreplace方法,所以你可以替换这个:

    if num > heap.peek()
        heap.pop()
        heap.push(num)

with

    if num > heap.peek()
        heap.replace(num)

Also, an alternative to pushing the first kitems is to create a list of the first kitems and call heapify. A more optimized (but still O(n log k)) algorithm is:

此外,推送第一个k项目的替代方法是创建第一个项目的列表k并调用heapify. 一个更优化的(但仍然是 O(n log k))算法是:

create array of first `k` items
heap = heapify(array)
for remaining nums
    if (num > heap.peek())
        heap.replace(num)
return heap.pop()

You could also call heapifyon the entire array, then pop the first n-kitems, and then take the top:

您也可以调用heapify整个数组,然后弹出第一个n-k项目,然后取顶部:

heapify(nums)
for i = 0 to n-k
    heapq.heappop(nums)
return heapq.heappop(nums)

That's simpler. Not sure if it's faster than my previous suggestion, but it modifies the original array. The complexity is O(n) to build the heap, then O((n-k) log n) for the pops. So it's be O((n-k) log n). Worst case O(n log n).

那更简单。不确定它是否比我之前的建议快,但它修改了原始数组。构建堆的复杂度是 O(n),然后是 O((nk) log n) 的 pops。所以它是 O((nk) log n)。最坏情况 O(n log n)。