Understanding how to create a heap in Python

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/12749622/

Date: 2020-08-18 11:46:48  Source: igfitidea

python heap

Asked by Sam Hammamy

The collections.Counter.most_common function in Python uses the heapq module to return, for instance, the counts of the most common words in a file.

I have traced through the heapq.py file, but I'm having a bit of trouble understanding how a heap is created and updated with respect to, let's say, words.

So I think the best way for me to understand it is to figure out how to create a heap from scratch.

Can someone provide pseudocode for creating a heap that would represent word counts?

Accepted answer by Joran Beasley

This is a slightly modified version of the code found here: http://code.activestate.com/recipes/577086-heap-sort/

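def HeapSort(A, T):
    # Sort the words in A in place, in ascending order of how often each occurs in T.
    def heapify(A):
        # Build a max-heap (keyed by count) bottom-up, starting from the last parent.
        start = (len(A) - 2) // 2  # integer division, so this also works on Python 3
        while start >= 0:
            siftDown(A, start, len(A) - 1)
            start -= 1

    def siftDown(A, start, end):
        # Move A[start] down until neither child has a larger count.
        root = start
        while root * 2 + 1 <= end:
            child = root * 2 + 1
            # Pick the child with the larger count.
            if child + 1 <= end and T.count(A[child]) < T.count(A[child + 1]):
                child += 1
            if child <= end and T.count(A[root]) < T.count(A[child]):
                A[root], A[child] = A[child], A[root]
                root = child
            else:
                return

    heapify(A)
    end = len(A) - 1
    while end > 0:
        # Move the current maximum to the end, then restore the heap on the rest.
        A[end], A[0] = A[0], A[end]
        siftDown(A, 0, end - 1)
        end -= 1


if __name__ == '__main__':
    text = "the quick brown fox jumped over the the quick brown quick log log"
    words = text.split()  # count whole words rather than substrings of the text
    heap = list(set(words))
    print(heap)

    HeapSort(heap, words)
    print(heap)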

Output (the order of the set, and of words with equal counts, can differ between Python versions and runs):

['brown', 'log', 'jumped', 'over', 'fox', 'quick', 'the']
['jumped', 'fox', 'over', 'brown', 'log', 'the', 'quick']

You can visualize the program here: http://goo.gl/2a9Bh
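
For comparison, the same word-count ordering can be sketched with the standard library instead of a hand-written sort. This is only an illustration, not part of the original recipe: the (count, word) tuples and the variable names are assumptions made here, while collections.Counter, heapq.heapify, heapq.heappop and heapq.nlargest are standard-library calls.

```python
import heapq
from collections import Counter

text = "the quick brown fox jumped over the the quick brown quick log log"
counts = Counter(text.split())

# Store (count, word) pairs so the heap orders by count first,
# and alphabetically by word when two counts are equal.
heap = [(count, word) for word, count in counts.items()]
heapq.heapify(heap)  # rearranges the list in place into a min-heap

# Popping repeatedly yields the pairs from least to most common.
ascending = [heapq.heappop(heap) for _ in range(len(heap))]
print(ascending)

# The two most common words, picked with a heap-based selection.
top_two = heapq.nlargest(2, counts.items(), key=lambda item: item[1])
print(top_two)
```

Keeping the count inside each heap entry avoids recounting a word on every comparison, which the T.count calls in the recipe above do.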

Answer by Hueston Rido

In Python 2.x and 3.x, heaps are supported through an importable library, heapq. It supplies a number of functions for working with a heap data structure modelled in a Python list. Example:

>>> from heapq import heappush, heappop
>>> heap = []
>>> data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
>>> for item in data:
...     heappush(heap, item)
...
>>> ordered = []
>>> while heap:
...     ordered.append(heappop(heap))
...
>>> ordered
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> data.sort()
>>> data == ordered
True

You can find out more about the heap functions heappush, heappop, heappushpop, heapify and heapreplace in the Python heapq docs.
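
As a rough illustration of the functions not used in the example above, the sketch below runs heapify, heappushpop and heapreplace on a small list. The sample values and variable names are made up for the example.

```python
import heapq

heap = [5, 7, 9, 1, 3]
heapq.heapify(heap)  # in place; heap[0] is now the smallest item

# heappushpop pushes the new item first, then pops the smallest one.
# 0 is smaller than everything already in the heap, so it comes straight back.
a = heapq.heappushpop(heap, 0)

# heapreplace pops the smallest item first, then pushes the new one.
# It returns the old minimum, and 0 stays in the heap.
b = heapq.heapreplace(heap, 0)

print(a, b, heap[0])
```

The difference between the two matters when the new item is smaller than the current minimum: heappushpop never returns something larger than the item just pushed, while heapreplace may.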

Answer by slashdottir

Here's another variation, based on Sedgewick.

The heap is represented internally in an array where, if a node is at index k, its children are at 2*k and 2*k + 1. The first element of the array is not used, to make the math more convenient.

To add a new element to the heap, you append it to the end of the array and then call swim, which moves the new element up until it finds its place in the heap.

To delete the root, you swap it with the last element in the array, delete it, and then call sink, which moves the swapped element down until it finds its place.

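# less(i, j) and exch(i, j) are assumed helpers that compare and swap the
# array entries at indices i and j; N is the index of the last element.

def swim(k):
    # Move the entry at index k up while its parent is smaller (max-heap).
    while k > 1 and less(k // 2, k):
        exch(k, k // 2)
        k = k // 2

def sink(k):
    # Move the entry at index k down while a child is larger.
    while 2 * k <= N:
        j = 2 * k
        if j < N and less(j, j + 1):
            j += 1  # pick the larger of the two children
        if not less(k, j):
            break
        exch(k, j)
        k = j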
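
To make swim and sink concrete, here is a minimal self-contained sketch of a 1-based max-heap built around them. The class name MaxHeap, the method names insert and del_max, and the (count, word) sample data are assumptions chosen for this illustration, and the sketch does not guard against removing from an empty heap.

```python
class MaxHeap:
    """1-based binary max-heap; index 0 of the list is left unused."""

    def __init__(self):
        self.a = [None]  # placeholder so the root lives at index 1
        self.n = 0       # number of items currently in the heap

    def less(self, i, j):
        return self.a[i] < self.a[j]

    def exch(self, i, j):
        self.a[i], self.a[j] = self.a[j], self.a[i]

    def swim(self, k):
        # Move the entry at k up while its parent is smaller.
        while k > 1 and self.less(k // 2, k):
            self.exch(k, k // 2)
            k //= 2

    def sink(self, k):
        # Move the entry at k down while one of its children is larger.
        while 2 * k <= self.n:
            j = 2 * k
            if j < self.n and self.less(j, j + 1):
                j += 1
            if not self.less(k, j):
                break
            self.exch(k, j)
            k = j

    def insert(self, item):
        # Append at the end, then let the new entry swim up.
        self.a.append(item)
        self.n += 1
        self.swim(self.n)

    def del_max(self):
        # Swap the root with the last entry, remove it, then sink the new root.
        top = self.a[1]
        self.exch(1, self.n)
        self.a.pop()
        self.n -= 1
        self.sink(1)
        return top


heap = MaxHeap()
for count_word in [(3, "the"), (1, "fox"), (2, "log"), (3, "quick")]:
    heap.insert(count_word)

drained = [heap.del_max() for _ in range(4)]
print(drained)
```

Storing (count, word) tuples means the most frequent word always sits at the root, with ties broken by the word itself.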

Here's a visualization of heap insert, inserting the first 15 letters of the alphabet, [a-o]:

[Image: heap insert visualization]

Answer by Amelio Vazquez-Reina

Your confusion may come from the fact that the Python module heapq does not define a heap as a data type (a class) with its own methods (as deque or list do). It instead provides functions that you can run on a Python list.

It's best to think of heapq as a module providing a set of algorithms (methods) to interpret lists as heaps and manipulate them accordingly. Note that it's common to represent heaps internally as arrays (as an abstract data structure), and Python already has lists serving that purpose, so it makes sense for heapq to just provide methods to manipulate lists as heaps.

Let's see this with an example, starting with a simple Python list:

>>> my_list = [2, -1, 4, 10, 0, -20]

To create a heap with heapq from my_list, we just need to call heapify, which simply rearranges the elements of the list to form a min-heap:

>>> import heapq
>>> # NOTE: heapify works in place and returns None:
>>> heapq.heapify(my_list)

Note that you can still access the list underlying the heap, since all heapify has done is rearrange, in place, the list referenced by my_list:

>>> my_list
[-20, -1, 2, 10, 0, 4]

Popping elements from the heap held by my_list:

>>> [heapq.heappop(my_list) for x in range(len(my_list))]
[-20, -1, 0, 2, 4, 10]
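
If a heap object with its own methods is more convenient, the heapq functions can be wrapped in a small class. The MinHeap class below is hypothetical (heapq provides nothing of the kind); it is only a sketch of how such a wrapper might look, with method names chosen for the example.

```python
import heapq


class MinHeap:
    """Hypothetical wrapper; heapq itself provides no such class."""

    def __init__(self, items=()):
        self._items = list(items)  # copy, so the caller's list is left untouched
        heapq.heapify(self._items)

    def push(self, item):
        heapq.heappush(self._items, item)

    def pop(self):
        return heapq.heappop(self._items)

    def peek(self):
        return self._items[0]  # the smallest item always sits at index 0

    def __len__(self):
        return len(self._items)


h = MinHeap([2, -1, 4, 10, 0, -20])
h.push(-5)
smallest = h.peek()
smallest_first = [h.pop() for _ in range(len(h))]
print(smallest, smallest_first)
```

The wrapper adds nothing algorithmically; every method delegates to a heapq function operating on an ordinary list.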