C++ 我是否正确实施了“Heapify”算法?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/8130177/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-28 18:03:59  来源:igfitidea点击:

Am I implementing the "Heapify" Algorithm correctly?

c++algorithm

提问by Chris De Bow

I'm creating a heap implementation for a computer science class, and I was wondering if the following recursive function would create a heap out of an array object that was not already a heap. the code is as follows:

我正在为计算机科学课程创建一个堆实现,我想知道以下递归函数是否会从一个还不是堆的数组对象中创建一个堆。代码如下:

void Heap::Heapify(int i)
{
    int temp, l, r, heapify;
    l = LeftChild(i);// get the left child
    r = RightChild(i);// get the right child

    //if one of the children is bigger than the index
    if((Data[i] < Data[l]) || (Data[i]< Data[r]))
    {
        //if left is the bigger child
        if(Data[l] > Data[r])
        {
            //swap parent with left child
            temp = Data[i];
            Data[i] = Data[l];
            Data[l] = temp;
            heapify = l; // index that was swapped
        }
        //if right is the bigger child
        else
        {
            //swap parent with right child
            temp = Data[i];
            Data[i] = Data[r];
            Data[r] = temp;
            heapify = r; // index that was swapped
        }
        // do a recursive call with the index
        //that was swapped
        Heapify(heapify);
    }
}

the idea is that you see if the data at the index given is bigger than all of it's children. If it is, the function ends no problem. Otherwise, it check to see which is biggest(left or right children), and then swaps that with the index. The heapify is then called at the index where the swapping happened.

这个想法是,您可以查看给定索引处的数据是否大于所有子项。如果是,则函数结束没有问题。否则,它会检查哪个是最大的(左孩子或右孩子),然后将其与索引交换。然后在发生交换的索引处调用 heapify。

by ildjarn's request, I'm including my full class definition and implementation files to aid in the answering of my question:
here's the header file:

应 ildjarn 的要求,我包含了我的完整类定义和实现文件,以帮助回答我的问题:
这是头文件:

#ifndef HEAP_H
#define HEAP_H
//Programmer: Christopher De Bow
//Date: november 15, 2011

class Heap
{ 
private:
    int Data [100];
    int Parent(int);
    int RightChild(int);
    int LeftChild(int);
    void Heapify(int);
    void BuildHeap();

public:
    Heap();
    void insert();
    void HeapSort();
    void ExtractMaximum();
    int Maximum();
    void PrintHeap();
    int heapsize;
    void SetData(int[]);
};

#endif

and the implementation file:

和实现文件:

#include <iostream>
#include "Heap.h"
using namespace std;
//Programmer: Christopher De Bow
//Date: november 15, 2011

Heap::Heap()
{
    int init [10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    heapsize = 10;
    SetData(init);
}

int Heap::Parent(int index)
{
    int Rval;
    if(index%2 == 0)// if the index is even
    {
        Rval = ((index-1)/2);
    }
    else// if the index is odd
    {
        Rval = (index/2);
    }
    return Rval;
}

int Heap::RightChild(int arrplace)
{
    int ret;
    ret = ((2*arrplace)+2); //rightchild is index times 2 plus 2
    return ret;
}

int Heap::LeftChild(int i)
{
    int rval;
    rval = ((2*i)+1); //leftchild is index times 2 plus 1
    return rval;
}

void Heap::Heapify(int i)
{
    int temp, l, r, heapify;

    l = LeftChild(i); // get the left child
    r = RightChild(i); // get the right child

    if((l <= heapSize) && (data[l] > data[i]))
    {
        heapify = l;
    {
    else
    {
        heapfiy = i;
    }
    if((r <= heapSize) && (data[r] > data[heapify]))
    {
        heapify = r;
    }
    if(heapify != i) // one of the two child nodes has proved
    {                // larger than Data[i], so interchange values
        //swap parent with left child
        temp = Data[i];
        Data[i] = Data[heapify];
        Data[heapify] = temp;
        Heapify(heapify);
    }
}

void Heap::BuildHeap()
{
    // we do not have a heap
    // we will make a heap
    // by calling heapify starting at the lowest
    // internal node in the heap
    for(int i = heapsize; i >= 1; i--)
    {
        Heapify(i-1);
    }
}

void Heap::insert()
{
    int insert;
    heapsize = (heapsize + 1);
    //getting data from the user
    cout<<"what data would you like to insert?"<<endl;
    cin>>insert;
    Data[heapsize] = insert;
    BuildHeap(); //call BuildHeap on array
    cout<<"done"<<endl;
}

void Heap::PrintHeap()
{
    BuildHeap();
    for(int count = 0; count < (heapsize-1); count++)
    {
        cout<<Data[count];// print out every element in heap
    }
    cout<<endl<<endl;
}

void Heap::HeapSort()
{
    BuildHeap();
    int temp;
    // do this for every elem in heap:
    for(int i = 0; i < heapsize; i++)
    {
        temp = Data[heapsize-1];
        Data[heapsize-1] = Data[0];
        Data[0] = temp;
        heapsize--;
        BuildHeap();
    }
    PrintHeap();
}

void Heap::ExtractMaximum()
{
    BuildHeap();
    //assign last thing in heap to first thing in heap
    Data[0] = Data[heapsize];
    heapsize --; // decrease heapsize by one
    Heapify(0); // heapify from the top
}

int Heap::Maximum()
{
    int Rval;
    BuildHeap();// make sure we have a heap
    Rval = Data[0];
    return Rval; // return top thing
}

//initialize the elements in the "Data" array
void Heap::SetData(int x[])
{
    for(int i = 0; i <= (heapsize); i++)
    {
        Data[i] = x[i]; 
    }
}

采纳答案by Literati Insolitus

Your algorithm works. The problem is in the translation of algorithm to code. Say you declared Data as :

你的算法有效。问题在于算法到代码的翻译。假设您将 Data 声明为:

int Data[7];

and you populate it with the initial values {0, 1, 2, 3, 4, 5, 6}. Presuming definitions of LeftChild(i)and RightChild(i)to be something like:

然后用初始值填充它{0, 1, 2, 3, 4, 5, 6}。假设的定义LeftChild(i)RightChild(i)是这样的:

#define LeftChild(i) ((i << 1) + 1)
#define RightChild(i) ((i << 1) + 2)

then your function BuildHeap(), which should be something like:

那么你的函数BuildHeap(),应该是这样的:

void Heap::BuildHeap()
{
    for(int i = (7 >> 1); i >= 1; i--) // in general, replace 7 with 
                                       // (sizeof(Data)/sizeof(int)), presuming 
                                       // you have an array of int's. if not, 
                                       // replace int with the relevant data type
    Heapify(i-1);
}

will begin the Heapifyprocess on the lower-right-most sub-tree root. In this case, this is array index 2, with a left child of 5 and a right child of 6. Heapifywill correctly exchange 2 and 6 and recursively call Heapify(6).

Heapify在最右下方的子树根上开始该过程。在这种情况下,这是数组索引 2,左孩子为 5,右孩子为 6。Heapify将正确交换 2 和 6 并递归调用Heapify(6).

Here the whole thing can run aground! At present your tree looks like :

在这里,整件事都可能搁浅!目前你的树看起来像:

                         0
                    1         2
                 3     4   5     6
            u n d e f i n e d  s p a c e

so the call Heapify(6)will dutifully compare the values of Data[6]with Data[13]and Data[14](the perils of C++ and its lack of array boundaries enforcement, unlike Java). Obviously, the latter two values can be any junk left in RAM. One solution here, ugly but a working patch, is to add 8 elements in the declaration of Data and initialize them all to some value lower than any element of the array. The better solution is to add a heapSizevariable to your class and set it equal to the length of your array:

因此该调用Heapify(6)将尽职地比较Data[6]withData[13]和的值Data[14](C++ 的危险及其缺乏数组边界强制执行,与 Java 不同)。显然,后两个值可以是留在 RAM 中的任何垃圾。这里的一个解决方案,丑陋但有效的补丁,是在 Data 的声明中添加 8 个元素,并将它们全部初始化为低于数组中任何元素的某个值。更好的解决方案是向类中添加一个heapSize变量并将其设置为等于数组的长度:

heapSize = (sizeof(Data)/sizeof(int));

Then integrate logic to only compare child nodes if they are valid leaves of the tree. An efficient implementation of this is :

然后集成逻辑以仅比较子节点,如果它们是树的有效叶子。一个有效的实现是:

void Heap::Heapify(int i)
{
    int temp, l, r, heapify;

    l = LeftChild(i); // get the left child
    r = RightChild(i); // get the right child

    if((l <= heapSize) && (Data[l] > Data[i]))
        heapify = l;
    else heapfiy = i;
    if((r <= heapSize) && (Data[r] > Data[heapify]))
        heapify = r;
    if(heapify != i) // one of the two child nodes has proved 
                     // larger than Data[i], so interchange values
    {
        //swap parent with left child
        temp = Data[i];
        Data[i] = Data[heapify];
        Data[heapify] = temp;

        Heapify(heapify);
    }
}

So to summarize, the solution is as straightforward as adding logic to make sure the child nodes are valid leaves of the tree, and your main function will have something like :

总而言之,该解决方案与添加逻辑以确保子节点是树的有效叶子一样简单,并且您的主要功能将具有以下内容:

Heap heap;

// initialize Data here    

heap.BuildHeap();

Hope that helps.

希望有帮助。

回答by Per

No. On the tree

不。在树上

      1
     / \
    /   \
   /     \
  2       3
 / \     / \
6   7   4   5

the output is going to be

输出将是

      3
     / \
    /   \
   /     \
  2       5
 / \     / \
6   7   4   1

which has several heap violations. (I'm assuming that Data[l]and Data[r]are minus infinity if the corresponding children do not exist. You may need extra logic to ensure this.)

它有几个堆违规。(如果相应的孩子不存在,我假设Data[l]Data[r]是负无穷大。您可能需要额外的逻辑来确保这一点。)

What your function does is fix a tree that may not be a heap but whose left and right subtrees are heaps. You need to call it on every node, in postorder (i.e., for i from n - 1 down to 0) so that the children of i are heaps when Heapify(i) is called.

您的函数所做的是修复一棵可能不是堆但其左右子树是堆的树。您需要在每个节点上按后序调用它(即,对于 i 从 n - 1 到 0),以便在调用 Heapify(i) 时 i 的孩子是堆。

回答by Literati Insolitus

Your code now successfully builds a heap. There was only one conceptual flaw : the rest were off-by-one indexing errors. The one fundamental error was in BuildHeap : you had

您的代码现在成功构建了一个堆。只有一个概念上的缺陷:其余的都是一对一的索引错误。一个基本错误是在 BuildHeap 中:你有

for(int i = heapSize; i >= 1; i--)
{
    Heapify(i-1);
}

whereas this should be

而这应该是

for(int i = (heapSize / 2); i >= 1; i--)
{
    Heapify(i-1);
}

This is really important, you must see that Heapify is always called on a tree root, and (this is really cool) you can easily find the last tree root in the array at the index ((heapSize/2) - 1) (this is for C++ and Java style where the first index == 0). The way it was written your code called Heapify on the last leafof the tree, which is in error.

这真的很重要,你必须看到 Heapify 总是在树根上调用,并且(这真的很酷)你可以很容易地在索引 ((heapSize/2) - 1) 处找到数组中的最后一个树根(这适用于 C++ 和 Java 风格,其中第一个索引 == 0)。它的编写方式是在树的最后一片叶子上调用 Heapify 的代码,这是错误的。

Other than that, I added comments to flag the off-by-one errors. I placed them flush left so you can easily find them. Hope you get a suberb understanding of algorithms and data structures! :-)

除此之外,我添加了注释来标记一对一错误。我把它们平放在左边,这样你就可以很容易地找到它们。希望您对算法和数据结构有深入的了解!:-)

Your header file :

你的头文件:

#ifndef HEAP_H
#define HEAP_H
//Programmer: Christopher De Bow
//Date: november 15, 2011

class Heap
{
private:
    int Data [100];
    int Parent(int);
    int RightChild(int);
    int LeftChild(int);
    void Heapify(int);
    void BuildHeap();
// SO added heapSize
    int heapSize;

public:
    Heap();
    void insert();
    void HeapSort();
    void ExtractMaximum();
    int Maximum();
    void PrintHeap();
    int heapsize;
    void SetData(int[]);
};

#endif

Your cpp file :

您的 cpp 文件:

#include <iostream>
#include "Heap.h"
using namespace std;
//Programmer: Christopher De Bow
//Date: november 15, 2011

Heap::Heap()
{
    int init [10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    heapSize = 10;
    SetData(init);
}

int Heap::Parent(int index)
{
    int Rval;
    if(index%2 == 0)// if the index is even
    {
        Rval = ((index-1)/2);
    }
    else// if the index is odd
    {
        Rval = (index/2);
    }
    return Rval;
}

int Heap::RightChild(int arrplace)
{
    int ret;
    ret = ((2*arrplace)+2); //rightchild is index times 2 plus 2
    return ret;
}

int Heap::LeftChild(int i)
{
    int rval;
    rval = ((2*i)+1); //leftchild is index times 2 plus 1
    return rval;
}

void Heap::Heapify(int i)
{
    int temp, l, r, heapify;

    l = LeftChild(i); // get the left child
    r = RightChild(i); // get the right child

// you have to compare the index to (heapSize - 1) because we are working
// with C++ and the first array index is 0 : l and r are direct indices
// into the array, so the maximum possible index is the heapSize'th 
// element, which is at heapSize-1. this was kind of nasty as it let the 
// heapify index get too large and led to a swap with memory beyond the
// last element of the array (again, C++ doesn't enforce array boundaries
// as Java does).
    if((l <= (heapSize-1)) && (Data[l] > Data[i]))
        heapify = l;
    else
        heapify = i;
// you have to compare the index to (heapSize - 1) because we are working
// with C++ and the first array index is 0 : l and r are direct indices
// into the array, so the maximum possible index is the heapSize'th 
// element, which is at heapSize-1. this was kind of nasty as it let the 
// heapify index get too large and led to a swap with memory beyond the
// last element of the array (again, C++ doesn't enforce array boundaries
// as Java does).
    if((r <= (heapSize-1)) && (Data[r] > Data[heapify]))
        heapify = r;
    if(heapify != i) // one of the two child nodes has proved
    {                // larger than Data[i], so interchange values
        //swap parent with left child
        temp = Data[i];
        Data[i] = Data[heapify];
        Data[heapify] = temp;
        Heapify(heapify);
    }
}

void Heap::BuildHeap()
{
    // we do not have a heap
    // we will make a heap
    // by calling heapify starting at the lowest
    // internal node in the heap
// i must be initialized to (heapsize/2), please see my 
// post for an explanation
    for(int i = heapSize/2; i >= 1; i--)
    {
        Heapify(i-1);
    }
}

void Heap::insert()
{
    int insert;
    heapSize = (heapSize + 1);
    //getting data from the user
    cout<<"what data would you like to insert?"<<endl;
    cin>>insert;
    Data[heapSize] = insert;
    BuildHeap(); //call BuildHeap on array
    cout<<"done"<<endl;
}

void Heap::PrintHeap()
{
    BuildHeap();
// the array indices are from 0 through (heapSize-1), so 
// count must be less than _or equal to_ (heapSize-1). another
// way of phrasing this (which i applied in this function)
// is (count < heapSize). you'll get better boundary conditions
// with practice.
    for(int count = 0; count < heapSize; count++)
    {
// added an endl to the output for clarity
        cout << Data[count] << endl;// print out every element in heap
    }
    cout<<endl<<endl;
}

void Heap::HeapSort()
{
    BuildHeap();
    int temp;
    // do this for every elem in heap:
    for(int i = 0; i < heapSize; i++)
    {
        temp = Data[heapSize-1];
        Data[heapSize-1] = Data[0];
        Data[0] = temp;
        heapSize--;
        BuildHeap();
    }
    PrintHeap();
}

void Heap::ExtractMaximum()
{
    BuildHeap();
    //assign last thing in heap to first thing in heap
    Data[0] = Data[heapSize];
    heapSize--; // decrease heapSize by one
    Heapify(0); // heapify from the top
}

int Heap::Maximum()
{
    int Rval;
    BuildHeap();// make sure we have a heap
    Rval = Data[0];
    return Rval; // return top thing
}

//initialize the elements in the "Data" array
void Heap::SetData(int x[])
{
// the array indices are from 0 through (heapSize-1), so 
// count must be less than _or equal to_ (heapSize-1). another
// way of phrasing this (which i applied in this function)
// is (i < heapSize). you'll get better boundary conditions
// with practice.
    for(int i = 0; i < heapSize; i++)
    {
        Data[i] = x[i];
    }
}

// basic confirmation function
int main()
{
    Heap heap;
    heap.PrintHeap();

    return 0;
}

回答by sarnold

Your code as written here sure feelsright; but there's nothing quite like writing a few test cases to see how it performs. Be sure to test against a heap with 1, 2, 3, 4, and dozens of elements. (I expect the base case to be where this piece falls short -- how does it handle when ihas no children?. Testing on small heaps ought to show in a hurry.)

这里写你的代码肯定觉得正确的; 但是没有什么比编写一些测试用例来查看它的执行情况更像的了。请务必针对具有 1、2、3、4 和数十个元素的堆进行测试。(我希望基本情况是这件作品的不足之处——它在i没有孩子的情况下如何处理?。对小堆的测试应该很快显示出来。)

Some small advice for this piece:

对这篇文章的一些小建议:

if(Data[l] > Data[r])
{
    //swap parent with left child
    temp = Data[i];
    Data[i] = Data[l];
    Data[l] = temp;
    heapify = l; // index that was swapped
}
//if right is the bigger child
else
    { //swap parent with right child
    temp = Data[i];
    Data[i] = Data[r];
    Data[r] = temp;
    heapify = r; // index that was swapped
}

You could probably gain some legibility by setting only the indexin the ifblocks:

您可能可以通过仅设置块中的索引来获得一些可读性if

if(Data[l] > Data[r]) {
    swapme = l;
} else {
    swapme = r;
}

temp = Data[i];
Data[i] = Data[swapme];
Data[swapme] = temp;
heapify = swapme;