Java,从数组中找到第 K 个最大值

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/32324048/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-11-02 20:00:47  来源:igfitidea点击:

Java, Finding Kth largest value from the array

javaarraysalgorithmmin-heap

提问by Hesam

I had an interview with Facebook and they asked me this question.

我接受了 Facebook 的采访,他们问了我这个问题。

Suppose you have an unordered array with N distinct values

$input = [3,6,2,8,9,4,5]

Implement a function that finds the Kth largest value.

EG: If K = 0, return 9. If K = 1, return 8.

假设您有一个包含 N 个不同值的无序数组

$input = [3,6,2,8,9,4,5]

实现一个查找第 K 个最大值的函数。

EG:如果 K = 0,则返回 9。如果 K = 1,则返回 8。

What I did was this method.

我做的就是这个方法。

private static int getMax(Integer[] input, int k)
{
    List<Integer> list = Arrays.asList(input);
    Set<Integer> set = new TreeSet<Integer>(list);

    list = new ArrayList<Integer>(set);
    int value = (list.size() - 1) - k;

    return list.get(value);
}

I just tested and the method works fine based on the question. However, interviewee said, in order to make your life complex! lets assume that your array contains millions of numbers then your listing becomes too slow. What you do in this case?As hint, he suggested to use min heap. Based on my knowledge each child value of heap should not be more than root value. So, in this case if we assume that 3 is root then 6 is its child and its value is grater than root's value. I'm probably wrong but what you think and what is its implementation based on min heap?

我刚刚测试过,根据问题,该方法工作正常。然而,受访者表示,in order to make your life complex! lets assume that your array contains millions of numbers then your listing becomes too slow. What you do in this case?作为暗示,他建议使用min heap. 根据我的知识,堆的每个子值不应超过根值。所以,在这种情况下,如果我们假设 3 是根,那么 6 是它的孩子,它的值大于根的值。我可能错了,但是您的想法以及它的实现基于min heap什么?

回答by Codebender

He has actually given you the whole answer. Not just a hint.

他实际上已经给了你完整的答案。不仅仅是提示。

And your understanding is based on max heap. Not min heap. And it's workings are self-explanatory.

而你的理解是基于max heap. 不是min heap。它的工作原理是不言自明的。

In a min heap, the root has the minimum(less than it's children) value.

min heap 中,根具有最小(小于它的子代)值。

So, what you need is, iterate over the array and populate Kelements in min heap. Once, it's done, the heap automatically contains the lowest at the root.

所以,您需要的是,遍历数组并Kmin heap 中填充元素。一旦完成,堆将自动包含根最低的。

Now, for each (next)element you read from the array, -> check if the value is greater than root of min heap. -> If yes, remove root from min heap, and add the value to it.

现在,对于您从数组中读取的每个(下一个)元素,-> 检查该值是否大于最小堆的根。-> 如果是,从最小堆中删除根,并将值添加到它。

After you traverse your whole array, the root of min heapwill automtically contain the kth largest element.

遍历整个数组后,最小堆的根将自动包含第kth 个最大元素。

And all other elements (k-1 elements to be precise) in the heap will be larger than k.

堆中的所有其他元素(准确地说是 k-1 个元素)都将大于k

回答by YoungHobbit

Here is the implementation of the Min Heapusing PriorityQueuein java. Complexity:n * log k.

下面是在java中使用PriorityQueue实现Min Heap复杂度:n * log k

import java.util.PriorityQueue;

public class LargestK {

  private static Integer largestK(Integer array[], int k) {
    PriorityQueue<Integer> queue = new PriorityQueue<Integer>(k+1);
    int i = 0;
    while (i<=k) {
      queue.add(array[i]);
      i++;
    }
    for (; i<array.length; i++) {
      Integer value = queue.peek();
      if (array[i] > value) {
        queue.poll();
        queue.add(array[i]);
      }
    }
    return queue.peek();
  }

  public static void main(String[] args) {
    Integer array[] = new Integer[] {3,6,2,8,9,4,5};
    System.out.println(largestK(array, 3));
  }
}

Output: 5

输出:5

The code loop over the array which is O(n). Size of the PriorityQueue (Min Heap) is k, so any operation would be log k. In the worst case scenario, in which all the number are sorted ASC, complexity is n*log k, because for each element you need to remove top of the heap and insert new element.

代码循环遍历数组,即O(n)。PriorityQueue(最小堆)的大小是 k,所以任何操作都是log k。在最坏的情况下,所有数字都按ASC排序,复杂度为n*log k,因为对于每个元素,您需要删除堆顶部并插入新元素。

回答by akhil_mittal

Edit:Check this answerfor O(n) solution.

编辑:检查O(n) 解决方案的这个答案

You can probably make use of PriorityQueueas well to solve this problem:

您也可以使用PriorityQueue来解决这个问题:

public int findKthLargest(int[] nums, int k) {
        int p = 0;
        int numElements = nums.length;
        // create priority queue where all the elements of nums will be stored
        PriorityQueue<Integer> pq = new PriorityQueue<Integer>();

        // place all the elements of the array to this priority queue
        for (int n : nums){
            pq.add(n);
        }

        // extract the kth largest element
        while (numElements-k+1 > 0){
            p = pq.poll();
            k++;
        }

        return p;
    }

From the Java doc:

从 Java文档

Implementation note: this implementation provides O(log(n))time for the enqueing and dequeing methods (offer, poll, remove()and add); linear time for the remove(Object)and contains(Object)methods; and constant time for the retrieval methods (peek, element, and size).

实现说明:该实现为入队和出队方法(、、和 )提供了O(log(n))时间;和 方法的线性时间;和恒定时间检索方法(, ,和)。offerpollremove()addremove(Object)contains(Object)peekelementsize

The for loop runs ntimes and the complexity of the above algorithm is O(nlogn).

for 循环的运行n次数和上述算法的复杂度为O(nlogn)

回答by rai.skumar

Heap based solution is perfect if the number of elements in array/stream is unknown. But, what if they are finite but still you want an optimized solution in linear time.

如果数组/流中的元素数量未知,则基于堆的解决方案是完美的。但是,如果它们是有限的,但您仍然希望在线性时间内得到优化的解决方案,该怎么办。

We can use Quick Select, discussed here.

我们可以使用此处讨论的快速选择。

Array = [3,6,2,8,9,4,5]

数组 = [3,6,2,8,9,4,5]

Let's chose the pivot as first element:

让我们选择枢轴作为第一个元素:

pivot = 3 (at 0th index),

枢轴 = 3(在第 0 个索引处),

Now partition the array in such a way that all elements less than or equal are on left side and numbers greater than 3 on right side. Like it's done in Quick Sort (discussed on my blog).

现在以这样的方式对数组进行分区,即所有小于或等于的元素都在左侧,而大于 3 的数字在右侧。就像在快速排序中完成的一样(在我的博客上讨论过)。

So after first pass - [2,3,6,8,9,4,5]

所以在第一遍之后 - [2, 3,6,8,9,4,5]

pivot index is 1 (i.e it's the second lowest element). Now apply the same process again.

枢轴索引为 1(即它是第二低的元素)。现在再次应用相同的过程。

chose, 6 now, the value at index after previous pivot - [2,3,4,5,6,8,9]

现在选择 6,上一个枢轴后索引处的值 - [2,3,4,5, 6,8,9]

So now 6 is at the proper place.

所以现在 6 在正确的位置。

Keep checking if you have found the appropriate number (kth largest or kth lowest in each iteration). If it's found you are done else continue.

继续检查您是否找到了合适的数字(每次迭代中第 k 大或第 k 小)。如果找到了,你就完成了,否则继续。

回答by njzk2

One approach for constant values of kis to use a partial insertion sort.

常量值的一种方法k是使用部分插入排序。

(This assumes distinct values, but can easily be altered to work with duplicates as well)

(这假设了不同的值,但也可以轻松更改以处理重复项)

last_min = -inf
output = []
for i in (0..k)
    min = +inf
    for value in input_array
        if value < min and value > last_min
            min = value
    output[i] = min
print output[k-1]

(That's pseudo code, but should be easy enough to implement in Java).

(这是伪代码,但应该很容易用 Java 实现)。

The overall complexity is O(n*k), which means it works pretty well if and only if kis constant or known to be less that log(n).

总体复杂性是O(n*k),这意味着当且仅当它k是恒定的或已知小于该值时,它才能很好地工作log(n)

On the plus side, it is a really simple solution. On the minus side, it is not as efficient as the heap solution

从好的方面来说,这是一个非常简单的解决方案。不利的一面是,它不如堆解决方案有效