Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/1852631/

Date: 2020-10-29 18:14:38  Source: igfitidea

Determine the most common occurrence in an array

java, algorithm

Asked by Shaitan00

Assume I have an array of doubles that looks like the following:

Array[11] = {10, 10, 10, 3, 10, 10, 6, 10, 10, 9, 10}

I need a function that can determine what the MAJORITY vote is in the array, in this case "10" because it is the number that appears most often. And of course there is the situation where no majority exists (where the top counts are equal); in that case I need to throw an exception.

Any clues? Aside from doing some really nasty looping on the array (for each index, determine how many exist with the same value, store a count in the array, and then scan the count array for the highest number and the value at that position is the winner, etc...)

Answered by dfa

Using a Map<Integer, Integer> should be as simple as:

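import java.util.HashMap;
import java.util.Map;

// Returns the most frequent value, or -1 if the array is empty.
// If several values share the top count, which one is returned is unspecified.
int mostFrequent(int... ary) {
    // Count how often each value occurs.
    Map<Integer, Integer> m = new HashMap<Integer, Integer>();

    for (int a : ary) {
        Integer freq = m.get(a);
        m.put(a, (freq == null) ? 1 : freq + 1);
    }

    int max = -1;
    int mostFrequent = -1;

    // Find the entry with the highest count.
    for (Map.Entry<Integer, Integer> e : m.entrySet()) {
        if (e.getValue() > max) {
            mostFrequent = e.getKey();
            max = e.getValue();
        }
    }

    return mostFrequent;
}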

Answered by Michael Borgwardt

Your first problem is that you have an "array of doubles", because equality is problematic with floating point data (identical numerical values can be represented by different bit patterns, among other things). If your doubles are in fact (as in the example) integers, use int instead. Otherwise, think long and hard about how you define which values are equal for the purpose of representing the same vote.
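
To make the equality concern concrete, here is a minimal sketch. The class name DoubleVotes, the helper toKey and the choice of rounding to a fixed number of decimal places are all assumptions made for illustration, not something the answer prescribes; rounding is only one possible definition of "equal votes".

```java
import java.util.HashMap;
import java.util.Map;

class DoubleVotes {
    // Hypothetical helper: turns a double into an integer key by rounding
    // to a fixed number of decimal places (the precision is an assumption).
    static long toKey(double value, int decimals) {
        return Math.round(value * Math.pow(10, decimals));
    }

    // Counts votes per rounded key instead of per raw double.
    static Map<Long, Integer> countVotes(double[] votes, int decimals) {
        Map<Long, Integer> counts = new HashMap<>();
        for (double v : votes) {
            counts.merge(toKey(v, decimals), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        double a = 0.1 + 0.2;
        double b = 0.3;
        System.out.println(a == b);                     // false
        System.out.println(toKey(a, 6) == toKey(b, 6)); // true
    }
}
```

With keys like these, any of the map-based counting approaches can be applied to the rounded values rather than to the doubles themselves.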

As for determining the majority vote, use a Map with the "vote id" as key and the number of votes as value. Then, at the end, iterate through the map to find the maximal value.

Answered by Paul

Sort the array first with quicksort and then scan and count for a majority: O(n log n) on average. If the range of elements is known ahead of time, say between 1 and k, then a counting sort can be used, which will run in O(n+k).

As a slight improvement, as you are scanning the sorted array, if you find a value that has more than n/2 occurrences, you are done.
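
The sort-then-scan idea could be sketched roughly as follows, assuming int values. The class name SortAndScan and the choice of IllegalStateException for the tie case are assumptions; Arrays.sort is used here in place of a hand-written quicksort.

```java
import java.util.Arrays;

class SortAndScan {
    // Returns the most frequent value; throws if the top count is shared.
    static int mostFrequent(int[] input) {
        if (input.length == 0) {
            throw new IllegalArgumentException("empty array");
        }
        int[] a = input.clone(); // leave the caller's array untouched
        Arrays.sort(a);          // equal values are now adjacent

        int best = a[0];
        int bestCount = 0;
        boolean tie = false;
        int i = 0;
        while (i < a.length) {
            // Measure the run of equal values starting at index i.
            int j = i;
            while (j < a.length && a[j] == a[i]) {
                j++;
            }
            int run = j - i;
            if (run > a.length / 2) {
                return a[i]; // early exit: more than n/2 occurrences
            }
            if (run > bestCount) {
                best = a[i];
                bestCount = run;
                tie = false;
            } else if (run == bestCount) {
                tie = true;
            }
            i = j;
        }
        if (tie) {
            throw new IllegalStateException("no unique most frequent value");
        }
        return best;
    }
}
```

For the array from the question, the run of 10s exceeds n/2, so the scan returns 10 as soon as that run has been measured.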

Answered by Grizzly

With an array of doubles this might not be easy since equality comparisons on doubles are pretty problematic. If you can get away with using integers, you can do something like the following:

    // Requires java.util.HashMap and java.util.Map; Array is the int[] of votes.
    HashMap<Integer, Integer> map = new HashMap<Integer, Integer>();
    for (int element : Array)
    {
        Integer frequency = map.get(element);
        map.put(element, (frequency != null) ? frequency + 1 : 1);
    }
    int mostFrequentItem  = 0;
    int[] maxFrequencies  = new int[2]; // [0] = highest count so far, [1] = the highest before it
    maxFrequencies[0]     = Integer.MIN_VALUE;

    for (Map.Entry<Integer, Integer> entry : map.entrySet())
    {
        if (entry.getValue() >= maxFrequencies[0])
        {
            mostFrequentItem  = entry.getKey();
            maxFrequencies[1] = maxFrequencies[0];
            maxFrequencies[0] = entry.getValue();
        }
    }
    // Equal top two counts mean there is no unique winner.
    if (maxFrequencies[1] == maxFrequencies[0])
        throw new Exception(); // insert whatever exception seems appropriate
    return mostFrequentItem;

This will have O(n) performance, so it should be pretty much optimal asymptotically. If your doubles are not the results of calculations but come from another source, that is, if you can be sure that values which are basically the same will be represented identically, you might get away with using the same method for doubles. However, I would still recommend checking carefully that this is really the case.

Edit: some performance improvements as suggested in the comment as well as supporting checking for ambiguous case

Answered by Stephen C

As @Grizzly points out, doubles are problematic from a computational standpoint. I would also suggest that they don't make sense from the standpoint of your problem domain; doubles don't make any sense with majority voting!

So let's assume that 10 and 6 and so on are integer identifiers for the things people are voting for. Let's also assume that you know that users can vote any value from 0 to 10.

int[] votes = ...
int[] voteCounts = new int[11];  // 11 could be calculated ...
// Tally the votes per identifier.
for (int vote : votes) {
    voteCounts[vote]++;
}
// A strict majority needs more than half of all votes.
int majority = votes.length / 2 + 1;
for (int i = 0; i < voteCounts.length; i++) {
    if (voteCounts[i] >= majority) {
        return i;  // the winner!
    }
}
throw new NoClearMajorityException(...);

This algorithm is O(N) in time and O(M) in space, where M is the largest identifier. The catch is that it only works (as written) if the identifiers are non-negative integers.
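
A self-contained variant of the counting approach might look like the sketch below, where the size of the count array is calculated from the largest identifier instead of being hard-coded. The class name VoteCounter, the method name majorityWinner and the use of IllegalStateException as a stand-in for NoClearMajorityException are assumptions, and "majority" is taken here to mean more than half of all votes.

```java
class VoteCounter {
    // Returns the identifier that received more than half of the votes.
    // Assumes identifiers are non-negative ints.
    static int majorityWinner(int[] votes) {
        // Find the largest identifier so the count array can be sized.
        int maxId = 0;
        for (int vote : votes) {
            if (vote < 0) {
                throw new IllegalArgumentException("negative identifier: " + vote);
            }
            maxId = Math.max(maxId, vote);
        }

        // Tally the votes per identifier.
        int[] voteCounts = new int[maxId + 1];
        for (int vote : votes) {
            voteCounts[vote]++;
        }

        // A strict majority needs more than half of all votes.
        int majority = votes.length / 2 + 1;
        for (int i = 0; i < voteCounts.length; i++) {
            if (voteCounts[i] >= majority) {
                return i;
            }
        }
        // Stand-in for the NoClearMajorityException used in the answer.
        throw new IllegalStateException("no clear majority");
    }
}
```

Sizing the array this way costs one extra pass over the votes and keeps the O(M) space bound.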

Answered by Kametrixom

I just created such a beautiful and small solution with the new Java 8:

import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

public class MostCommonObject {
    public static void main(String[] args) {
        System.out.println(mostCommonObject(new Integer[] { -4, 1, -2, 3, 1, -2, 3, 1 }));
    }

    public static <T> T mostCommonObject(T[] array) {
        return mostCommonObject(Arrays.asList(array));
    }

    public static <T> T mostCommonObject(Collection<T> collection) {
        Map<T, Integer> map = new HashMap<>();
        collection.forEach(t -> map.compute(t, (k, i) -> i == null ? 1 : i + 1));
        return map.entrySet().stream().max((e1, e2) -> Integer.compare(e1.getValue(), e2.getValue())).get().getKey();
    }
}

Answered by Santhanam

Try this one:

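    // Requires java.util.ArrayList, Arrays, Collections, HashMap, HashSet, List, Map and Set.
    Integer[] array = new Integer[]{10, 10, 10, 3, 10, 10, 6, 10, 10, 9, 10};

    List<Integer> demoList = new ArrayList<Integer>(Arrays.asList(array));

    // The distinct values that occur in the list.
    Set<Integer> set = new HashSet<Integer>(demoList);

    // Maps each occurrence count to a value with that count;
    // values that share a count overwrite each other.
    Map<Integer, Integer> myMap = new HashMap<Integer, Integer>();

    for (Integer integer : set)
    {
        int count = Collections.frequency(demoList, integer);
        myMap.put(count, integer);
    }

    // The value stored under the highest count is the most frequent element.
    int mostFrequentValue = myMap.get(Collections.max(myMap.keySet()));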

Answered by Mia Clarke

You could do this: Convert your array to a list and sort it. Pick the first index, and call lastIndexOf(obj) on the value. Do this for each new value you encounter, calculate the range of the value and store the results of the biggest range in a variable.
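
One way to read this suggestion is sketched below. The class name LastIndexScan is an assumption, and so is the tie behaviour: on equal ranges the value that sorts first is kept, since the answer does not say what to do in that case.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class LastIndexScan {
    // Returns the value that covers the longest range in the sorted list.
    static <T extends Comparable<T>> T mostFrequent(List<T> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("empty list");
        }
        List<T> sorted = new ArrayList<>(values); // copy, then sort the copy
        Collections.sort(sorted);

        T best = sorted.get(0);
        int bestRange = 0;
        int start = 0;
        while (start < sorted.size()) {
            T current = sorted.get(start);
            // In a sorted list, all copies of current lie in start..end.
            int end = sorted.lastIndexOf(current);
            int range = end - start + 1;
            if (range > bestRange) {
                best = current;
                bestRange = range;
            }
            start = end + 1; // jump to the next new value
        }
        return best;
    }
}
```

Note that lastIndexOf is itself a linear scan, so with many distinct values this does more work than a single pass that counts adjacent equal elements.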

Answered by Esko

What you really want to do is count the occurrences of certain items in a given set. In fact, this was asked less than a day ago; you might want to look into that very relevant question.
