Notice: this page is adapted from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/7465369/

Time: 2020-10-30 20:05:33  Source: igfitidea

Automatically sorted by values map in Java

Tags: java, data-structures, collections, associative-array, sorted

Asked by Alexandros

I need an automatically sorted-by-values map in Java, so that it stays sorted at all times while I add new key-value pairs, update the value of an existing key-value pair, or even delete an entry.

Please also keep in mind that this map is going to be really big (hundreds of thousands, or even tens of millions of entries).

So basically I'm looking for the following functionality:

Suppose we had a class 'SortedByValuesMap' that implements the aforementioned functionality, and we have the following code:

SortedByValuesMap<String, Long> sorted_map = new SortedByValuesMap<String, Long>();
// Long literals (4L, not 4) are needed because an int does not autobox to Long
sorted_map.put("apples", 4L);
sorted_map.put("oranges", 2L);
sorted_map.put("bananas", 1L);
sorted_map.put("lemons", 3L);
sorted_map.put("bananas", 6L);

for (String key : sorted_map.keySet()) {
  System.out.println(key + ":" + sorted_map.get(key));
}

The output should be:

bananas:6
apples:4
lemons:3
oranges:2

In particular, what's really important for me is to be able to get the entry with the lowest value at any time, using a command like:

smallestItem = sorted_map.lastEntry();

which should give me the 'oranges' entry.

EDIT: I am a Java newbie so please elaborate a bit in your answers - thanks

EDIT2: This might help: I am using this for counting words (for those who are familiar: n-grams in particular) in huge text files. So I need to build a map where keys are words and values are the frequencies of those words. However, due to limitations (like RAM), I want to keep only the X most frequent words, but of course you can't know beforehand which words will be the most frequent. So the way I thought it might work (as an approximation) is to start counting words, and when the map reaches an upper limit (like 1 million entries), delete the least frequent entry so that the map's size always stays at 1 million.

Answered by Mechanical snail

Keep 2 data structures:

  • A dictionary of words -> count. Just use an ordinary HashMap<String, Long>.
  • An "array" to keep track of order, such that list[count] holds a Set<String> of words with that count.

    I'm writing this as though it were an array as a notational convenience. In fact, you probably don't know an upper bound on the number of occurrences, so you need a resizable data structure. Implement it using a Map<Long, Set<String>>. Or, if that uses too much memory, use an ArrayList<Set<String>> (you'll have to test for count == size() - 1, and if so, use add() instead of set(count + 1)).

To increment the number of occurrences for a word:

// assumes data structures are in instance variables:
//   Map<String, Long> dict  and  Map<Long, Set<String>> arr
public void tally(final String word)
{
    final long count = this.dict.getOrDefault(word, 0L);   // 0 if absent
    this.dict.put(word, count + 1);
    // move word up one place in arr
    final Set<String> old = this.arr.get(count);
    if (old != null) old.remove(word);   // This is why we use a Set: for fast deletion here.
    this.arr.computeIfAbsent(count + 1, c -> new HashSet<String>()).add(word);
}

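To iterate over words in order:

// assumes arr is a sorted map such as TreeMap<Long, Set<String>> (an assumption;
// a plain HashMap would not iterate in count order); process() is a placeholder
for (final Map.Entry<Long, Set<String>> bucket : this.arr.entrySet())
    for (final String word : bucket.getValue())
        process(word, bucket.getKey());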

Answered by NiematojakTomasz

How about using an additional index, or only a TreeMap<Long, TreeSet<String>>, or a TreeMap<Long, String> if the Long values are distinct?

You can also write a heap.
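
The additional-index idea might look roughly like the sketch below. It is only an illustration under assumptions: the class name CountIndex and its methods are hypothetical, and it pairs a HashMap (lookup by word) with a TreeMap<Long, TreeSet<String>> (words grouped by count) so the lowest count is always at firstEntry().

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: a HashMap for lookups plus a TreeMap index keyed by count.
class CountIndex {
    private final Map<String, Long> counts = new HashMap<>();
    private final TreeMap<Long, TreeSet<String>> byCount = new TreeMap<>();

    /** Sets the count for a word, moving it between buckets in the index. */
    void put(String word, long count) {
        remove(word);
        counts.put(word, count);
        byCount.computeIfAbsent(count, c -> new TreeSet<String>()).add(word);
    }

    /** Removes a word from both structures; returns its old count, or null. */
    Long remove(String word) {
        Long old = counts.remove(word);
        if (old != null) {
            TreeSet<String> bucket = byCount.get(old);
            bucket.remove(word);
            if (bucket.isEmpty()) byCount.remove(old); // drop empty buckets
        }
        return old;
    }

    /** Returns one word with the lowest count, or null if the index is empty. */
    String lowest() {
        return byCount.isEmpty() ? null : byCount.firstEntry().getValue().first();
    }

    Long get(String word) {
        return counts.get(word);
    }

    public static void main(String[] args) {
        CountIndex index = new CountIndex();
        index.put("apples", 4);
        index.put("oranges", 2);
        index.put("bananas", 1);
        index.put("lemons", 3);
        index.put("bananas", 6);
        System.out.println(index.lowest()); // oranges
    }
}
```

Each put, remove and lowest call costs O(log n) in the TreeMap, at the price of storing every word twice.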

Answered by user3656845

Try the solution posted on http://paaloliver.wordpress.com/2006/01/24/sorting-maps-in-java/. You also have the flexibility of sorting in ascending or descending order.

Here is what they say:

import java.util.Comparator;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class MapValueSort {

    /** inner class to do sorting of the map **/
    private static class ValueComparer implements Comparator<String> {
        private Map<String, String>  _data = null;
        public ValueComparer (Map<String, String> data){
            super();
            _data = data;
        }

         public int compare(String o1, String o2) {
             String e1 = _data.get(o1);
             String e2 = _data.get(o2);
             int byValue = e1.compareTo(e2);
             // break ties by key so keys with equal values are not treated as duplicates
             return byValue != 0 ? byValue : o1.compareTo(o2);
         }
    }

    public static void main(String[] args){

        Map<String, String> unsortedData = new HashMap<String, String>();
        unsortedData.put("2", "DEF");
        unsortedData.put("1", "ABC");
        unsortedData.put("4", "ZXY");
        unsortedData.put("3", "BCD");


        SortedMap<String, String> sortedData = new TreeMap<String, String>(new MapValueSort.ValueComparer(unsortedData));

        printMap(unsortedData);

        sortedData.putAll(unsortedData);
        System.out.println();
        printMap(sortedData);
    }

    private static void printMap(Map<String, String> data) {
        for (Iterator<String> iter = data.keySet().iterator(); iter.hasNext();) {
            String key = (String) iter.next();
            System.out.println("Value/key:"+data.get(key)+"/"+key);
        }
    }

}

Outputs

Value/key:BCD/3
Value/key:DEF/2
Value/key:ABC/1
Value/key:ZXY/4

Value/key:ABC/1
Value/key:BCD/3
Value/key:DEF/2
Value/key:ZXY/4
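
If a one-off sorted view is enough (rather than a map that stays sorted), a shorter route is to copy the entries into a list and sort that. The sketch below assumes Java 8 or later; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: sorts a snapshot of a map's entries by value.
class SortEntriesByValue {
    /** Returns the map's entries as a new list in ascending order of value. */
    static <K, V extends Comparable<? super V>> List<Map.Entry<K, V>> sortedByValue(Map<K, V> map) {
        List<Map.Entry<K, V>> entries = new ArrayList<>(map.entrySet());
        entries.sort(Map.Entry.comparingByValue());
        return entries;
    }

    public static void main(String[] args) {
        Map<String, String> unsortedData = new HashMap<>();
        unsortedData.put("2", "DEF");
        unsortedData.put("1", "ABC");
        unsortedData.put("4", "ZXY");
        unsortedData.put("3", "BCD");
        for (Map.Entry<String, String> e : sortedByValue(unsortedData)) {
            System.out.println("Value/key:" + e.getValue() + "/" + e.getKey());
        }
    }
}
```

The list is a snapshot: later changes to the map are not reflected in it, so it has to be rebuilt after updates.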

Answered by David Bleckmann

I needed a similar structure to keep a list of objects ordered by associated values. Based on the suggestion from Mechanical snail in this thread, I coded up a basic implementation of such a map. Feel free to use it.

import java.util.*;

/**
 * A map where {@link #keySet()} and {@link #entrySet()} return sets ordered
 * with ascending associated values with respect to the comparator provided
 * at construction. The order of two or more keys with identical values is not
 * defined.
 * <p>
 * Several contracts of the Map interface are not satisfied by this minimal
 * implementation.
 */
public class ValueSortedMap<K, V> extends HashMap<K, V> {
    protected Map<V, Collection<K>> valueToKeysMap;

    public ValueSortedMap() {
        this((Comparator<? super V>) null);
    }

    public ValueSortedMap(Comparator<? super V> valueComparator) {
        this.valueToKeysMap = new TreeMap<V, Collection<K>>(valueComparator);
    }

    public boolean containsValue(Object o) {
        return valueToKeysMap.containsKey(o);
    }

    public V put(K k, V v) {
        V oldV = null;
        if (containsKey(k)) {
            oldV = get(k);
            valueToKeysMap.get(oldV).remove(k);
        }
        super.put(k, v);
        if (!valueToKeysMap.containsKey(v)) {
            Collection<K> keys = new ArrayList<K>();
            keys.add(k);
            valueToKeysMap.put(v, keys);
        } else {
            valueToKeysMap.get(v).add(k);
        }
        return oldV;
    }

    public void putAll(Map<? extends K, ? extends V> m) {
        for (Map.Entry<? extends K, ? extends V> e : m.entrySet())
            put(e.getKey(), e.getValue());
    }

    public V remove(Object k) {
        V oldV = null;
        if (containsKey(k)) {
            oldV = get(k);
            super.remove(k);
            valueToKeysMap.get(oldV).remove(k);
        }
        return oldV;
    }

    public void clear() {
        super.clear();
        valueToKeysMap.clear();
    }

    public Set<K> keySet() {
        LinkedHashSet<K> ret = new LinkedHashSet<K>(size());
        for (V v : valueToKeysMap.keySet()) {
            Collection<K> keys = valueToKeysMap.get(v);
            ret.addAll(keys);
        }
        return ret;
    }

    public Set<Map.Entry<K, V>> entrySet() {
        LinkedHashSet<Map.Entry<K, V>> ret = new LinkedHashSet<Map.Entry<K, V>>(size());
        for (Collection<K> keys : valueToKeysMap.values()) {
            for (final K k : keys) {
                final V v = get(k);
                ret.add(new Map.Entry<K,V>() {
                    public K getKey() {
                        return k;
                    }

                    public V getValue() {
                        return v;
                    }

                    public V setValue(V v) {
                        throw new UnsupportedOperationException();
                    }
                });
            }
        }
        return ret;
    }
}

This implementation does not honor all the contracts of the Map interface, such as reflecting in the actual map any value changes and removals made through the returned key set and entry set, but such a solution would be a bit large to include in a forum like this. Perhaps I will work on one and make it available via GitHub or something similar.

Answered by u290629

Guava BiMap solution:

//Prepare original data
BiMap<String, Integer> biMap = HashBiMap.create();
biMap.put("apples" , 4);
biMap.put("oranges", 2);
biMap.put("bananas", 1);
biMap.put("lemons" , 3);
biMap.put("bananas", 6);

//Create a desc order SortedMap
SortedMap<Integer, String> sortedMap = new TreeMap<Integer, String>(new Comparator<Integer>(){
    @Override public int compare(Integer o1, Integer o2) {
      return Integer.compare(o2, o1); // descending; avoids the overflow risk of o2-o1
}});

//Put inversed map
sortedMap.putAll(biMap.inverse());
for (Map.Entry<Integer, String> e: sortedMap.entrySet()) {
      System.out.println(e);
}
System.out.println(sortedMap.lastKey()); 

Answered by DeepNightTwo

You may refer to the implementation of java.util.LinkedHashMap. The basic idea is to use an internal linked list to store the order. Here are some details:

Extend HashMap. In HashMap, each entry has a key and a value; that is the basic part. You can add a next and a prev pointer to each entry to keep the entries in order by value, plus a header and a tail pointer to get the first and last entry. For every modification (add, remove, update), you can add your own code to update the list order. It is no more than a linear search and a pointer switch.

Sure, add/update will be slow if there are too many entries, because it is a linked list and not an array. But as long as the list is sorted, I believe there are lots of ways to speed up the search.

So here is what you get: a map that has the same speed as HashMap when retrieving an entry by key, and a linked list that stores the entries in order.

We can discuss this further if this solution meets your requirement.

To jtahlborn: As I said, it surely is slow without any optimization. Since we are now talking about performance rather than implementation, lots of things can be done.

One solution is to use a tree, such as a red-black tree, instead of a linked list. Then iterate over the tree instead of iterating over the map.

Getting the smallest value is easier. Just use a member variable to store the smallest value, and update it whenever an element is added or updated. On delete, search the tree for the smallest (this is very fast).

If a tree is too complex, it is also possible to use another list/array to mark some positions in the list, for example one marker per 100 elements. Then, when searching, search the position list first and then the real list. This position list also needs to be maintained; it would be reasonable to rebuild it after a certain number of modifications, maybe 100.

Answered by Michał Šrajer

Update: You cannot sort maps by values, sorry.

You can use a SortedMap implementation like TreeMap with a Comparator defining the order by values (instead of the default, by keys).

Or, even better, you can put the elements into a PriorityQueue with a predefined comparator by values. It should be faster and take less memory compared to TreeMap.
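
A minimal sketch of the PriorityQueue idea follows, assuming (word, count) pairs are stored as Map.Entry objects; the wrapper class LowestCountQueue and its method names are hypothetical. With a by-value comparator, the head of the queue is always the entry with the lowest value.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper: a min-heap of (word, count) entries ordered by count.
class LowestCountQueue {
    private final PriorityQueue<Map.Entry<String, Long>> heap =
        new PriorityQueue<>(Map.Entry.<String, Long>comparingByValue());

    void add(String word, long count) {
        heap.add(new SimpleImmutableEntry<String, Long>(word, count));
    }

    /** Returns the word with the lowest count without removing it, or null if empty. */
    String peekLowest() {
        Map.Entry<String, Long> e = heap.peek();
        return e == null ? null : e.getKey();
    }

    /** Removes and returns the word with the lowest count, or null if empty. */
    String pollLowest() {
        Map.Entry<String, Long> e = heap.poll();
        return e == null ? null : e.getKey();
    }

    public static void main(String[] args) {
        LowestCountQueue queue = new LowestCountQueue();
        queue.add("apples", 4);
        queue.add("oranges", 2);
        queue.add("lemons", 3);
        queue.add("bananas", 6);
        System.out.println(queue.peekLowest()); // oranges
    }
}
```

Two caveats apply: iterating a PriorityQueue does not visit elements in sorted order (only peek and poll respect it), and changing a word's count means removing its old entry, which is a linear-time operation, and adding a new one.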

Answered by jtahlborn

If all you need is the "min" value, then just use a normal map and keep track of the "min" value any time it is modified.

EDIT:

So, if you really need value ordering and you want to use out-of-the-box solutions, you basically need 2 collections: one normal map (e.g. HashMap) and one SortedSet (e.g. TreeSet). You can traverse the ordered elements via the TreeSet, and find frequencies by key using the HashMap.
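
One way the two-collection approach could be wired together is sketched below. The element type of the TreeSet is not spelled out above, so this assumes immutable (word, count) entries ordered by count and then by word; the class WordCounts and its methods are illustrative names only.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch: HashMap for lookup by word, TreeSet for order by count.
class WordCounts {
    private final Map<String, Long> counts = new HashMap<>();
    // Ordered by count, then by word, so equal counts do not collide in the set.
    private final TreeSet<Map.Entry<String, Long>> ordered = new TreeSet<>(
        Map.Entry.<String, Long>comparingByValue()
            .thenComparing(Map.Entry.<String, Long>comparingByKey()));

    /** Adds one occurrence of the word, keeping both collections in sync. */
    void increment(String word) {
        Long old = counts.get(word);
        if (old != null) {
            // the old entry must leave the set before its count changes
            ordered.remove(new SimpleImmutableEntry<String, Long>(word, old));
        }
        long updated = (old == null ? 0L : old) + 1;
        counts.put(word, updated);
        ordered.add(new SimpleImmutableEntry<String, Long>(word, updated));
    }

    /** Returns the least frequent word, or null if nothing has been counted. */
    String lowestWord() {
        return ordered.isEmpty() ? null : ordered.first().getKey();
    }

    /** Drops the least frequent word, e.g. when a size limit is reached. */
    void evictLowest() {
        Map.Entry<String, Long> lowest = ordered.pollFirst();
        if (lowest != null) counts.remove(lowest.getKey());
    }

    int size() {
        return counts.size();
    }

    public static void main(String[] args) {
        WordCounts wc = new WordCounts();
        for (String w : new String[] {"apples", "oranges", "apples", "lemons", "lemons", "apples"}) {
            wc.increment(w);
        }
        System.out.println(wc.lowestWord()); // oranges
    }
}
```

Calling ordered.descendingSet() would give the highest-first iteration order shown in the question.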

Obviously, you could always code up something yourself, sort of like a LinkedHashMap, where the elements are locatable by key and traversable in order, but that's pretty much going to be entirely custom code (I doubt anything that specific already exists, but I could be wrong).