java ConcurrentHashMap：避免使用“putIfAbsent”创建额外的对象？

Question

提问by Gene Golovchinsky

I am aggregating multiple values for keys in a multi-threaded environment. The keys are not known in advance. I thought I would do something like this:

我正在多线程环境中为键聚合多个值。密钥是事先不知道的。我以为我会做这样的事情：

class Aggregator {
    protected ConcurrentHashMap<String, List<String>> entries =
                            new ConcurrentHashMap<String, List<String>>();
    public Aggregator() {}

    public void record(String key, String value) {
        List<String> newList =
                    Collections.synchronizedList(new ArrayList<String>());
        List<String> existingList = entries.putIfAbsent(key, newList);
        List<String> values = existingList == null ? newList : existingList;
        values.add(value);
    }
}

The problem I see is that every time this method runs, I need to create a new instance of an ArrayList, which I then throw away (in most cases). This seems like unjustified abuse of the garbage collector. Is there a better, thread-safe way of initializing this kind of a structure without having to synchronizethe recordmethod? I am somewhat surprised by the decision to have the putIfAbsentmethod not return the newly-created element, and by the lack of a way to defer instantiation unless it is called for (so to speak).

我看到的问题是，每次运行此方法时，我都需要创建一个的新实例，ArrayList然后将其丢弃（在大多数情况下）。这似乎是对垃圾收集器的无理滥用。有没有更好的、线程安全的方法来初始化这种结构而不必使用synchronize该record方法？我对让putIfAbsent方法不返回新创建的元素的决定感到有些惊讶，并且缺乏一种方法来推迟实例化，除非它被调用（可以这么说）。

Answer 1

回答by Bohemian

Java 8 introduced an API to cater for this exact problem, making a 1-line solution:

Java 8 引入了一个 API 来解决这个确切的问题，提出了一个 1-line 解决方案：

public void record(String key, String value) {
    entries.computeIfAbsent(key, k -> Collections.synchronizedList(new ArrayList<String>())).add(value);
}

For Java 7:

对于 Java 7：

public void record(String key, String value) {
    List<String> values = entries.get(key);
    if (values == null) {
        entries.putIfAbsent(key, Collections.synchronizedList(new ArrayList<String>()));
        // At this point, there will definitely be a list for the key.
        // We don't know or care which thread's new object is in there, so:
        values = entries.get(key);
    }
    values.add(value);
}

This is the standard code pattern when populating a ConcurrentHashMap.

这是填充ConcurrentHashMap.

The special method putIfAbsent(K, V))will either put your value object in, or if another thread got before you, then it will ignore your value object. Either way, after the call to putIfAbsent(K, V)), get(key)is guaranteed to be consistent between threads and therefore the above code is threadsafe.

特殊方法putIfAbsent(K, V))要么将您的值对象放入，要么如果另一个线程在您之前，则它将忽略您的值对象。无论哪种方式，在调用之后putIfAbsent(K, V))，get(key)都可以保证线程之间的一致性，因此上面的代码是线程安全的。

The only wasted overhead is if some other thread adds a new entry at the same time for the same key: You mayend up throwing away the newly created value, but that only happens if there is not already an entry andthere's a race that your thread loses, which would typically be rare.

唯一浪费的开销是，如果某个其他线程同时为同一个键添加了一个新条目：您最终可能会丢弃新创建的值，但只有在还没有条目并且您的竞争中存在竞争时才会发生这种情况线程丢失，这通常很少见。

Answer 2

回答by Peter

As of Java-8 you can create Multi Maps using the following pattern:

从 Java-8 开始，您可以使用以下模式创建多映射：

public void record(String key, String value) { entries.computeIfAbsent(key, k -> Collections.synchronizedList(new ArrayList<String>())) .add(value); }

The ConcurrentHashMap documentation (not the general contract) specifies that the ArrayList will only be created once for each key, at the slight initial cost of delaying updates while the ArrayList is being created for a new key:

ConcurrentHashMap 文档（不是一般合同）指定 ArrayList 只会为每个键创建一次，在为新键创建 ArrayList 时延迟更新的初始成本很小：

http://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentHashMap.html#computeIfAbsent-K-java.util.function.Function-

Answer 3

回答by Gene Golovchinsky

In the end, I implemented a slight modification of @Bohemian's answer. His proposed solution overwrites the valuesvariable with the putIfAbsentcall, which creates the same problem I had before. The code that seems to work looks like this:

最后，我对@Bohemian 的回答稍作修改。他提出的解决方案values用putIfAbsent调用覆盖了变量，这产生了我以前遇到的同样问题。似乎有效的代码如下所示：

    public void record(String key, String value) {
        List<String> values = entries.get(key);
        if (values == null) {
            values = Collections.synchronizedList(new ArrayList<String>());
            List<String> values2 = entries.putIfAbsent(key, values);
            if (values2 != null)
                values = values2;
        }
        values.add(value);
    }

It's not as elegant as I'd like, but it's better than the original that creates a new ArrayListinstance at every call.

它不像我想要的那么优雅，但它比ArrayList每次调用都创建一个新实例的原始方法要好。

Answer 4

回答by fracca

Created two versions based on Gene's answer

根据 Gene 的回答创建了两个版本

public  static <K,V> void putIfAbsetMultiValue(ConcurrentHashMap<K,List<V>> entries, K key, V value) {
    List<V> values = entries.get(key);
    if (values == null) {
        values = Collections.synchronizedList(new ArrayList<V>());
        List<V> values2 = entries.putIfAbsent(key, values);
        if (values2 != null)
            values = values2;
    }
    values.add(value);
}

public  static <K,V> void putIfAbsetMultiValueSet(ConcurrentMap<K,Set<V>> entries, K key, V value) {
    Set<V> values = entries.get(key);
    if (values == null) {
        values = Collections.synchronizedSet(new HashSet<V>());
        Set<V> values2 = entries.putIfAbsent(key, values);
        if (values2 != null)
            values = values2;
    }
    values.add(value);
}

It works well

它运作良好

Answer 5

回答by Utku ?zdemir

This is a problem I also looked for an answer. The method putIfAbsentdoes not actually solve the extra object creation problem, it just makes sure that one of those objects doesn't replace another. But the race conditions among threads can cause multiple object instantiation. I could find 3 solutions for this problem (And I would follow this order of preference):

这是一个问题，我也在寻找答案。该方法putIfAbsent实际上并没有解决额外的对象创建问题，它只是确保这些对象之一不会替换另一个。但是线程之间的竞争条件会导致多个对象实例化。我可以为这个问题找到 3 个解决方案（我会遵循这个优先顺序）：

1- If you are on Java 8, the best way to achieve this is probably the new computeIfAbsentmethod of ConcurrentMap. You just need to give it a computation function which will be executed synchronously (at least for the ConcurrentHashMapimplementation). Example:

1- 如果您使用的是 Java 8，那么实现这一目标的最佳方法可能computeIfAbsent是ConcurrentMap.你只需要给它一个将同步执行的计算函数（至少对于ConcurrentHashMap实现而言）。例子：

private final ConcurrentMap<String, List<String>> entries =
        new ConcurrentHashMap<String, List<String>>();

public void method1(String key, String value) {
    entries.computeIfAbsent(key, s -> new ArrayList<String>())
            .add(value);
}

This is from the javadoc of ConcurrentHashMap.computeIfAbsent:

这是来自的javadoc ConcurrentHashMap.computeIfAbsent：

If the specified key is not already associated with a value, attempts to compute its value using the given mapping function and enters it into this map unless null. The entire method invocation is performed atomically, so the function is applied at most once per key. Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this map.

如果指定的键尚未与值关联，则尝试使用给定的映射函数计算其值并将其输入此映射，除非为空。整个方法调用以原子方式执行，因此每个键最多应用一次该函数。其他线程在此映射上尝试的一些更新操作可能会在计算过程中被阻塞，因此计算应该简短且简单，并且不得尝试更新此映射的任何其他映射。

2- If you cannot use Java 8, you can use Guava's LoadingCache, which is thread-safe. You define a load function to it (just like the computefunction above), and you can be sure that it'll be called synchronously. Example:

2- 如果您不能使用 Java 8，您可以使用Guava's LoadingCache，这是线程安全的。你为它定义了一个加载函数（就像compute上面的函数一样），你可以确定它会被同步调用。例子：

private final LoadingCache<String, List<String>> entries = CacheBuilder.newBuilder()
        .build(new CacheLoader<String, List<String>>() {
            @Override
            public List<String> load(String s) throws Exception {
                return new ArrayList<String>();
            }
        });

public void method2(String key, String value) {
    entries.getUnchecked(key).add(value);
}

3- If you cannot use Guava either, you can always synchronise manually and do a double-checked locking. Example:

3- 如果您也不能使用 Guava，您可以随时手动同步并进行双重检查锁定。例子：

private final ConcurrentMap<String, List<String>> entries =
        new ConcurrentHashMap<String, List<String>>();

public void method3(String key, String value) {
    List<String> existing = entries.get(key);
    if (existing != null) {
        existing.add(value);
    } else {
        synchronized (entries) {
            List<String> existingSynchronized = entries.get(key);
            if (existingSynchronized != null) {
                existingSynchronized.add(value);
            } else {
                List<String> newList = new ArrayList<>();
                newList.add(value);
                entries.put(key, newList);
            }
        }
    }
}

I made an example implementation of all those 3 methods and additionally, the non-synchronized method, which causes extra object creation: http://pastebin.com/qZ4DUjTr

我对所有这 3 种方法以及非同步方法做了一个示例实现，这会导致额外的对象创建：http: //pastebin.com/qZ4DUjTr

Answer 6

回答by Krzysztof Cichocki

The approach with putIfAbsenthas the fastest execution time, it is from 2 to 50 times faster than the "lambda" approach in evironments with high contention. The Lambda isn't the reason behind this "powerloss", the issue is the compulsory synchronisation inside of computeIfAbsent prior to the Java-9 optimisations.

方法putIfAbsent具有最快的执行时间，在高竞争环境中比“lambda”方法快 2 到 50 倍。Lambda 不是这种“功率损失”背后的原因，问题是在 Java-9 优化之前，computeIfAbsent 内部的强制同步。

the benchmark:

基准：

import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

public class ConcurrentHashMapTest {
    private final static int numberOfRuns = 1000000;
    private final static int numberOfThreads = Runtime.getRuntime().availableProcessors();
    private final static int keysSize = 10;
    private final static String[] strings = new String[keysSize];
    static {
        for (int n = 0; n < keysSize; n++) {
            strings[n] = "" + (char) ('A' + n);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        for (int n = 0; n < 20; n++) {
            testPutIfAbsent();
            testComputeIfAbsentLamda();
        }
    }

    private static void testPutIfAbsent() throws InterruptedException {
        final AtomicLong totalTime = new AtomicLong();
        final ConcurrentHashMap<String, AtomicInteger> map = new ConcurrentHashMap<String, AtomicInteger>();
        final Random random = new Random();
        ExecutorService executorService = Executors.newFixedThreadPool(numberOfThreads);

        for (int i = 0; i < numberOfThreads; i++) {
            executorService.execute(new Runnable() {
                @Override
                public void run() {
                    long start, end;
                    for (int n = 0; n < numberOfRuns; n++) {
                        String s = strings[random.nextInt(strings.length)];
                        start = System.nanoTime();

                        AtomicInteger count = map.get(s);
                        if (count == null) {
                            count = new AtomicInteger(0);
                            AtomicInteger prevCount = map.putIfAbsent(s, count);
                            if (prevCount != null) {
                                count = prevCount;
                            }
                        }
                        count.incrementAndGet();
                        end = System.nanoTime();
                        totalTime.addAndGet(end - start);
                    }
                }
            });
        }
        executorService.shutdown();
        executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
        System.out.println("Test " + Thread.currentThread().getStackTrace()[1].getMethodName()
                + " average time per run: " + (double) totalTime.get() / numberOfThreads / numberOfRuns + " ns");
    }

    private static void testComputeIfAbsentLamda() throws InterruptedException {
        final AtomicLong totalTime = new AtomicLong();
        final ConcurrentHashMap<String, AtomicInteger> map = new ConcurrentHashMap<String, AtomicInteger>();
        final Random random = new Random();
        ExecutorService executorService = Executors.newFixedThreadPool(numberOfThreads);
        for (int i = 0; i < numberOfThreads; i++) {
            executorService.execute(new Runnable() {
                @Override
                public void run() {
                    long start, end;
                    for (int n = 0; n < numberOfRuns; n++) {
                        String s = strings[random.nextInt(strings.length)];
                        start = System.nanoTime();

                        AtomicInteger count = map.computeIfAbsent(s, (k) -> new AtomicInteger(0));
                        count.incrementAndGet();

                        end = System.nanoTime();
                        totalTime.addAndGet(end - start);
                    }
                }
            });
        }
        executorService.shutdown();
        executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
        System.out.println("Test " + Thread.currentThread().getStackTrace()[1].getMethodName()
                + " average time per run: " + (double) totalTime.get() / numberOfThreads / numberOfRuns + " ns");
    }

}

The results:

结果：

Test testPutIfAbsent average time per run: 115.756501 ns
Test testComputeIfAbsentLamda average time per run: 276.9667055 ns
Test testPutIfAbsent average time per run: 134.2332435 ns
Test testComputeIfAbsentLamda average time per run: 223.222063625 ns
Test testPutIfAbsent average time per run: 119.968893625 ns
Test testComputeIfAbsentLamda average time per run: 216.707419875 ns
Test testPutIfAbsent average time per run: 116.173902375 ns
Test testComputeIfAbsentLamda average time per run: 215.632467375 ns
Test testPutIfAbsent average time per run: 112.21422775 ns
Test testComputeIfAbsentLamda average time per run: 210.29563725 ns
Test testPutIfAbsent average time per run: 120.50643475 ns
Test testComputeIfAbsentLamda average time per run: 200.79536475 ns

Answer 7

回答by Erdin? Ta?k?n

Waste of memory (also GC etc.) that Empty Array list creation problem is handled with Java 1.7.40. Don't worry about creating empty arraylist. Reference : http://javarevisited.blogspot.com.tr/2014/07/java-optimization-empty-arraylist-and-Hashmap-cost-less-memory-jdk-17040-update.html

使用 Java 1.7.40 处理空数组列表创建问题的内存浪费（还有 GC 等）。不要担心创建空的数组列表。参考：http: //javarevisited.blogspot.com.tr/2014/07/java-optimization-empty-arraylist-and-Hashmap-cost-less-memory-jdk-17040-update.html

java ConcurrentHashMap：避免使用“putIfAbsent”创建额外的对象？

提问by Gene Golovchinsky

回答by Bohemian

回答by Peter

回答by Gene Golovchinsky

回答by fracca

回答by Utku ?zdemir

回答by Krzysztof Cichocki

回答by Erdin? Ta?k?n

相关推荐

最近更新

标签

java ConcurrentHashMap：避免使用“putIfAbsent”创建额外的对象？

提问by Gene Golovchinsky

回答by Bohemian

回答by Peter

回答by Gene Golovchinsky

回答by fracca

回答by Utku ?zdemir

回答by Krzysztof Cichocki

回答by Erdin? Ta?k?n

相关推荐

java 在Java中显示数组中的第二大数字

java 如何将文档侦听器添加到面板内的 JTextFields？

java 使用带有花括号“{”的 split() 时出错

java 从原始数据的字节数组中获取缓冲图像

相关推荐

最近更新

标签