java ConcurrentHashMap:避免使用“putIfAbsent”创建额外的对象?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/10743622/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
ConcurrentHashMap: avoid extra object creation with "putIfAbsent"?
提问by Gene Golovchinsky
I am aggregating multiple values for keys in a multi-threaded environment. The keys are not known in advance. I thought I would do something like this:
我正在多线程环境中为键聚合多个值。密钥是事先不知道的。我以为我会做这样的事情:
class Aggregator {
protected ConcurrentHashMap<String, List<String>> entries =
new ConcurrentHashMap<String, List<String>>();
public Aggregator() {}
public void record(String key, String value) {
List<String> newList =
Collections.synchronizedList(new ArrayList<String>());
List<String> existingList = entries.putIfAbsent(key, newList);
List<String> values = existingList == null ? newList : existingList;
values.add(value);
}
}
The problem I see is that every time this method runs, I need to create a new instance of an ArrayList
, which I then throw away (in most cases). This seems like unjustified abuse of the garbage collector. Is there a better, thread-safe way of initializing this kind of a structure without having to synchronize
the record
method? I am somewhat surprised by the decision to have the putIfAbsent
method not return the newly-created element, and by the lack of a way to defer instantiation unless it is called for (so to speak).
我看到的问题是,每次运行此方法时,我都需要创建一个 的新实例,ArrayList
然后将其丢弃(在大多数情况下)。这似乎是对垃圾收集器的无理滥用。有没有更好的、线程安全的方法来初始化这种结构而不必使用synchronize
该record
方法?我对让putIfAbsent
方法不返回新创建的元素的决定感到有些惊讶,并且缺乏一种方法来推迟实例化,除非它被调用(可以这么说)。
回答by Bohemian
Java 8 introduced an API to cater for this exact problem, making a 1-line solution:
Java 8 引入了一个 API 来解决这个确切的问题,提出了一个 1-line 解决方案:
public void record(String key, String value) {
entries.computeIfAbsent(key, k -> Collections.synchronizedList(new ArrayList<String>())).add(value);
}
For Java 7:
对于 Java 7:
public void record(String key, String value) {
List<String> values = entries.get(key);
if (values == null) {
entries.putIfAbsent(key, Collections.synchronizedList(new ArrayList<String>()));
// At this point, there will definitely be a list for the key.
// We don't know or care which thread's new object is in there, so:
values = entries.get(key);
}
values.add(value);
}
This is the standard code pattern when populating a ConcurrentHashMap
.
这是填充ConcurrentHashMap
.
The special method putIfAbsent(K, V))
will either put your value object in, or if another thread got before you, then it will ignore your value object. Either way, after the call to putIfAbsent(K, V))
, get(key)
is guaranteed to be consistent between threads and therefore the above code is threadsafe.
特殊方法putIfAbsent(K, V))
要么将您的值对象放入,要么如果另一个线程在您之前,则它将忽略您的值对象。无论哪种方式,在调用 之后putIfAbsent(K, V))
,get(key)
都可以保证线程之间的一致性,因此上面的代码是线程安全的。
The only wasted overhead is if some other thread adds a new entry at the same time for the same key: You mayend up throwing away the newly created value, but that only happens if there is not already an entry andthere's a race that your thread loses, which would typically be rare.
唯一浪费的开销是,如果某个其他线程同时为同一个键添加了一个新条目:您最终可能会丢弃新创建的值,但只有在还没有条目并且您的竞争中存在竞争时才会发生这种情况线程丢失,这通常很少见。
回答by Peter
As of Java-8 you can create Multi Maps using the following pattern:
从 Java-8 开始,您可以使用以下模式创建多映射:
public void record(String key, String value) {
entries.computeIfAbsent(key,
k -> Collections.synchronizedList(new ArrayList<String>()))
.add(value);
}
public void record(String key, String value) {
entries.computeIfAbsent(key,
k -> Collections.synchronizedList(new ArrayList<String>()))
.add(value);
}
The ConcurrentHashMap documentation (not the general contract) specifies that the ArrayList will only be created once for each key, at the slight initial cost of delaying updates while the ArrayList is being created for a new key:
ConcurrentHashMap 文档(不是一般合同)指定 ArrayList 只会为每个键创建一次,在为新键创建 ArrayList 时延迟更新的初始成本很小:
回答by Gene Golovchinsky
In the end, I implemented a slight modification of @Bohemian's answer. His proposed solution overwrites the values
variable with the putIfAbsent
call, which creates the same problem I had before. The code that seems to work looks like this:
最后,我对@Bohemian 的回答稍作修改。他提出的解决方案values
用putIfAbsent
调用覆盖了变量,这产生了我以前遇到的同样问题。似乎有效的代码如下所示:
public void record(String key, String value) {
List<String> values = entries.get(key);
if (values == null) {
values = Collections.synchronizedList(new ArrayList<String>());
List<String> values2 = entries.putIfAbsent(key, values);
if (values2 != null)
values = values2;
}
values.add(value);
}
It's not as elegant as I'd like, but it's better than the original that creates a new ArrayList
instance at every call.
它不像我想要的那么优雅,但它比ArrayList
每次调用都创建一个新实例的原始方法要好。
回答by fracca
Created two versions based on Gene's answer
根据 Gene 的回答创建了两个版本
public static <K,V> void putIfAbsetMultiValue(ConcurrentHashMap<K,List<V>> entries, K key, V value) {
List<V> values = entries.get(key);
if (values == null) {
values = Collections.synchronizedList(new ArrayList<V>());
List<V> values2 = entries.putIfAbsent(key, values);
if (values2 != null)
values = values2;
}
values.add(value);
}
public static <K,V> void putIfAbsetMultiValueSet(ConcurrentMap<K,Set<V>> entries, K key, V value) {
Set<V> values = entries.get(key);
if (values == null) {
values = Collections.synchronizedSet(new HashSet<V>());
Set<V> values2 = entries.putIfAbsent(key, values);
if (values2 != null)
values = values2;
}
values.add(value);
}
It works well
它运作良好
回答by Utku ?zdemir
This is a problem I also looked for an answer. The method putIfAbsent
does not actually solve the extra object creation problem, it just makes sure that one of those objects doesn't replace another. But the race conditions among threads can cause multiple object instantiation. I could find 3 solutions for this problem (And I would follow this order of preference):
这是一个问题,我也在寻找答案。该方法putIfAbsent
实际上并没有解决额外的对象创建问题,它只是确保这些对象之一不会替换另一个。但是线程之间的竞争条件会导致多个对象实例化。我可以为这个问题找到 3 个解决方案(我会遵循这个优先顺序):
1- If you are on Java 8, the best way to achieve this is probably the new computeIfAbsent
method of ConcurrentMap
. You just need to give it a computation function which will be executed synchronously (at least for the ConcurrentHashMap
implementation). Example:
1- 如果您使用的是 Java 8,那么实现这一目标的最佳方法可能computeIfAbsent
是ConcurrentMap
.你只需要给它一个将同步执行的计算函数(至少对于ConcurrentHashMap
实现而言)。例子:
private final ConcurrentMap<String, List<String>> entries =
new ConcurrentHashMap<String, List<String>>();
public void method1(String key, String value) {
entries.computeIfAbsent(key, s -> new ArrayList<String>())
.add(value);
}
This is from the javadoc of ConcurrentHashMap.computeIfAbsent
:
这是来自的javadoc ConcurrentHashMap.computeIfAbsent
:
If the specified key is not already associated with a value, attempts to compute its value using the given mapping function and enters it into this map unless null. The entire method invocation is performed atomically, so the function is applied at most once per key. Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this map.
如果指定的键尚未与值关联,则尝试使用给定的映射函数计算其值并将其输入此映射,除非为空。整个方法调用以原子方式执行,因此每个键最多应用一次该函数。其他线程在此映射上尝试的一些更新操作可能会在计算过程中被阻塞,因此计算应该简短且简单,并且不得尝试更新此映射的任何其他映射。
2- If you cannot use Java 8, you can use Guava
's LoadingCache
, which is thread-safe. You define a load function to it (just like the compute
function above), and you can be sure that it'll be called synchronously. Example:
2- 如果您不能使用 Java 8,您可以使用Guava
's LoadingCache
,这是线程安全的。你为它定义了一个加载函数(就像compute
上面的函数一样),你可以确定它会被同步调用。例子:
private final LoadingCache<String, List<String>> entries = CacheBuilder.newBuilder()
.build(new CacheLoader<String, List<String>>() {
@Override
public List<String> load(String s) throws Exception {
return new ArrayList<String>();
}
});
public void method2(String key, String value) {
entries.getUnchecked(key).add(value);
}
3- If you cannot use Guava either, you can always synchronise manually and do a double-checked locking. Example:
3- 如果您也不能使用 Guava,您可以随时手动同步并进行双重检查锁定。例子:
private final ConcurrentMap<String, List<String>> entries =
new ConcurrentHashMap<String, List<String>>();
public void method3(String key, String value) {
List<String> existing = entries.get(key);
if (existing != null) {
existing.add(value);
} else {
synchronized (entries) {
List<String> existingSynchronized = entries.get(key);
if (existingSynchronized != null) {
existingSynchronized.add(value);
} else {
List<String> newList = new ArrayList<>();
newList.add(value);
entries.put(key, newList);
}
}
}
}
I made an example implementation of all those 3 methods and additionally, the non-synchronized method, which causes extra object creation: http://pastebin.com/qZ4DUjTr
我对所有这 3 种方法以及非同步方法做了一个示例实现,这会导致额外的对象创建:http: //pastebin.com/qZ4DUjTr
回答by Krzysztof Cichocki
The approach with putIfAbsent
has the fastest execution time, it is from 2 to 50 times faster than the "lambda" approach in evironments with high contention. The Lambda isn't the reason behind this "powerloss", the issue is the compulsory synchronisation inside of computeIfAbsent prior to the Java-9 optimisations.
方法putIfAbsent
具有最快的执行时间,在高竞争环境中比“lambda”方法快 2 到 50 倍。Lambda 不是这种“功率损失”背后的原因,问题是在 Java-9 优化之前,computeIfAbsent 内部的强制同步。
the benchmark:
基准:
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
public class ConcurrentHashMapTest {
private final static int numberOfRuns = 1000000;
private final static int numberOfThreads = Runtime.getRuntime().availableProcessors();
private final static int keysSize = 10;
private final static String[] strings = new String[keysSize];
static {
for (int n = 0; n < keysSize; n++) {
strings[n] = "" + (char) ('A' + n);
}
}
public static void main(String[] args) throws InterruptedException {
for (int n = 0; n < 20; n++) {
testPutIfAbsent();
testComputeIfAbsentLamda();
}
}
private static void testPutIfAbsent() throws InterruptedException {
final AtomicLong totalTime = new AtomicLong();
final ConcurrentHashMap<String, AtomicInteger> map = new ConcurrentHashMap<String, AtomicInteger>();
final Random random = new Random();
ExecutorService executorService = Executors.newFixedThreadPool(numberOfThreads);
for (int i = 0; i < numberOfThreads; i++) {
executorService.execute(new Runnable() {
@Override
public void run() {
long start, end;
for (int n = 0; n < numberOfRuns; n++) {
String s = strings[random.nextInt(strings.length)];
start = System.nanoTime();
AtomicInteger count = map.get(s);
if (count == null) {
count = new AtomicInteger(0);
AtomicInteger prevCount = map.putIfAbsent(s, count);
if (prevCount != null) {
count = prevCount;
}
}
count.incrementAndGet();
end = System.nanoTime();
totalTime.addAndGet(end - start);
}
}
});
}
executorService.shutdown();
executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
System.out.println("Test " + Thread.currentThread().getStackTrace()[1].getMethodName()
+ " average time per run: " + (double) totalTime.get() / numberOfThreads / numberOfRuns + " ns");
}
private static void testComputeIfAbsentLamda() throws InterruptedException {
final AtomicLong totalTime = new AtomicLong();
final ConcurrentHashMap<String, AtomicInteger> map = new ConcurrentHashMap<String, AtomicInteger>();
final Random random = new Random();
ExecutorService executorService = Executors.newFixedThreadPool(numberOfThreads);
for (int i = 0; i < numberOfThreads; i++) {
executorService.execute(new Runnable() {
@Override
public void run() {
long start, end;
for (int n = 0; n < numberOfRuns; n++) {
String s = strings[random.nextInt(strings.length)];
start = System.nanoTime();
AtomicInteger count = map.computeIfAbsent(s, (k) -> new AtomicInteger(0));
count.incrementAndGet();
end = System.nanoTime();
totalTime.addAndGet(end - start);
}
}
});
}
executorService.shutdown();
executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
System.out.println("Test " + Thread.currentThread().getStackTrace()[1].getMethodName()
+ " average time per run: " + (double) totalTime.get() / numberOfThreads / numberOfRuns + " ns");
}
}
The results:
结果:
Test testPutIfAbsent average time per run: 115.756501 ns
Test testComputeIfAbsentLamda average time per run: 276.9667055 ns
Test testPutIfAbsent average time per run: 134.2332435 ns
Test testComputeIfAbsentLamda average time per run: 223.222063625 ns
Test testPutIfAbsent average time per run: 119.968893625 ns
Test testComputeIfAbsentLamda average time per run: 216.707419875 ns
Test testPutIfAbsent average time per run: 116.173902375 ns
Test testComputeIfAbsentLamda average time per run: 215.632467375 ns
Test testPutIfAbsent average time per run: 112.21422775 ns
Test testComputeIfAbsentLamda average time per run: 210.29563725 ns
Test testPutIfAbsent average time per run: 120.50643475 ns
Test testComputeIfAbsentLamda average time per run: 200.79536475 ns
回答by Erdin? Ta?k?n
Waste of memory (also GC etc.) that Empty Array list creation problem is handled with Java 1.7.40. Don't worry about creating empty arraylist. Reference : http://javarevisited.blogspot.com.tr/2014/07/java-optimization-empty-arraylist-and-Hashmap-cost-less-memory-jdk-17040-update.html
使用 Java 1.7.40 处理空数组列表创建问题的内存浪费(还有 GC 等)。不要担心创建空的数组列表。参考:http: //javarevisited.blogspot.com.tr/2014/07/java-optimization-empty-arraylist-and-Hashmap-cost-less-memory-jdk-17040-update.html