Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/3752194/

Should you check if the map containsKey before using ConcurrentMap's putIfAbsent

Tags: java, performance, concurrency, concurrenthashmap

Asked by Chris Dail

I have been using Java's ConcurrentMap for a map that can be used from multiple threads. The putIfAbsent is a great method and is much easier to read/write than using standard map operations. I have some code that looks like this:

ConcurrentMap<String, Set<X>> map = new ConcurrentHashMap<String, Set<X>>();

// ...

map.putIfAbsent(name, new HashSet<X>());
map.get(name).add(Y);

Readability-wise this is great, but it does require creating a new HashSet every time, even if one is already in the map. I could write this:

if (!map.containsKey(name)) {
    map.putIfAbsent(name, new HashSet<X>());
}
map.get(name).add(Y);

With this change it loses a bit of readability but does not need to create the HashSet every time. Which is better in this case? I tend to side with the first one since it is more readable. The second would perform better and may be more correct. Maybe there is a better way to do this than either of these.

What is the best practice for using a putIfAbsent in this manner?

Accepted answer by Tom Hawtin - tackline

Concurrency is hard. If you are going to bother with concurrent maps instead of straightforward locking, you might as well go for it. Indeed, don't do lookups more than necessary.

Set<X> set = map.get(name);
if (set == null) {
    final Set<X> value = new HashSet<X>();
    set = map.putIfAbsent(name, value);
    if (set == null) {
        set = value;
    }
}

(Usual stackoverflow disclaimer: Off the top of my head. Not tested. Not compiled. Etc.)
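
As a rough, self-contained sketch of the snippet above (not the answerer's own code): the class name GetThenPutIfAbsent, the method setFor, and the String element type are hypothetical, and ConcurrentHashMap.newKeySet() (Java 8+) is swapped in for HashSet so that the returned set is itself safe to share between threads.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper class; names and element type are illustrative only.
class GetThenPutIfAbsent {
    private final ConcurrentMap<String, Set<String>> map = new ConcurrentHashMap<>();

    /** Returns the set for the key, allocating one only when the first lookup misses. */
    Set<String> setFor(String name) {
        Set<String> set = map.get(name);
        if (set == null) {
            // Assumption: a concurrent set replaces the HashSet of the original snippet.
            Set<String> fresh = ConcurrentHashMap.newKeySet();
            // putIfAbsent returns the value already present, or null if ours was stored.
            Set<String> raced = map.putIfAbsent(name, fresh);
            set = (raced != null) ? raced : fresh;
        }
        return set;
    }

    public static void main(String[] args) {
        GetThenPutIfAbsent index = new GetThenPutIfAbsent();
        index.setFor("a").add("x");
        index.setFor("a").add("y");
        System.out.println(index.setFor("a").size());
    }
}
```

Every caller that asks for the same key ends up with the same set instance, because a thread that loses the race discards its own freshly created set and uses the one putIfAbsent returned.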

Update: 1.8 has added a computeIfAbsent default method to ConcurrentMap (and to Map, which is kind of interesting because that implementation would be wrong for ConcurrentMap). (And 1.7 added the "diamond operator" <>.)

Set<X> set = map.computeIfAbsent(name, n -> new HashSet<>());

(Note: you are responsible for the thread-safety of any operations on the HashSets contained in the ConcurrentMap.)
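
One possible way to address that note, sketched under the assumption that Java 8+ is available: store sets created by ConcurrentHashMap.newKeySet() instead of HashSet, so that add() needs no external locking. The class ThreadSafeValues and the method addFromManyThreads are hypothetical names used only for this illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical demo class, not part of the original answer.
class ThreadSafeValues {

    /** Adds threads * perThread distinct integers under one key and returns the set size. */
    static int addFromManyThreads(int threads, int perThread) {
        ConcurrentMap<String, Set<Integer>> map = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    // The value is a concurrent set, so add() is safe without synchronized.
                    map.computeIfAbsent("name", k -> ConcurrentHashMap.newKeySet()).add(base + i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return map.get("name").size();
    }

    public static void main(String[] args) {
        System.out.println(addFromManyThreads(4, 1000)); // prints 4000
    }
}
```

With a plain HashSet as the value, the same loop would need a synchronized block around each add(), as one of the later answers does.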

Answer by Jed Wesley-Smith

Tom's answer is correct as far as API usage goes for ConcurrentMap. An alternative that avoids using putIfAbsent is to use the computing map from the GoogleCollections/Guava MapMaker, which auto-populates the values with a supplied function and handles all the thread-safety for you. It actually only creates one value per key, and if the create function is expensive, other threads asking for the same key will block until the value becomes available.

Edit: as of Guava 11, MapMaker is deprecated and is being replaced with the Cache/LocalCache/CacheBuilder stuff. This is a little more complicated to use but basically isomorphic.

Answer by karmakaze

By keeping a pre-initialized value for each thread you can improve on the accepted answer:

Set<X> initial = new HashSet<X>();
...
Set<X> set = map.putIfAbsent(name, initial);
if (set == null) {
    set = initial;
    initial = new HashSet<X>();
}
set.add(Y);

I recently used this with AtomicInteger map values rather than Set.
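
A minimal sketch of what that AtomicInteger variant might look like, assuming each worker thread owns its own spare counter. SpareCounterDemo, Worker, and countWith are hypothetical names introduced only for this illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class: counts key occurrences from several threads.
class SpareCounterDemo {

    /** Each thread runs its own Worker, so the spare field is never shared. */
    static final class Worker implements Runnable {
        private final ConcurrentMap<String, AtomicInteger> counts;
        private final String[] keys;
        private AtomicInteger spare = new AtomicInteger();

        Worker(ConcurrentMap<String, AtomicInteger> counts, String[] keys) {
            this.counts = counts;
            this.keys = keys;
        }

        @Override
        public void run() {
            for (String key : keys) {
                AtomicInteger counter = counts.putIfAbsent(key, spare);
                if (counter == null) {
                    counter = spare;             // our spare is now in the map
                    spare = new AtomicInteger(); // allocate only after a successful put
                }
                counter.incrementAndGet();
            }
        }
    }

    /** Runs the given number of workers over the same keys and returns the counts. */
    static ConcurrentMap<String, AtomicInteger> countWith(int threads, String... keys) {
        ConcurrentMap<String, AtomicInteger> counts = new ConcurrentHashMap<>();
        Thread[] pool = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            pool[i] = new Thread(new Worker(counts, keys));
            pool[i].start();
        }
        for (Thread t : pool) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWith(4, "a", "b", "a"));
    }
}
```

A new AtomicInteger is allocated only when the spare was actually stored, so the number of allocations is tied to the number of distinct keys rather than to the number of calls.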

Answer by ggrandes

My generic approximation:

public class ConcurrentHashMapWithInit<K, V> extends ConcurrentHashMap<K, V> {
  private static final long serialVersionUID = 42L;

  public V initIfAbsent(final K key) {
    V value = get(key);
    if (value == null) {
      value = initialValue();
      final V x = putIfAbsent(key, value);
      value = (x != null) ? x : value;
    }
    return value;
  }

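  // Override this in a subclass. The default null cannot be stored:
  // ConcurrentHashMap.putIfAbsent throws NullPointerException for null values.
  protected V initialValue() {
    return null;
  }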
}

And as an example of use:

public static void main(final String[] args) throws Throwable {
  ConcurrentHashMapWithInit<String, HashSet<String>> map = 
        new ConcurrentHashMapWithInit<String, HashSet<String>>() {
    private static final long serialVersionUID = 42L;

    @Override
    protected HashSet<String> initialValue() {
      return new HashSet<String>();
    }
  };
  map.initIfAbsent("s1").add("chao");
  map.initIfAbsent("s2").add("bye");
  System.out.println(map.toString());
}

Answer by Craig P. Motlin

You can use MutableMap.getIfAbsentPut(K, Function0<? extends V>) from Eclipse Collections (formerly GS Collections).

The advantage over calling get(), doing a null check, and then calling putIfAbsent() is that we'll only compute the key's hashCode once, and find the right spot in the hashtable once. In ConcurrentMaps like org.eclipse.collections.impl.map.mutable.ConcurrentHashMap, the implementation of getIfAbsentPut() is also thread-safe and atomic.

import org.eclipse.collections.impl.map.mutable.ConcurrentHashMap;
...
ConcurrentHashMap<String, MyObject> map = new ConcurrentHashMap<>();
map.getIfAbsentPut("key", () -> someExpensiveComputation());

The implementation of org.eclipse.collections.impl.map.mutable.ConcurrentHashMap is truly non-blocking. While every effort is made not to call the factory function unnecessarily, there's still a chance it will be called more than once during contention.

This fact sets it apart from Java 8's ConcurrentHashMap.computeIfAbsent(K, Function<? super K,? extends V>). The Javadoc for this method states:

The entire method invocation is performed atomically, so the function is applied at most once per key. Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple...
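
The at-most-once guarantee quoted above can be observed with a small experiment against java.util.concurrent.ConcurrentHashMap. This is only a sketch: AtMostOnceDemo and factoryCalls are hypothetical names, and the latch exists solely to make the threads contend for the same key.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class for the java.util.concurrent map, not Eclipse Collections.
class AtMostOnceDemo {

    /** Returns how many times the mapping function ran for a single contended key. */
    static int factoryCalls(int threads) {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        AtomicInteger calls = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] pool = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            pool[i] = new Thread(() -> {
                try {
                    start.await(); // release all threads together to provoke contention
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                map.computeIfAbsent("key", k -> {
                    calls.incrementAndGet(); // count every invocation of the function
                    return "value";
                });
            });
            pool[i].start();
        }
        start.countDown();
        for (Thread t : pool) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println(factoryCalls(8)); // prints 1
    }
}
```

The price of that guarantee is the blocking the Javadoc mentions: threads updating the same bin wait while the function runs, which is why the function should stay short.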

Note: I am a committer for Eclipse Collections.

Answer by Nathan

In 5+ years, I can't believe no one has mentioned or posted a solution that uses ThreadLocal to solve this problem; and several of the solutions on this page are not thread-safe and are just sloppy.

Using ThreadLocals for this specific problem isn't only considered best practice for concurrency, but also for minimizing garbage/object creation during thread contention. Also, it's incredibly clean code.

For example:

private final ThreadLocal<HashSet<X>> threadCache = new ThreadLocal<HashSet<X>>() {
    @Override
    protected HashSet<X> initialValue() {
        return new HashSet<X>();
    }
};

private final ConcurrentMap<String, Set<X>> map = new ConcurrentHashMap<String, Set<X>>();

And the actual logic...

// minimize object creation during thread contention
final Set<X> cached = threadCache.get();

Set<X> data = map.putIfAbsent("foo", cached);
if (data == null) {
    // reset the cached value in the ThreadLocal
    threadCache.set(new HashSet<X>());
    data = cached;
}

// make sure that the access to the set is thread safe
synchronized(data) {
    data.add(object);
}
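
For comparison, a compact self-contained sketch of the same idea for Java 8+, assuming ThreadLocal.withInitial is acceptable in place of the anonymous subclass. ThreadLocalSpareDemo, add, and size are hypothetical names, and the String element type is only for illustration.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical wrapper class around the ThreadLocal-spare pattern.
class ThreadLocalSpareDemo {
    // One not-yet-published spare set per thread.
    private final ThreadLocal<Set<String>> spare = ThreadLocal.withInitial(HashSet::new);
    private final ConcurrentMap<String, Set<String>> map = new ConcurrentHashMap<>();

    void add(String key, String value) {
        Set<String> cached = spare.get();
        Set<String> data = map.putIfAbsent(key, cached);
        if (data == null) {
            data = cached;
            // The spare is now shared through the map; drop it so the next
            // get() on this thread lazily builds a fresh one.
            spare.remove();
        }
        // HashSet is not thread-safe, so guard every access to the shared set.
        synchronized (data) {
            data.add(value);
        }
    }

    int size(String key) {
        Set<String> data = map.get(key);
        if (data == null) {
            return 0;
        }
        synchronized (data) {
            return data.size();
        }
    }

    public static void main(String[] args) {
        ThreadLocalSpareDemo demo = new ThreadLocalSpareDemo();
        demo.add("foo", "a");
        demo.add("foo", "b");
        System.out.println(demo.size("foo")); // prints 2
    }
}
```

Readers of the set have to synchronize on the same object as writers, which is why size() takes the lock as well.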