Notice: this page is a translated mirror of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/1770166/

Time: 2020-10-29 17:53:51  Source: igfitidea

Is ConcurrentHashMap.get() guaranteed to see a previous ConcurrentHashMap.put() by different thread?

java multithreading concurrenthashmap

Asked by Stu Thompson

Is ConcurrentHashMap.get() guaranteed to see a previous ConcurrentHashMap.put() by a different thread? My expectation is that it is, and reading the JavaDocs seems to indicate so, but I am 99% convinced that reality is different. On my production server the below seems to be happening. (I've caught it with logging.)

Pseudo code example:

static final ConcurrentHashMap map = new ConcurrentHashMap();
//sharedLock is key specific.  One map, many keys.  There is a 1:1 
//      relationship between key and Foo instance.
void doSomething(Semaphore sharedLock) {
    boolean haveLock = sharedLock.tryAcquire(3000, MILLISECONDS);

    if (haveLock) {
        log("Have lock: " + threadId);
        Foo foo = map.get("key");
        log("foo=" + foo);

        if (foo == null) {
            log("New foo time! " + threadId);
            foo = new Foo(); //foo is expensive to instance
            map.put("key", foo);

        } else
            log("Found foo:" + threadId);

        log("foo=" + foo);
        sharedLock.release();

    } else
        log("No lock acquired");
} 

What seems to be happening is this:

Thread 1                          Thread 2
 - request lock                    - request lock
 - have lock                       - blocked waiting for lock
 - get from map, nothing there
 - create new foo
 - place new foo in map
 - logs foo.toString()
 - release lock
 - exit method                     - have lock
                                   - get from map, NOTHING THERE!!! (Why not?)
                                   - create new foo
                                   - place new foo in map
                                   - logs foo.toString()
                                   - release lock
                                   - exit method

So, my output looks like this:

Have lock: 1    
foo=null
New foo time! 1
foo=foo@cafebabe420
Have lock: 2    
foo=null
New foo time! 2
foo=foo@boof00boo    

The second thread does not immediately see the put! Why? On my production system, there are more threads and I've only seen one thread, the first one that immediately follows thread 1, have a problem.

I've even tried shrinking the concurrency level on ConcurrentHashMap to 1, not that it should matter. E.g.:

static ConcurrentHashMap map = new ConcurrentHashMap(32, 1);

Where am I going wrong? My expectation? Or is there some bug in my code (the real software, not the above) that is causing this? I've gone over it repeatedly and am 99% sure I'm handling the locking correctly. I cannot even fathom a bug in ConcurrentHashMap or the JVM. Please save me from myself.

Gory specifics that might be relevant:

  • quad-core 64-bit Xeon (DL380 G5)
  • RHEL4 (Linux mysvr 2.6.9-78.0.5.ELsmp #1 SMP... x86_64 GNU/Linux)
  • Java 6 (build 1.6.0_07-b06, 64-Bit Server VM (build 10.0-b23, mixed mode))

Accepted answer by Cowan

Some good answers here, but as far as I can tell no-one has actually provided a canonical answer to the question asked: "Is ConcurrentHashMap.get() guaranteed to see a previous ConcurrentHashMap.put() by different thread". Those that have said yes haven't provided a source.

So: yes, it is guaranteed. Source (see the section 'Memory Consistency Properties'):

Actions in a thread prior to placing an object into any concurrent collection happen-before actions subsequent to the access or removal of that element from the collection in another thread.
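
The quoted guarantee can be illustrated with a small sketch. The class, the Box holder and the value 42 are hypothetical names chosen for illustration, and the lambdas assume Java 8 or later (the question itself targets Java 6). The writer sets a plain field before the put; once the reader finds the entry, it must also see that field value.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical demo class; all names here are illustrative.
class HappensBeforeDemo {

    // Plain mutable holder: the field is neither final nor volatile,
    // so its visibility to the reader relies on the map's guarantee.
    static final class Box {
        int value;
    }

    // Returns the value the reader thread observed through the map.
    static int writeThenRead() {
        final ConcurrentMap<String, Box> map = new ConcurrentHashMap<>();
        final int[] seen = new int[1];

        Thread writer = new Thread(() -> {
            Box box = new Box();
            box.value = 42;      // action prior to the put
            map.put("key", box); // places the box in the concurrent collection
        });

        Thread reader = new Thread(() -> {
            Box box;
            // Spin until the entry is visible; get() never blocks.
            while ((box = map.get("key")) == null) {
                Thread.yield();
            }
            seen[0] = box.value; // action subsequent to the access
        });

        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join(); // join() makes seen[0] visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("reader saw " + writeThenRead());
    }
}
```

The reader may spin for a while before the put happens, but it can never observe the entry together with a stale field value.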

Answer by David Roussel

This issue of creating an expensive-to-create object in a cache after failing to find it in the cache is a known problem. And fortunately a solution has already been implemented.

You can use MapMaker from Google Collections. You just give it a callback that creates your object, and if the client code looks in the map and the key is not there, the callback is called and the result is put in the map.

See MapMaker javadocs...

 ConcurrentMap<Key, Graph> graphs = new MapMaker()
       .concurrencyLevel(32)
       .softKeys()
       .weakValues()
       .expiration(30, TimeUnit.MINUTES)
       .makeComputingMap(
           new Function<Key, Graph>() {
             public Graph apply(Key key) {
               return createExpensiveGraph(key);
             }
           });
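
If an extra dependency is not an option and Java 8 or later is available (an assumption; the question targets Java 6), ConcurrentHashMap.computeIfAbsent gives similar compute-on-miss behaviour. This is only a sketch: createExpensiveGraph here is a hypothetical stand-in that returns a String, and the counter exists purely to show how often the factory runs.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; names are illustrative only.
class ComputeIfAbsentDemo {
    // Counts how many times the "expensive" factory actually runs.
    static final AtomicInteger creations = new AtomicInteger();

    static final ConcurrentHashMap<String, String> graphs = new ConcurrentHashMap<>();

    // Stand-in for an expensive factory method.
    static String createExpensiveGraph(String key) {
        creations.incrementAndGet();
        return "graph-for-" + key;
    }

    // Looks the key up and computes the value only on a miss.
    static String graphFor(String key) {
        return graphs.computeIfAbsent(key, ComputeIfAbsentDemo::createExpensiveGraph);
    }

    public static void main(String[] args) {
        System.out.println(graphFor("a"));
        System.out.println(graphFor("a")); // second call hits the cached value
        System.out.println("creations = " + creations.get());
    }
}
```

For ConcurrentHashMap the mapping function is applied at most once per key, so the factory should keep its work short and must not update other mappings of the same map.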

BTW, in your original example there is no advantage to using a ConcurrentHashMap, as you are locking each access. Why not just use a normal HashMap inside your locked section?

Answer by Ankit Kumar

If a thread puts a value in a concurrent hash map, then any other thread that subsequently retrieves that value from the map is guaranteed to see the value inserted by the previous thread.

This issue has been clarified in "Java Concurrency in Practice" by Brian Goetz et al. (Joshua Bloch is one of the co-authors).

Quoting from the text:

The thread-safe library collections offer the following safe publication guarantees, even if the javadoc is less than clear on the subject:

  • Placing a key or value in a Hashtable, synchronizedMap, or ConcurrentMap safely publishes it to any other thread that retrieves it from the Map (whether directly or via an iterator);

Answer by Andrzej Doyle

One thing to consider is whether your keys are equal and have identical hashcodes at the time of both "get" calls. If they're just Strings then yes, there's not going to be a problem here. But as you haven't given the generic type of the keys, and you have elided "unimportant" details in the pseudocode, I wonder if you're using another class as a key.

In any case, you may want to additionally log the hashcode of the keys used for the gets/puts in threads 1 and 2. If these are different, you have your problem. Also note that key1.equals(key2) must be true; this isn't something you can log definitively, but if the keys aren't final classes it would be worth logging their fully qualified class name, then looking at the equals() method for that class/classes to see if it's possible that the second key could be considered unequal to the first.
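
To make the key concern concrete, here is a minimal sketch with two hypothetical key classes (neither appears in the question). One inherits identity-based equals()/hashCode() from Object, the other defines them by value. Only the second can be found again through a fresh but "equal" instance.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical demo class; both key types are invented for illustration.
class KeyEqualityDemo {

    // Key that inherits identity-based equals()/hashCode() from Object.
    static final class IdentityKey {
        final String name;
        IdentityKey(String name) { this.name = name; }
    }

    // Key with value-based equals()/hashCode().
    static final class ValueKey {
        final String name;
        ValueKey(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            return o instanceof ValueKey && ((ValueKey) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    // Looks up with a different instance carrying the same name: a miss.
    static boolean foundWithIdentityKey() {
        ConcurrentMap<IdentityKey, String> map = new ConcurrentHashMap<>();
        map.put(new IdentityKey("key"), "foo");
        return map.get(new IdentityKey("key")) != null;
    }

    // Same lookup pattern with value-based equality: a hit.
    static boolean foundWithValueKey() {
        ConcurrentMap<ValueKey, String> map = new ConcurrentHashMap<>();
        map.put(new ValueKey("key"), "foo");
        return map.get(new ValueKey("key")) != null;
    }

    public static void main(String[] args) {
        System.out.println("identity key found: " + foundWithIdentityKey());
        System.out.println("value key found:    " + foundWithValueKey());
    }
}
```

If the real keys behave like IdentityKey, every thread that builds its own key instance would see an empty map entry regardless of any locking.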

And to answer your title - yes, ConcurrentHashMap.get() is guaranteed to see any previous put(), where "previous" means there is a happens-before relationship between the two as specified by the Java Memory Model. (For the ConcurrentHashMap in particular, this is essentially what you'd expect, with the caveat that you may not be able to tell which happens first if both threads execute at "exactly the same time" on different cores. In your case, though, you should definitely see the result of the put() in thread 2).

Answer by Arne Deutsch

I don't think the problem is in "ConcurrentHashMap" but rather somewhere in your code or in the reasoning about your code. I can't spot the error in the code above (maybe we just don't see the bad part?).

But to answer your question "Is ConcurrentHashMap.get() guaranteed to see a previous ConcurrentHashMap.put() by different thread?" I've hacked together a small test program.

In short: no, the problem is not in ConcurrentHashMap; it is OK!

If the map were broken, the following program should print "Bad access!" at least from time to time. It runs 100000 calls to the method you outlined above on a pool of 100 threads. But it prints "All ok!".

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class Test {
    private final static ConcurrentHashMap<String, Test> map = new ConcurrentHashMap<String, Test>();
    private final static Semaphore lock = new Semaphore(1);
    private static int counter = 0;

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(100);
        List<Callable<Boolean>> testCalls = new ArrayList<Callable<Boolean>>();
        for (int n = 0; n < 100000; n++)
            testCalls.add(new Callable<Boolean>() {
                @Override
                public Boolean call() throws Exception {
                    doSomething(lock);
                    return true;
                }
            });
        pool.invokeAll(testCalls);
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("All ok!");
    }

    static void doSomething(Semaphore lock) throws InterruptedException {
        boolean haveLock = lock.tryAcquire(3000, TimeUnit.MILLISECONDS);

        if (haveLock) {
            Test foo = map.get("key");
            if (foo == null) {
                foo = new Test();
                map.put("key", foo); // store the instance that was just created
                if (counter > 0)
                    System.err.println("Bad access!");
                counter++;
            }
            lock.release();
        } else {
            System.err.println("Fail to lock!");
        }
    }
}

Answer by overthink

Update: putIfAbsent() is logically correct here, but it doesn't solve the problem of creating a Foo only when the key is not present. It always creates the Foo, even if it doesn't end up putting it in the map. David Roussel's answer is good, assuming you can accept the Google Collections dependency in your app.

Maybe I'm missing something obvious, but why are you guarding the map with a Semaphore? ConcurrentHashMap (CHM) is thread-safe (assuming it's safely published, which it is here). If you're trying to get an atomic "put if not already in there", use chm.putIfAbsent(). If you need more complicated invariants where the map contents cannot change, you probably need to use a regular HashMap and synchronize it as usual.
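
A minimal sketch of the putIfAbsent() idiom, assuming a hypothetical Foo whose constructor bumps a counter so the cost is visible. It also shows the drawback noted in the update: a candidate Foo is constructed on every call, even when an entry already exists.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; Foo and the counter are illustrative only.
class PutIfAbsentDemo {
    // Counts how many Foo instances were constructed in total.
    static final AtomicInteger constructed = new AtomicInteger();

    static final class Foo {
        Foo() {
            constructed.incrementAndGet(); // stands in for expensive setup
        }
    }

    static final ConcurrentMap<String, Foo> map = new ConcurrentHashMap<>();

    // Atomically installs a Foo for the key unless one is already present.
    static Foo getOrCreate(String key) {
        Foo candidate = new Foo();                      // always constructed
        Foo existing = map.putIfAbsent(key, candidate); // null means we won
        return existing != null ? existing : candidate;
    }

    public static void main(String[] args) {
        Foo first = getOrCreate("key");
        Foo second = getOrCreate("key");
        System.out.println("same instance: " + (first == second));
        System.out.println("constructed:   " + constructed.get());
    }
}
```

Every caller ends up with the same instance, but the second call still pays for a Foo that is thrown away.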

To answer your question more directly: Once your put returns, the value you put in the map is guaranteed to be seen by the next thread that looks for it.

Side note, just a +1 to some other comments about putting the semaphore release in a finally.

if (sem.tryAcquire(3000, TimeUnit.MILLISECONDS)) {
    try {
        // do stuff while holding permit    
    } finally {
        sem.release();
    }
}

Answer by djna

Are we seeing an interesting manifestation of the Java Memory Model? Under what conditions are registers flushed to main memory? I think it's guaranteed that if two threads synchronize on the same object then they will see a consistent memory view.

I don't know what Semaphore does internally; it almost certainly must do some synchronization, but do we know that?

What happens if you do

synchronized (dedicatedLockObject)

instead of acquiring the semaphore?
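
As a sketch of that experiment, the check-then-create logic from the question could be wrapped in a synchronized block. The class name, the creations counter and the use of a plain Object in place of Foo are assumptions made so the example runs on its own; the lambda assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical demo class; names are illustrative only.
class SynchronizedLockDemo {
    private static final Object dedicatedLockObject = new Object();
    private static final ConcurrentMap<String, Object> map = new ConcurrentHashMap<>();
    private static int creations = 0; // guarded by dedicatedLockObject

    // Same get-or-create logic as in the question, under an intrinsic lock.
    static Object doSomething() {
        synchronized (dedicatedLockObject) {
            Object foo = map.get("key");
            if (foo == null) {
                foo = new Object(); // stand-in for the expensive Foo
                map.put("key", foo);
                creations++;
            }
            return foo;
        }
    }

    // Runs doSomething() on several threads and reports the creation count.
    static int runConcurrently(int threadCount) {
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            Thread t = new Thread(() -> doSomething());
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        synchronized (dedicatedLockObject) {
            return creations;
        }
    }

    public static void main(String[] args) {
        System.out.println("creations = " + runConcurrently(8));
    }
}
```

If the production code still created two instances with this change, the lock object or the key would be the next things to check, rather than the map.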

Answer by bigtalktheory

Why are you locking a concurrent hash map? By definition it's thread-safe. If there's a problem, it's in your locking code. That's why we have thread-safe packages in Java. The best way to debug this is with barrier synchronization.
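
The answer does not spell out what it means by barrier synchronization. One possible reading is to line all threads up on a java.util.concurrent.CyclicBarrier so that they hit the map at the same moment, which makes a real race easier to reproduce. The sketch below follows that reading; the class and method names are hypothetical and the lambdas assume Java 8 or later.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stress harness; names are illustrative only.
class BarrierStressDemo {

    // Returns how many threads found the key absent and inserted a value.
    static int countWinners(int threadCount) {
        final ConcurrentMap<String, Integer> map = new ConcurrentHashMap<>();
        final CyclicBarrier startLine = new CyclicBarrier(threadCount);
        final AtomicInteger winners = new AtomicInteger();

        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                try {
                    startLine.await(); // wait until every thread has arrived
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                // All threads reach this point at roughly the same time.
                if (map.putIfAbsent("key", id) == null) {
                    winners.incrementAndGet();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return winners.get();
    }

    public static void main(String[] args) {
        System.out.println("winners = " + countWinners(16));
    }
}
```

With a correct map exactly one thread wins the insert; swapping the putIfAbsent() call for the locking code under suspicion turns the same harness into a reproduction attempt.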
