Notice: this page is a Chinese/English side-by-side translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). Original question on Stack Overflow: http://stackoverflow.com/questions/104184/

Time: 2020-08-11 08:19:56  Source: igfitidea

Is it safe to get values from a java.util.HashMap from multiple threads (no modification)?

Tags: java, multithreading, concurrency, hashmap

Asked by Dave L.

There is a case where a map will be constructed, and once it is initialized, it will never be modified again. It will, however, be accessed (via get(key) only) from multiple threads. Is it safe to use a java.util.HashMap in this way?

(Currently, I'm happily using a java.util.concurrent.ConcurrentHashMap, and have no measured need to improve performance, but am simply curious whether a simple HashMap would suffice. Hence, this question is not "Which one should I use?" nor is it a performance question. Rather, the question is "Would it be safe?")

Accepted answer by BeeOnRope

Your idiom is safe if and only if the reference to the HashMap is safely published. Rather than anything relating to the internals of HashMap itself, safe publication deals with how the constructing thread makes the reference to the map visible to other threads.

Basically, the only possible race here is between the construction of the HashMap and any reading threads that may access it before it is fully constructed. Most of the discussion is about what happens to the state of the map object, but this is irrelevant since you never modify it - so the only interesting part is how the HashMap reference is published.

For example, imagine you publish the map like this:

class SomeClass {
   // Plain (non-volatile, non-final) static field: the read below is unsynchronized
   public static HashMap<Object, Object> MAP;

   public synchronized static void setMap(HashMap<Object, Object> m) {
     MAP = m;
   }
}

... and at some point setMap() is called with a map, and other threads are using SomeClass.MAP to access the map, and check for null like this:

HashMap<Object, Object> map = SomeClass.MAP;
if (map != null) {
  // ... use the map
} else {
  // ... some default behavior
}

This is not safe even though it probably appears as though it is. The problem is that there is no happens-before relationship between the set of SomeClass.MAP and the subsequent read on another thread, so the reading thread is free to see a partially constructed map. This can pretty much do anything, and even in practice it does things like put the reading thread into an infinite loop.

To safely publish the map, you need to establish a happens-before relationship between the writing of the reference to the HashMap (i.e., the publication) and the subsequent readers of that reference (i.e., the consumption). Conveniently, there are only a few easy-to-remember ways to accomplish that [1]:

  1. Exchange the reference through a properly locked field (JLS 17.4.5)
  2. Use a static initializer to do the initializing stores (JLS 12.4)
  3. Exchange the reference via a volatile field (JLS 17.4.5), or as the consequence of this rule, via the AtomicX classes
  4. Initialize the value into a final field (JLS 17.5).

The ones most interesting for your scenario are (2), (3) and (4). In particular, (3) applies directly to the code I have above: if you transform the declaration of MAP to:

public static volatile HashMap<Object, Object> MAP;

then everything is kosher: readers who see a non-null value necessarily have a happens-before relationship with the store to MAP and hence see all the stores associated with the map initialization.
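As a rough illustration of option (3), here is a minimal sketch of publishing a fully populated map through a volatile field. The class name VolatilePublication, the lookup() helper and the "answer" key are made up for this example and are not part of the original answer; the only point is that the map is filled completely before the single volatile write.

```java
import java.util.HashMap;
import java.util.Map;

class VolatilePublication {
    // volatile creates the happens-before edge between the write in setMap()
    // and any later read that observes the non-null reference.
    private static volatile Map<String, Integer> MAP;

    static void setMap(Map<String, Integer> m) {
        MAP = m;
    }

    // Returns the mapped value, or the fallback if the map has not been
    // published yet or does not contain the key.
    static int lookup(String key, int fallback) {
        Map<String, Integer> map = MAP;   // one volatile read, then use the local copy
        if (map == null) {
            return fallback;
        }
        Integer value = map.get(key);
        return value != null ? value : fallback;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            // Spin until the map is published; a half-built map is never observed.
            while (lookup("answer", -1) == -1) {
                Thread.yield();
            }
            System.out.println("reader saw " + lookup("answer", -1));
        });
        reader.start();

        Map<String, Integer> m = new HashMap<>();
        m.put("answer", 42);   // populate fully BEFORE publishing
        setMap(m);             // the volatile write publishes the map
        reader.join();
    }
}
```

Note that the map must not be modified after setMap() is called; the volatile write only covers stores made before it.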

The other methods change the semantics of your method, since both (2) (using the static initializer) and (4) (using final) imply that you cannot set MAP dynamically at runtime. If you don't need to do that, then just declare MAP as a static final HashMap<> and you are guaranteed safe publication.

In practice, the rules are simple for safe access to "never-modified objects":

If you are publishing an object which is not inherently immutable (as in all fields declared final) and:

  • You can already create the object that will be assigned at the moment of declaration [a]: just use a final field (including static final for static members).
  • You want to assign the object later, after the reference is already visible: use a volatile field [b].

That's it!

In practice, it is very efficient. The use of a static final field, for example, allows the JVM to assume the value is unchanged for the life of the program and optimize it heavily. The use of a final member field allows most architectures to read the field in a way equivalent to a normal field read and doesn't inhibit further optimizations [c].

Finally, the use of volatile does have some impact: no hardware barrier is needed on many architectures (such as x86, specifically those that don't allow reads to pass reads), but some optimization and reordering may not occur at compile time - but this effect is generally small. In exchange, you actually get more than what you asked for - not only can you safely publish one HashMap, you can store as many more not-modified HashMaps as you want to the same reference and be assured that all readers will see a safely published map.

For more gory details, refer to Shipilev or this FAQ by Manson and Goetz.



[1] Directly quoting from Shipilev.



[a] That sounds complicated, but what I mean is that you can assign the reference at construction time - either at the declaration point or in the constructor (member fields) or static initializer (static fields).

[b] Optionally, you can use a synchronized method to get/set, or an AtomicReference or something, but we're talking about the minimum work you can do.
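For completeness, the AtomicReference alternative mentioned in footnote [b] might look roughly like the sketch below. The names AtomicPublication, publish() and get() are invented for illustration; the sketch relies on AtomicReference.set() and get() having the memory effects of a volatile write and read.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

class AtomicPublication {
    // The holder itself is static final; the map inside it can be swapped later.
    private static final AtomicReference<Map<String, String>> REF = new AtomicReference<>();

    // set() behaves like a volatile write, so everything put into the map
    // before this call is visible to readers that see the new reference.
    static void publish(Map<String, String> fullyBuiltMap) {
        REF.set(fullyBuiltMap);
    }

    // get() behaves like a volatile read; returns null if nothing is published
    // yet or the key is absent.
    static String get(String key) {
        Map<String, String> map = REF.get();
        return map == null ? null : map.get(key);
    }

    public static void main(String[] args) {
        Map<String, String> m = new HashMap<>();
        m.put("A", "A");
        publish(m);
        System.out.println(get("A"));
    }
}
```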

[c] Some architectures with very weak memory models (I'm looking at you, Alpha) may require some type of read barrier before a final read - but these are very rare today.

Answer by FlySwat

http://www.docjar.com/html/api/java/util/HashMap.java.html

Here is the source for HashMap. As you can tell, there is absolutely no locking / mutex code there.

This means that while it's okay to read from a HashMap in a multithreaded situation, I'd definitely use a ConcurrentHashMap if there were multiple writes.

What's interesting is that both the .NET HashTable and Dictionary<K,V> have built-in synchronization code.

Answer by Dave L.

After a bit more looking, I found this in the Javadoc (emphasis mine):

Note that this implementation is not synchronized. If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key that an instance already contains is not a structural modification.)

This seems to imply that it will be safe, assuming the converse of the statement there is true.

Answer by Steve Jessop

Be warned that even in single-threaded code, replacing a ConcurrentHashMap with a HashMap may not be safe. ConcurrentHashMap forbids null as a key or value. HashMap does not forbid them (don't ask).

So in the unlikely situation that your existing code might add a null to the collection during setup (presumably in a failure case of some kind), replacing the collection as described will change the functional behaviour.

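The null-handling difference is easy to check. The sketch below uses a made-up helper, acceptsNullValue(), that tries to store a null value and reports whether the map accepted it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class NullHandlingDemo {
    // Returns true if the given map accepts a null value, false if it
    // rejects it with a NullPointerException.
    static boolean acceptsNullValue(Map<String, String> map) {
        try {
            map.put("key", null);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // prints "HashMap accepts null: true"
        System.out.println("HashMap accepts null: " + acceptsNullValue(new HashMap<>()));
        // prints "ConcurrentHashMap accepts null: false"
        System.out.println("ConcurrentHashMap accepts null: " + acceptsNullValue(new ConcurrentHashMap<>()));
    }
}
```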

That said, provided you do nothing else, concurrent reads from a HashMap are safe.

[Edit: by "concurrent reads", I mean that there are not also concurrent modifications.

Other answers explain how to ensure this. One way is to make the map immutable, but it's not necessary. For example, the JSR133 memory model explicitly defines starting a thread to be a synchronised action, meaning that changes made in thread A before it starts thread B are visible in thread B.

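A small sketch of the thread-start case: the map is filled on the calling thread, and only then are the reader threads started. The class and method names (StartPublishesMap, sumSeenByReaders) are invented for this example. It relies on Thread.start() happening-before the started thread's actions, and on join() making each reader's result visible to the caller.

```java
import java.util.HashMap;
import java.util.Map;

class StartPublishesMap {
    // Builds a plain HashMap, then starts `readers` threads that only call get().
    static int sumSeenByReaders(int readers) throws InterruptedException {
        final Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        // No writes to the map happen after this point.

        final int[] results = new int[readers];
        Thread[] threads = new Thread[readers];
        for (int i = 0; i < readers; i++) {
            final int idx = i;
            threads[i] = new Thread(() -> results[idx] = map.get("a") + map.get("b"));
            threads[i].start();   // start() happens-before everything the new thread does
        }

        int sum = 0;
        for (int i = 0; i < readers; i++) {
            threads[i].join();    // join() makes the thread's write to results[i] visible
            sum += results[i];
        }
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(sumSeenByReaders(4));   // prints 12
    }
}
```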

My intent is not to contradict those more detailed answers about the Java Memory Model. This answer is intended to point out that even aside from concurrency issues, there is at least one API difference between ConcurrentHashMap and HashMap, which could scupper even a single-threaded program which replaced one with the other.]

Answer by Taylor Gautier

Jeremy Manson, the god when it comes to the Java Memory Model, has a three-part blog on this topic - because in essence you are asking the question "Is it safe to access an immutable HashMap?" - the answer to that is yes. But you must first answer the question it presupposes, which is "Is my HashMap immutable?". The answer might surprise you - Java has a relatively complicated set of rules to determine immutability.

For more info on the topic, read Jeremy's blog posts:

Part 1 on Immutability in Java: http://jeremymanson.blogspot.com/2008/04/immutability-in-java.html

Part 2 on Immutability in Java: http://jeremymanson.blogspot.com/2008/07/immutability-in-java-part-2.html

Part 3 on Immutability in Java: http://jeremymanson.blogspot.com/2008/07/immutability-in-java-part-3.html

Answer by Heath Borders

The reads are safe from a synchronization standpoint but not a memory standpoint. This is something that is widely misunderstood among Java developers, including here on Stack Overflow. (Observe the rating of this answer for proof.)

If you have other threads running, they may not see an updated copy of the HashMap if there is no memory write out of the current thread. Memory writes occur through the use of the synchronized or volatile keywords, or through uses of some java concurrency constructs.

See Brian Goetz's article on the new Java Memory Model for details.

Answer by Alexander

There is an important twist though. It's safe to access the map, but in general it's not guaranteed that all threads will see exactly the same state (and thus values) of the HashMap. This might happen on multiprocessor systems where the modifications to the HashMap done by one thread (e.g., the one that populated it) can sit in that CPU's cache and won't be seen by threads running on other CPUs, until a memory fence operation is performed ensuring cache coherence. The Java Language Specification is explicit on this one: the solution is to acquire a lock (synchronized (...)) which emits a memory fence operation. So, if you are sure that after populating the HashMap each of the threads acquires ANY lock, then it's OK from that point on to access the HashMap from any thread until the HashMap is modified again.

Answer by Alex Miller

One note is that under some circumstances, a get() from an unsynchronized HashMap can cause an infinite loop. This can occur if a concurrent put() causes a rehash of the Map.

http://lightbody.net/blog/2005/07/hashmapget_can_cause_an_infini.html

Answer by Will

So the scenario you described is that you need to put a bunch of data into a Map, then when you're done populating it you treat it as immutable. One approach that is "safe" (meaning you're enforcing that it really is treated as immutable) is to replace the reference with Collections.unmodifiableMap(originalMap) when you're ready to make it immutable.

For an example of how badly maps can fail if used concurrently, and the suggested workaround I mentioned, check out this bug parade entry: bug_id=6423457

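A minimal sketch of that approach follows. The class name FrozenConfig and the build() helper are hypothetical. Note the assumption made here: the wrapper only enforces read-only access through itself, so the sketch drops the original mutable reference and still publishes the wrapper through a static final field rather than relying on the wrapper alone.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class FrozenConfig {
    // static final gives safe publication; the wrapper enforces "no more writes".
    static final Map<String, String> CONFIG = build();

    static Map<String, String> build() {
        Map<String, String> original = new HashMap<>();
        original.put("A", "A");
        // The mutable reference does not escape this method, so nobody can
        // modify the map behind the wrapper's back.
        return Collections.unmodifiableMap(original);
    }

    public static void main(String[] args) {
        System.out.println(CONFIG.get("A"));
        try {
            CONFIG.put("B", "B");
        } catch (UnsupportedOperationException e) {
            System.out.println("writes are rejected");
        }
    }
}
```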

Answer by bodrin

According to http://www.ibm.com/developerworks/java/library/j-jtp03304/ (section "Initialization safety"), you can make your HashMap a final field, and after the constructor finishes it will be safely published.

... Under the new memory model, there is something similar to a happens-before relationship between the write of a final field in a constructor and the initial load of a shared reference to that object in another thread. ...

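As a sketch of what this looks like in code: the map is populated inside the constructor and stored in a final field, and `this` is not allowed to escape before the constructor returns. The class name Registry and its get() method are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class Registry {
    // final field: threads that obtain a Registry reference after the
    // constructor has finished see the map as it was at the end of construction.
    private final Map<String, String> map;

    Registry() {
        Map<String, String> tmp = new HashMap<>();
        tmp.put("A", "A");
        this.map = tmp;   // assign once; do not leak `this` from the constructor
    }

    // Read-only access; the map is never modified after construction.
    String get(String key) {
        return map.get(key);
    }

    public static void main(String[] args) {
        System.out.println(new Registry().get("A"));
    }
}
```

The guarantee only covers the state reachable through the final field at the end of the constructor, so the map must not be modified afterwards.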

Answer by TomWolk

If the initialization and every put are synchronized, you are safe.

The following code is safe because the classloader will take care of the synchronization:

public static final HashMap<String, String> map = new HashMap<>();
static {
  map.put("A", "A");
}

The following code is safe because the write to the volatile field will take care of the synchronization.

class Foo {
  volatile HashMap<String, String> map;
  public void init() {
    final HashMap<String, String> tmp = new HashMap<>();
    tmp.put("A","A");
    // writing to volatile has to be after the modification of the map
    this.map = tmp;
  }
}

This will also work if the member variable is final and the method is a constructor. A final field is not volatile, but a final field assigned in a constructor has its own initialization-safety guarantee (JLS 17.5).