Java hashCode uniqueness
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original address and author information, and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/1381060/
hashCode uniqueness
Asked by Eleco
Is it possible for two instances of Object to have the same hashCode()?
In theory an object's hashCode is derived from its memory address, so all hashCodes should be unique, but what if objects are moved around during GC?
Accepted answer by Tom Hawtin - tackline
Given a reasonable collection of objects, having two with the same hash code is quite likely. In the best case it becomes the birthday problem, with a clash likely once there are tens of thousands of objects. In practice, objects are created with a relatively small pool of likely hash codes, and clashes can easily happen with merely thousands of objects.
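A rough sketch of the birthday estimate mentioned above, assuming the best case in which every one of the 2^32 int values is equally likely (an assumption that real JVMs do not guarantee). The class and method names are made up for illustration, and the formula is the usual approximation 1 - exp(-n(n-1)/(2N)).

```java
public class BirthdayClash {
    // Approximate probability that at least two of n values collide when each
    // is drawn uniformly and independently from `buckets` possible values.
    static double clashProbability(long n, double buckets) {
        double pairs = (double) n * (n - 1) / 2.0; // number of distinct pairs
        return 1.0 - Math.exp(-pairs / buckets);
    }

    public static void main(String[] args) {
        // Assumption: all 2^32 int values are equally likely hash codes.
        double buckets = Math.pow(2, 32);
        for (long n : new long[] {1_000, 10_000, 77_000, 100_000, 1_000_000}) {
            System.out.printf("%d objects: %.4f%n", n, clashProbability(n, buckets));
        }
    }
}
```

Under that assumption the probability of at least one clash reaches about 50% at roughly 77,000 objects. A smaller pool of likely hash codes moves that point lower.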
Using the memory address is just a way of obtaining a slightly random number. The Sun JDK source has a switch to enable use of a Secure Random Number Generator or a constant. I believe IBM uses (or used to use?) a fast random number generator, but it was not at all secure. The mention of the memory address in the docs appears to be of a historical nature (around a decade ago it was not unusual to have object handles with fixed locations).
Here's some code I wrote a few years ago to demonstrate clashes:
    class HashClash {
        public static void main(String[] args) {
            // The hash code we try to hit again with a different object.
            final Object obj = new Object();
            final int target = obj.hashCode();
            Object clash;
            long ct = 0;
            // Keep allocating fresh objects until one shares the target hash
            // code, giving up after 10 billion attempts.
            do {
                clash = new Object();
                ++ct;
            } while (clash.hashCode() != target && ct < 10L * 1000 * 1000 * 1000L);
            if (clash.hashCode() == target) {
                System.out.println(ct + ": " + obj + " - " + clash);
            } else {
                System.out.println("No clashes found");
            }
        }
    }
RFE to clarify docs, because this comes up way too frequently: CR 6321873
Answer by RichardOD
I think the docs for Object's hashCode method state the answer.
"As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language.)"
Answer by erickson
Think about it. There are an infinite number of potential objects, and only 4 billion hash codes. Clearly, an infinity of potential objects share each hash code.
The Sun JVM either bases the Object hash code on a stable handle to the object or caches the initial hash code. Compaction during GC will not alter the hashCode(). Everything would break if it did.
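A small sketch of that stability, using System.identityHashCode, which returns the value the default Object.hashCode() would return. The class and method names are made up for illustration. System.gc() is only a request, so this cannot prove that the object was actually moved, only that the hash code is unchanged afterwards.

```java
public class StableHash {
    // Returns true if the identity hash code of a fresh Object is the same
    // before and after garbage collection has been requested.
    static boolean hashSurvivesGc() {
        Object obj = new Object();
        int before = System.identityHashCode(obj);
        // Allocate short-lived garbage to give the collector some work.
        for (int i = 0; i < 100_000; i++) {
            byte[] junk = new byte[1024];
        }
        System.gc(); // a hint only; the JVM may or may not collect or compact
        int after = System.identityHashCode(obj);
        return before == after;
    }

    public static void main(String[] args) {
        System.out.println("hash unchanged after GC request: " + hashSurvivesGc());
    }
}
```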
Answer by GWLlosa
Is it possible?
Yes.
Does it happen with any reasonable degree of frequency?
No.
Answer by Guss
I assume the original question is only about the hash codes generated by the default Object implementation. The fact is that hash codes must not be relied on for equality testing and are only used in some specific hash mapping operations (such as those implemented by the very useful HashMap implementation).
As such, they do not need to be truly unique; they only have to be unique enough not to generate a lot of clashes (which would make the HashMap implementation inefficient).
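To illustrate that a clash costs efficiency rather than correctness, here is a minimal sketch using two strings, "Aa" and "BB", chosen for this example because String.hashCode() gives both the value 2112. The class and method names are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class CollidingKeys {
    // Builds a map whose two keys share a hash code but are not equal.
    static Map<String, Integer> build() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);
        return map;
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true, both are 2112
        System.out.println("Aa".equals("BB"));                  // false
        Map<String, Integer> map = build();
        // HashMap falls back on equals() within a bucket, so both entries survive.
        System.out.println(map.size());                          // 2
        System.out.println(map.get("Aa") + " " + map.get("BB")); // 1 2
    }
}
```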
It is also expected that when developers implement classes that are meant to be stored in HashMaps, they will implement a hash code algorithm that has a low chance of clashes for objects of the same class (assuming you only store objects of the same class in application HashMaps). Knowing about the data makes it much easier to implement robust hashing.
Also see Ken's answer about equality necessitating identical hash codes.
Answer by Ken
Are you talking about the actual class Object or objects in general? You use both in the question. (And real-world apps generally don't create a lot of instances of Object.)
For objects in general, it is common to write a class for which you want to override equals(); and if you do that, you must also override hashCode() so that two different instances of that class that are "equal" also have the same hash code. You are likely to get a "duplicate" hash code in that case, among instances of the same class.
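A minimal sketch of that contract, using a hypothetical Point class invented for this example. Two distinct instances that are equal must report the same hash code, so a "duplicate" is the required outcome here.

```java
import java.util.Objects;

public class PointDemo {
    // Hypothetical value class: equality is based on the x and y fields.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point other = (Point) o;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            // Derived from the same fields that equals() compares.
            return Objects.hash(x, y);
        }
    }

    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);
        System.out.println(a == b);                       // false: two instances
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true, as required
    }
}
```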
Also, when hashCode() is implemented in different classes, the implementations are often based on something in the object, so you end up with less "random" values, resulting in "duplicate" hash codes among instances of different classes (whether or not those objects are "equal").
In any real-world app, it is not unusual to find two different objects with the same hash code.
Answer by P Shved
If there were as many hash codes as memory addresses, then it would take the whole memory to store the hash itself. :-)
So, yes, the hash codes should sometimes happen to coincide.

