How is hashCode() calculated in Java
Note: this page is derived from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must comply with the same license, cite the original URL and authors, and attribute the content to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/2427631/
Asked by Jothi
What value does the hashCode() method return in Java?
I read that it is a memory reference of an object... The hash value for new Integer(1) is 1; the hash value for String("a") is 97.
I am confused: is it ASCII, or what type of value is it?
Accepted answer by Andrew Hare
A hashcode is an integer value that represents the state of the object upon which it was called. That is why an Integer that is set to 1 will return a hashcode of 1: an Integer's hashcode and its value are the same thing. A character's hashcode is equal to its character code, which for ASCII characters is the ASCII code. If you write a custom type, you are responsible for creating a good hashCode implementation that best represents the state of the current instance.
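The values mentioned in the question can be checked directly. The sketch below also shows one common way to write hashCode for a custom type; the Point class is a hypothetical example, not something from the original thread, and Objects.hash is only one of several reasonable ways to combine fields.

```java
import java.util.Objects;

// Hypothetical value type, used only to illustrate a state-based hashCode().
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Point)) return false;
        Point p = (Point) other;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        // Combine every field that equals() compares.
        return Objects.hash(x, y);
    }
}

class HashCodeBasics {
    public static void main(String[] args) {
        System.out.println(Integer.valueOf(1).hashCode());     // 1
        System.out.println(Character.valueOf('a').hashCode()); // 97
        System.out.println("a".hashCode());                    // 97
        // Equal state gives equal hash codes, even for distinct instances.
        System.out.println(new Point(1, 2).hashCode() == new Point(1, 2).hashCode()); // true
    }
}
```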
Answer by danben
The value returned by hashCode() is by no means guaranteed to be the memory address of the object. I'm not sure of the implementation in the Object class, but keep in mind that most classes override hashCode() such that two instances that are semantically equivalent (but are not the same instance) hash to the same value. This is especially important if the classes may be used within another data structure, such as Set, that relies on hashCode being consistent with equals.
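To make the consistency requirement concrete, here is a minimal sketch with two hypothetical key classes (NameOnlyEquals and NameKey are invented for this illustration). Both define equals the same way, but only the second also overrides hashCode.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical key that overrides equals() but NOT hashCode().
final class NameOnlyEquals {
    final String name;

    NameOnlyEquals(String name) { this.name = name; }

    @Override
    public boolean equals(Object o) {
        return o instanceof NameOnlyEquals && ((NameOnlyEquals) o).name.equals(name);
    }
    // hashCode() is inherited from Object, so equal instances usually hash differently.
}

// Hypothetical key that keeps hashCode() consistent with equals().
final class NameKey {
    final String name;

    NameKey(String name) { this.name = name; }

    @Override
    public boolean equals(Object o) {
        return o instanceof NameKey && ((NameKey) o).name.equals(name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }
}

class SetConsistencyDemo {
    // Stores one object in a HashSet and looks it up through an equal probe object.
    static boolean foundInSet(Object stored, Object probe) {
        Set<Object> set = new HashSet<>();
        set.add(stored);
        return set.contains(probe);
    }

    public static void main(String[] args) {
        // Almost always false: the probe is looked up in the wrong bucket.
        System.out.println(foundInSet(new NameOnlyEquals("a"), new NameOnlyEquals("a")));
        // true: equal objects share a hash code, so the lookup succeeds.
        System.out.println(foundInSet(new NameKey("a"), new NameKey("a")));
    }
}
```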
No hashCode() uniquely identifies an instance of an object no matter what. If you want a hashcode based on the underlying pointer (e.g. in Sun's implementation), use System.identityHashCode(); this delegates to the default hashCode method regardless of whether it has been overridden.
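A small sketch of that difference, assuming a hypothetical class Fixed whose hashCode is overridden to a constant. The identity hash values themselves vary from run to run, so no concrete numbers are shown for them.

```java
class IdentityHashDemo {
    // Hypothetical class whose hashCode() is overridden to a fixed value.
    static final class Fixed {
        @Override
        public int hashCode() {
            return 42;
        }
    }

    public static void main(String[] args) {
        Fixed f = new Fixed();
        System.out.println(f.hashCode());               // 42, the overridden value
        System.out.println(System.identityHashCode(f)); // the default hash; ignores the override

        // For a class that does not override hashCode(), both calls agree.
        Object plain = new Object();
        System.out.println(plain.hashCode() == System.identityHashCode(plain)); // true

        // The identity hash of the null reference is defined to be zero.
        System.out.println(System.identityHashCode(null)); // 0
    }
}
```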
Nevertheless, even System.identityHashCode() can return the same hash for multiple objects. See the comments for an explanation, but here is an example program that continuously generates objects until it finds two with the same System.identityHashCode(). When I run it, it quickly finds two matching System.identityHashCode() values, on average after adding about 86,000 Long wrapper objects (and Integer wrappers for the key) to a map.
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class IdentityHashCollision {
    public static void main(String[] args) {
        Map<Integer, Long> map = new HashMap<>();
        Random generator = new Random();
        // Boxing a random long creates a new Long object in practice
        // (only values from -128 to 127 are cached).
        Long object = generator.nextLong();
        // We use the identityHashCode as the key into the map.
        // This makes it easier to check if any other objects
        // have the same key.
        int hash = System.identityHashCode(object);
        while (!map.containsKey(hash)) {
            map.put(hash, object);
            object = generator.nextLong();
            hash = System.identityHashCode(object);
        }
        // The loop ends when a new object's identity hash is already a key in the map.
        System.out.println("Identical maps for size: " + map.size());
        System.out.println("First object value: " + object);
        System.out.println("Second object value: " + map.get(hash));
        System.out.println("First object identityHash: " + System.identityHashCode(object));
        System.out.println("Second object identityHash: " + System.identityHashCode(map.get(hash)));
    }
}
Example output:
Identical maps for size: 105822
First object value: 7446391633043190962
Second object value: -8143651927768852586
First object identityHash: 2134400190
Second object identityHash: 2134400190
Answer by alexvetter
The hashCode() method is often used for identifying an object. I think the Object implementation returns the pointer (not a real pointer but a unique id or something like that) of the object. But most classes override the method, like the String class. Two String objects do not have the same pointer, but they are equal:
new String("a").hashCode() == new String("a").hashCode()
I think the most common use for hashCode() is in Hashtable, HashSet, etc.
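As a quick illustration of both points, the sketch below creates two distinct String instances with the same content and adds them to a HashSet. The class name is made up for this example.

```java
import java.util.HashSet;
import java.util.Set;

class StringEqualityDemo {
    public static void main(String[] args) {
        String first = new String("a");
        String second = new String("a");

        System.out.println(first == second);                       // false: two separate objects
        System.out.println(first.equals(second));                  // true: same content
        System.out.println(first.hashCode() == second.hashCode()); // true: hash is based on content

        // A hash-based collection treats the two instances as one element.
        Set<String> set = new HashSet<>();
        set.add(first);
        set.add(second);
        System.out.println(set.size()); // 1
    }
}
```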
Edit: (due to a recent downvote and based on an article I read about JVM parameters)
With the JVM parameter -XX:hashCode you can change how the hashCode is calculated (see Issue 222 of the Java Specialists' Newsletter).
HashCode==0: Simply returns random numbers with no relation to where in memory the object is found. As far as I can make out, the global read-write of the seed is not optimal for systems with lots of processors.
HashCode==1: Counts up the hash code values, not sure at what value they start, but it seems quite high.
HashCode==2: Always returns the exact same identity hash code of 1. This can be used to test code that relies on object identity. The reason why JavaChampionTest returned Kirk's URL in the example above is that all objects were returning the same hash code.
HashCode==3: Counts up the hash code values, starting from zero. It does not look to be thread safe, so multiple threads could generate objects with the same hash code.
HashCode==4: This seems to have some relation to the memory location at which the object was created.
HashCode>=5: This is the default algorithm for Java 8 and has a per-thread seed. It uses Marsaglia's xor-shift scheme to produce pseudo-random numbers.
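One way to observe these modes is a tiny probe program that prints the identity hash codes of a few fresh objects, which can then be run under different -XX:hashCode settings. This is only a sketch: the class name is invented, and whether the flag is accepted as-is depends on the JVM version.

```java
class IdentityHashProbe {
    // Returns the identity hash codes of n freshly allocated objects.
    static int[] sample(int n) {
        int[] hashes = new int[n];
        for (int i = 0; i < n; i++) {
            hashes[i] = System.identityHashCode(new Object());
        }
        return hashes;
    }

    public static void main(String[] args) {
        for (int h : sample(5)) {
            System.out.println(h);
        }
    }
}
```

Going by the list above, running it with -XX:hashCode=2 should print the same value five times, while the default mode should print unrelated-looking numbers. On some newer JDKs the flag may have to be unlocked as an experimental option first.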
Answer by Ben Fowler
Object.hashCode(), if memory serves correctly (check the JavaDoc for java.lang.Object), is implementation-dependent, and will change depending on the object (the Sun JVM derives the value from the value of the reference to the object).
Note that if you are implementing any nontrivial object and want to store instances correctly in a HashMap or HashSet, you MUST override hashCode() and equals(). hashCode() can do whatever you like (it's entirely legal, but suboptimal, to have it return 1), but it's vital that if your equals() method returns true, then the values returned by hashCode() for both objects are equal.
Confusion about hashCode() and equals(), and a lack of understanding of them, is a big source of bugs. Make sure that you thoroughly familiarize yourself with the JavaDocs for Object.hashCode() and Object.equals(), and I guarantee that the time spent will pay for itself.
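The "legal but suboptimal" case can be sketched as follows. ConstantHashKey is a hypothetical class whose hashCode always returns 1: lookups still give correct results because equals decides the final match, but every key competes for the same bucket, so the map loses its usual constant-time behavior.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key with a constant hash code.
final class ConstantHashKey {
    final int id;

    ConstantHashKey(int id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        return o instanceof ConstantHashKey && ((ConstantHashKey) o).id == id;
    }

    @Override
    public int hashCode() {
        return 1; // legal, but every key lands in the same bucket
    }
}

class ConstantHashDemo {
    // Builds a map with n distinct keys that all share one hash code.
    static Map<ConstantHashKey, String> build(int n) {
        Map<ConstantHashKey, String> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new ConstantHashKey(i), "value-" + i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<ConstantHashKey, String> map = build(1000);
        System.out.println(map.size());                        // 1000
        System.out.println(map.get(new ConstantHashKey(500))); // value-500
    }
}
```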
Answer by user207421
"I read that it is a memory reference of an object.."
No. Object.hashCode() used to return a memory address about 14 years ago. Not since.
"what type of value is"
What it is depends entirely on what class you're talking about and whether or not it has overridden Object.hashCode().
Answer by Peter Lawrey
If you want to know how they are implemented, I suggest you read the source. If you are using an IDE, you can jump straight from a method you are interested in to its implementation (usually a modifier key plus a click on the method name). If you cannot do that, you can google for the source.
For example, Integer.hashCode() is implemented as
    public int hashCode() {
        return value;
    }
and String.hashCode() as
    public int hashCode() {
        int h = hash;              // cached value; 0 means "not computed yet"
        if (h == 0) {
            int off = offset;
            char val[] = value;
            int len = count;
            for (int i = 0; i < len; i++) {
                h = 31*h + val[off++];
            }
            hash = h;              // cache the result for later calls
        }
        return h;
    }
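The loop above can be reproduced by hand, which also explains why "a" hashes to 97. The helper below is a sketch that recomputes the same 31-based recurrence without the cached field; polyHash is a name invented for this example.

```java
class PolyHashDemo {
    // Recomputes h = 31*h + c over every char, mirroring the loop in String.hashCode().
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(polyHash("a"));  // 97
        System.out.println(polyHash("ab")); // 3105 = 97 * 31 + 98
        System.out.println(polyHash("hello") == "hello".hashCode()); // true
    }
}
```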
Answer by arjun kumar mehta
    // Decompiled code; parameter and variable names are the decompiler's.
    public static int murmur3_32(int paramInt1, char[] paramArrayOfChar, int paramInt2, int paramInt3) {
        int i = paramInt1;   // running hash, starts at the seed
        int j = paramInt2;   // current index into the char array
        int k = paramInt3;   // number of chars still to process

        int m;
        // body: consume two chars (32 bits) per iteration
        while (k >= 2) {
            // the decompiler printed the shift distance as '0' (48), which is 16 for an int shift
            m = paramArrayOfChar[(j++)] & 0xFFFF | paramArrayOfChar[(j++)] << 16;

            k -= 2;

            m *= -862048943;                // 0xcc9e2d51
            m = Integer.rotateLeft(m, 15);
            m *= 461845907;                 // 0x1b873593

            i ^= m;
            i = Integer.rotateLeft(i, 13);
            i = i * 5 + -430675100;         // 0xe6546b64
        }

        // tail: one char left over
        if (k > 0) {
            m = paramArrayOfChar[j];

            m *= -862048943;
            m = Integer.rotateLeft(m, 15);
            m *= 461845907;
            i ^= m;
        }

        // finalization: mix in the input length in bytes
        i ^= paramInt3 * 2;

        i ^= i >>> 16;
        i *= -2048144789;                   // 0x85ebca6b
        i ^= i >>> 13;
        i *= -1028477387;                   // 0xc2b2ae35
        i ^= i >>> 16;

        return i;
    }
If you are really curious to learn, go through the code above, which is available in Hashing.class.
Here the first parameter, HASHING_SEED, is calculated by the code below:
    {
        long nanos = System.nanoTime();
        long now = System.currentTimeMillis();
        int SEED_MATERIAL[] = {
            System.identityHashCode(String.class),
            System.identityHashCode(System.class),
            (int) (nanos >>> 32),
            (int) nanos,
            (int) (now >>> 32),
            (int) now,
            (int) (System.nanoTime() >>> 2)
        };
        // Use murmur3 to scramble the seeding material.
        // Inline implementation to avoid loading classes
        int h1 = 0;
        // body
        for (int k1 : SEED_MATERIAL) {
            k1 *= 0xcc9e2d51;
            k1 = (k1 << 15) | (k1 >>> 17);
            k1 *= 0x1b873593;
            h1 ^= k1;
            h1 = (h1 << 13) | (h1 >>> 19);
            h1 = h1 * 5 + 0xe6546b64;
        }
        // tail (always empty, as body is always 32-bit chunks)
        // finalization
        h1 ^= SEED_MATERIAL.length * 4;
        // finalization mix force all bits of a hash block to avalanche
        h1 ^= h1 >>> 16;
        h1 *= 0x85ebca6b;
        h1 ^= h1 >>> 13;
        h1 *= 0xc2b2ae35;
        h1 ^= h1 >>> 16;
        HASHING_SEED = h1;
    }
The second parameter is the char array of the String, the third is always 0, and the fourth is the char array length.
The calculation above applies only to hashing Strings; it is an internal routine in Hashing.class and is separate from the 31-based formula of String.hashCode().
For every Integer, the hash code is its integer value. For a char, it is the character code, which is the ASCII code for ASCII characters.
Answer by Sam
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language.)
https://docs.oracle.com/javase/8/docs/api/java/lang/Object.html#hashCode--
Answer by Jordan Sheinfeld
From the OpenJDK sources (JDK 8):
The hashCode option defaults to 5, which selects the algorithm used to generate hash codes:
product(intx, hashCode, 5,
"(Unstable) select hashCode generation algorithm")
Each thread's generator state consists of some constant data and a randomly generated number that serves as a per-thread seed:
// thread-specific hashCode stream generator state - Marsaglia shift-xor form
_hashStateX = os::random() ;
_hashStateY = 842502087 ;
_hashStateZ = 0x8767 ; // (int)(3579807591LL & 0xffff) ;
_hashStateW = 273326509 ;
Then, this function creates the hashCode (the option defaults to 5, as specified above):
static inline intptr_t get_next_hash(Thread * Self, oop obj) {
  intptr_t value = 0 ;
  if (hashCode == 0) {
     // This form uses an unguarded global Park-Miller RNG,
     // so it's possible for two threads to race and generate the same RNG.
     // On MP system we'll have lots of RW access to a global, so the
     // mechanism induces lots of coherency traffic.
     value = os::random() ;
  } else
  if (hashCode == 1) {
     // This variation has the property of being stable (idempotent)
     // between STW operations.  This can be useful in some of the 1-0
     // synchronization schemes.
     intptr_t addrBits = cast_from_oop<intptr_t>(obj) >> 3 ;
     value = addrBits ^ (addrBits >> 5) ^ GVars.stwRandom ;
  } else
  if (hashCode == 2) {
     value = 1 ;            // for sensitivity testing
  } else
  if (hashCode == 3) {
     value = ++GVars.hcSequence ;
  } else
  if (hashCode == 4) {
     value = cast_from_oop<intptr_t>(obj) ;
  } else {
     // Marsaglia's xor-shift scheme with thread-specific state
     // This is probably the best overall implementation -- we'll
     // likely make this the default in future releases.
     unsigned t = Self->_hashStateX ;
     t ^= (t << 11) ;
     Self->_hashStateX = Self->_hashStateY ;
     Self->_hashStateY = Self->_hashStateZ ;
     Self->_hashStateZ = Self->_hashStateW ;
     unsigned v = Self->_hashStateW ;
     v = (v ^ (v >> 19)) ^ (t ^ (t >> 8)) ;
     Self->_hashStateW = v ;
     value = v ;
  }

  value &= markOopDesc::hash_mask;
  if (value == 0) value = 0xBAD ;
  assert (value != markOopDesc::no_hash, "invariant") ;
  TEVENT (hashCode: GENERATE) ;
  return value;
}
So we can see that, at least in JDK 8, the default is a pseudo-random number produced from thread-specific generator state.
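For readers who prefer Java, the default branch can be approximated as follows. This is a sketch, not the JVM's code: the class name is invented, the seed is passed in explicitly instead of coming from os::random(), and the 31-bit HASH_MASK is an assumption standing in for markOopDesc::hash_mask, whose width depends on the platform.

```java
class XorShiftHashSketch {
    // Assumption: a 31-bit mask standing in for markOopDesc::hash_mask.
    static final int HASH_MASK = 0x7FFFFFFF;

    // Generator state, initialized like _hashStateX.._hashStateW in the excerpt above.
    private int x;
    private int y = 842502087;
    private int z = 0x8767;
    private int w = 273326509;

    XorShiftHashSketch(int seed) {
        this.x = seed; // the JVM seeds this per thread with os::random()
    }

    // One step of the xor-shift scheme; >>> is used because the C++ state is unsigned.
    int nextHash() {
        int t = x;
        t ^= (t << 11);
        x = y;
        y = z;
        z = w;
        int v = w;
        v = (v ^ (v >>> 19)) ^ (t ^ (t >>> 8));
        w = v;
        int value = v & HASH_MASK;
        return value == 0 ? 0xBAD : value; // zero is reserved, as in the original
    }

    public static void main(String[] args) {
        XorShiftHashSketch generator = new XorShiftHashSketch(12345);
        for (int i = 0; i < 3; i++) {
            System.out.println(generator.nextHash());
        }
    }
}
```

Because the result depends only on the generator state and not on the object, the hash has no relation to the object's address, which matches the conclusion above.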