Java 中的散列键

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/13208108/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-10-31 11:52:56  来源:igfitidea点击:

Hashing Keys in Java

javahashmaphashtablestring-hashing

提问by user1785771

In java, when I use a String as a key for Hashmap I get a little different result than when I use the string hashcode as a key in the HashMap.

在 java 中,当我使用 String 作为 Hashmap 的键时,我得到的结果与使用字符串 hashcode 作为 HashMap 中的键时的结果略有不同。

Any insight?

任何见解?

回答by Jon Skeet

when I use the string hashcode as a key in the HashMap.

当我使用字符串哈希码作为 HashMap 中的键时。

You mustn'tuse the hash code itself as the key. Hash codes aren't intended to be unique - it's entirely permitted for two non-equal values to have the same hash code. You should use the string itselfas a key. The map will then compare hash codes first (to narrow down the candidate matches quickly) and then compare with equalsfor genuinestring equality.

不得使用哈希码本身作为密钥。哈希码并不是唯一的 - 两个不相等的值具有相同的哈希码是完全允许的。您应该使用字符串本身作为键。然后,地图会比较哈希码第一(以缩小候选匹配快速),然后用比较equals真正的字符串相等。

Of course, that's assuming your code really is as your question makes it, e.g.

当然,这是假设你的代码真的是你的问题,例如

HashMap<String, String> goodMap = new HashMap<String, String>();
goodMap.put("foo", "bar");

HashMap<Integer, String> badMap = new HashMap<Integer, String>();
badMap.put("foo".hashCode(), "bar");

If that's really what your code looks like, just use HashMap<String, String>instead.

如果这确实是您的代码的样子,请HashMap<String, String>改用。

From the docs for Object.hashCode()(emphasis mine):

来自Object.hashCode()(强调我的)的文档:

The general contract of hashCode is:

  • Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
  • If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
  • It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results.However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.

hashCode 的总合约为:

  • 每当在 Java 应用程序执行期间在同一对象上多次调用它时,hashCode 方法必须始终返回相同的整数,前提是对象上的 equals 比较中使用的信息没有被修改。该整数不需要从应用程序的一次执行到同一应用程序的另一次执行保持一致。
  • 如果根据 equals(Object) 方法两个对象相等,则对两个对象中的每一个调用 hashCode 方法必须产生相同的整数结果。
  • 如果根据 equals(java.lang.Object) 方法两个对象不相等,则不需要对两个对象中的每一个调用 hashCode 方法必须产生不同的整数结果。但是,程序员应该意识到为不相等的对象生成不同的整数结果可能会提高哈希表的性能。

回答by JB Nizet

Of course. Different Strings can have the same hashCode, so if you store two such strings as keys in a map, you'll have two entries (since the strings are different). Whareas if you use their hashCode as the key, you'll have only one entry (since their hashCode is the same).

当然。不同的字符串可以具有相同的哈希码,因此如果您将两个这样的字符串作为键存储在映射中,您将有两个条目(因为字符串不同)。如果您使用它们的 hashCode 作为键,那么您将只有一个条目(因为它们的 hashCode 是相同的)。

The hashCode isn't used to tell if two keys are equal. It's only used to assign a bucket to the key. Once the bucket is found, every key contained in the bucket is compared to the new key with equals, and the key is added to the bucket if no equal key can be found.

hashCode 不用于判断两个键是否相等。它仅用于为键分配一个桶。一旦找到桶,桶中包含的每个键都与具有相等的新键进行比较,如果找不到相等的键,则将键添加到桶中。

回答by Rohit Jain

The problem is that, even if two objects are different, doesn't mean that their hashcodes are also different.

问题是,即使两个对象不同,也不意味着它们的哈希码也不同。

Two different objects can share the same hashcode. So, you shouldn't have them as a HashMap key.

两个不同的对象可以共享相同的哈希码。因此,您不应该将它们作为 HashMap 键。

Also, because hash codes returned from Object.hashCode()method are of type int, you can only have 2^32different values. That's why you will have "collisions" depending on the hashing algorithm, for different objects.

另外,因为从Object.hashCode()method返回的哈希码是 type int,所以你只能有2^32不同的值。这就是为什么根据散列算法,对于不同的对象,您将有“冲突”的原因。

In short: -

简而言之: -

!obj.equals(obj1)doesn't ensures that obj.hashCode() != obj1.hashCode().

!obj.equals(obj1)并不能确保obj.hashCode() != obj1.hashCode()

回答by Atif Imran

HashCodescan be same or different for same String so be careful with that. May be this is why you are getting a different result.

Here's another SO questionon it. See Jon Skeet's accepted answer.

HashCodes同一个字符串可以相同或不同,所以要小心。可能这就是为什么你得到不同的结果。

这是关于它的另一个 SO 问题。请参阅 Jon Skeet 接受的答案。

回答by Floris

You canuse the hash code as the key only if the hash function is a perfect hash (see e.g.GPERF). As long as your key objects don't reside in memory you are correct that you will save memory.

只有当散列函数是完美散列时,您才可以使用散列代码作为键(参见例如GPERF)。只要您的关键对象不驻留在内存中,您就可以节省内存。