Is the int value of Java's String.hashCode() unique?

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/25736486/

Tags: java, string, unique, hashcode

Asked by congsg2014

I ran into a problem a few days ago. I have tens of millions of words, all strings, and I have decided to keep them in a database and use an index to keep them unique. I do not want to compare the original words to enforce uniqueness. Can I rely on the hashCode() method of a string being unique? And will the value stay the same if I use another laptop, run at a different time, or something like that?

Answered by paxdiablo

Unique, no. By nature, hash values are not guaranteed to be unique.

Any system with an arbitrarily large number of possible inputs and a limited number of outputs will have collisions.

So, you won't be able to use a unique database key to store them if it's based only on the hash code. You can, however, use a non-unique key to store them.
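One way to picture the non-unique key: treat the hash code as a bucket label and compare the original strings only within a bucket. The following is a minimal in-memory sketch, not a database design; the class name `HashBucketIndex` and its methods are hypothetical, and a real database would presumably use a non-unique index on a hash column rather than a `HashMap`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: the hash code is a NON-unique key,
// so each key maps to a small list of words that share it.
public class HashBucketIndex {
    private final Map<Integer, List<String>> buckets = new HashMap<>();

    // Returns true if the word was new, false if it was already stored.
    public boolean add(String word) {
        List<String> bucket =
                buckets.computeIfAbsent(word.hashCode(), k -> new ArrayList<>());
        if (bucket.contains(word)) {   // full comparison, but only inside one bucket
            return false;
        }
        bucket.add(word);
        return true;
    }

    // Total number of distinct words stored across all buckets.
    public int size() {
        int n = 0;
        for (List<String> bucket : buckets.values()) {
            n += bucket.size();
        }
        return n;
    }
}
```

With this layout, two different words that happen to share a hash code are both kept, while a true duplicate is still rejected.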

In reply to your second question about whether different versions of Java will generate different hash codes for the same string, no.

Provided a Java implementation follows the Oracle documentation (otherwise it's not really a Java implementation), the result will be consistent across all implementations. The Oracle docs for String.hashCode specify a fixed formula for calculating the hash:

s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
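The formula can be checked by hand. The sketch below evaluates it with Horner's rule (multiply the running total by 31, then add the next character) and compares the result with the built-in method; the class and method names are made up for illustration.

```java
public class StringHashFormula {
    // Evaluates s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
    // using Horner's rule in plain int arithmetic.
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String word = "hello";
        // Both sides follow the same documented formula.
        System.out.println(polynomialHash(word) == word.hashCode()); // prints true
    }
}
```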

You may want to check this is still the case if you're using wildly disparate versions of Java (such as 1.2 vs 8), but it's been like that for a long time, at least since 1.5.

Answered by Manjunath Anand

Below is the hashCode computation that a JVM performs for a String. As stated, it is calculated purely from the individual characters and their positions in the String; nothing in it depends on the JVM or on the type of machine running the JVM, so neither can alter the hash code.

This is also one of the reasons why the String class is declared final (it cannot be extended, which supports immutability), so that no one can alter its behaviour.

The specification reads:

public int hashCode()

Returns a hash code for this string. The hash code for a String object is computed as

s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]

using int arithmetic, where s[i] is the ith character of the string, n is the length of the string, and ^ indicates exponentiation. (The hash value of the empty string is zero.)
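"Using int arithmetic" means the sum silently wraps around on overflow, so only the low-order 32 bits of the exact polynomial survive. A small sketch of that, assuming nothing beyond the standard `java.math.BigInteger` class (the class and method names here are hypothetical):

```java
import java.math.BigInteger;

public class IntArithmeticDemo {
    // Exact value of the polynomial, computed without any overflow.
    static BigInteger exactPolynomial(String s) {
        BigInteger sum = BigInteger.ZERO;
        BigInteger base = BigInteger.valueOf(31);
        for (int i = 0; i < s.length(); i++) {
            sum = sum.multiply(base).add(BigInteger.valueOf(s.charAt(i)));
        }
        return sum;
    }

    public static void main(String[] args) {
        String s = "tens of millions of words";
        BigInteger exact = exactPolynomial(s);
        System.out.println(exact);            // far too large for an int
        System.out.println(exact.intValue()); // keeps only the low-order 32 bits
        System.out.println(s.hashCode());     // same value as the line above
    }
}
```

Because many different exact values share the same low-order 32 bits, this wrap-around is one more way to see why distinct strings can end up with the same hash code.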

Answered by Abhishek Gharai

No. A Java string can hold up to 2,147,483,647 (2^31 - 1) characters, and every character can vary, so the number of possible strings is enormous, while an int only ranges from -2,147,483,648 to 2,147,483,647. Uniqueness is therefore impossible. The hash code of a string is computed with this formula:

s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1].

Example:

If you create two strings, "FB" and "Ea", their hash codes will be the same.
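The arithmetic behind this collision is short enough to check by hand, and the sketch below confirms it; the class name and the `collide` helper are made up for illustration.

```java
public class CollisionDemo {
    // True when two different strings share the same hash code.
    static boolean collide(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // 'F' = 70, 'B' = 66: 70*31 + 66 = 2236
        // 'E' = 69, 'a' = 97: 69*31 + 97 = 2236
        System.out.println("FB".hashCode());     // prints 2236
        System.out.println("Ea".hashCode());     // prints 2236
        System.out.println(collide("FB", "Ea")); // prints true
    }
}
```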
