Java: How to calculate the hash code of a string by hand?

Notice: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/3795400/

Time: 2020-08-14 05:04:05  Source: igfitidea

java hash

Asked by thomascirca

I was wondering how to calculate the hash code for a given string by hand. I understand that in Java, you can do something like:

String me = "What you say what you say what?";  
long whatever = me.hashCode();

That's all good and dandy, but I was wondering how to do it by hand. I know the given formula for calculating the hash code of a string is something like:

S0 X 31 ^ (n-1) + S1 X 31 ^ (n-2) + .... + S(n-2) X 31 + S(n-1)  

Where S indicates the character in the string, and n is the length of the string. Using 16 bit unicode then, the first character from string me would be computed as:

87 X (31 ^ 34)

However, that creates an insanely large number. I can't imagine adding all the characters together like that. So, to calculate the lowest-order 32 bits of the result, what would I do? The long whatever from above equals -957986661, and I'm not sure how to calculate that.

Accepted answer by dty

Take a look at the source code of java.lang.String.

/**
 * Returns a hash code for this string. The hash code for a
 * <code>String</code> object is computed as
 * <blockquote><pre>
 * s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
 * </pre></blockquote>
 * using <code>int</code> arithmetic, where <code>s[i]</code> is the
 * <i>i</i>th character of the string, <code>n</code> is the length of
 * the string, and <code>^</code> indicates exponentiation.
 * (The hash value of the empty string is zero.)
 *
 * @return  a hash code value for this object.
 */
public int hashCode() {
    int h = hash;
    int len = count;
    if (h == 0 && len > 0) {
        int off = offset;
        char val[] = value;
        for (int i = 0; i < len; i++) {
            h = 31*h + val[off++];
        }
        hash = h;
    }
    return h;
}
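
To connect the loop above with the formula from the question, here is a minimal sketch that evaluates the polynomial exactly with BigInteger, keeps only the low-order 32 bits, and compares the result with the int loop. The class and method names (HashByHand, exactPolynomial, hornerInt) are made up for illustration and are not part of the JDK.

```java
import java.math.BigInteger;

class HashByHand {
    // Exact value of s[0]*31^(n-1) + ... + s[n-1], with no overflow.
    static BigInteger exactPolynomial(String s) {
        BigInteger base = BigInteger.valueOf(31);
        BigInteger sum = BigInteger.ZERO;
        int n = s.length();
        for (int i = 0; i < n; i++) {
            BigInteger term = BigInteger.valueOf(s.charAt(i)).multiply(base.pow(n - 1 - i));
            sum = sum.add(term);
        }
        return sum;
    }

    // The same polynomial in int arithmetic (Horner's rule); overflow wraps silently.
    static int hornerInt(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String me = "What you say what you say what?";
        BigInteger exact = exactPolynomial(me);
        System.out.println("exact value:     " + exact);
        // intValue() keeps only the low-order 32 bits of the exact value.
        System.out.println("low 32 bits:     " + exact.intValue());
        System.out.println("Horner in int:   " + hornerInt(me));
        System.out.println("String.hashCode: " + me.hashCode());
    }
}
```

The last three printed values should agree: wrapping on overflow at every step of the int loop discards the same high-order bits that truncating the exact sum discards once at the end.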

Answer by MAK

Most hash functions of this sort calculate the hash value modulo some large number (e.g. a large prime). This avoids overflow and keeps the values returned by the function within a specified range. But it also means that an infinite range of input values maps to a finite set of possible hash values (i.e. [0, modulus)), hence the problem of hash collisions.

In this case, the code would look something like this:

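public int hash(String x) {
    int hashcode = 0;
    int MOD = 10007;   // modulus: results stay in [0, MOD)
    int shift = 29;    // multiplier applied at each step
    for (int i = 0; i < x.length(); i++) {
        // Reduce after every step so intermediate values never overflow.
        hashcode = ((shift * hashcode) % MOD + x.charAt(i)) % MOD;
    }
    return hashcode;
}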

Exercise for the reader:

See the code for the hashCode function of java.lang.String. Can you see why it does not use a modulus explicitly?
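
One way to explore the exercise is a small sketch that runs the explicit-modulus loop with a multiplier of 31 and a modulus of 2^32, accumulating in a long, and then casts the result to int. The class and method names (ImplicitModulus, hashMod2Pow32) are hypothetical, and the choice of 2^32 as the modulus is an assumption made for this experiment.

```java
class ImplicitModulus {
    // Hypothetical helper: explicit-modulus hash with MOD = 2^32 and multiplier 31.
    static long hashMod2Pow32(String x) {
        final long MOD = 1L << 32;
        long h = 0;
        for (int i = 0; i < x.length(); i++) {
            // h < 2^32, so 31 * h + char fits comfortably in a long.
            h = (31 * h + x.charAt(i)) % MOD;
        }
        return h; // always in [0, 2^32)
    }

    public static void main(String[] args) {
        String s = "What you say what you say what?";
        long unsigned = hashMod2Pow32(s);
        // Casting to int reinterprets the same 32 bits as a signed value.
        System.out.println((int) unsigned + " vs " + s.hashCode());
    }
}
```

If the two printed numbers match, that suggests int overflow already behaves like a reduction modulo 2^32, with the result read as a signed 32-bit value.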

Answer by Vivek Singh

The following statements find the hash code of a string:

String str="Hi";

int a = str.hashCode();//returns 2337

Let's check exactly how it is calculated:

HashCode = s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]

The character at position 0 is H, the character at position 1 is i, and the string length is 2.

==> H*31^(2-1) + i*31^(2-2)

The ASCII code of H is 72 and that of i is 105, which means:

==> 72 * 31 + 105 * 1 (anything to the power 0 is 1)

==> 2232 + 105 = 2337
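
The same arithmetic can be traced in code. This is a small sketch that prints each step of the running calculation for "Hi"; the class and method names (HiHashSteps, hashWithSteps) are invented for this example.

```java
class HiHashSteps {
    // Computes the hash one character at a time, printing every intermediate step.
    static int hashWithSteps(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            int next = 31 * h + c;
            System.out.println("31 * " + h + " + " + (int) c + " ('" + c + "') = " + next);
            h = next;
        }
        return h;
    }

    public static void main(String[] args) {
        int result = hashWithSteps("Hi");
        // Prints:
        // 31 * 0 + 72 ('H') = 72
        // 31 * 72 + 105 ('i') = 2337
        System.out.println(result == "Hi".hashCode()); // true
    }
}
```

Multiplying the running value by 31 at each step is equivalent to the power form above, because every earlier character ends up multiplied by 31 once for each character that follows it.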

Source: https://www.tutorialgateway.org/find-string-hashcode-in-java/
