How can I hash a string to an int using C++?

Notice: This page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/2535284/

Time: 2020-08-27 23:50:42  Source: igfitidea

Tags: c++, hash, string, cstring

Asked by zebraman

I have to write my own hash function. If I wanted to just make the simple hash function that maps each letter in the string to a numerical value (i.e. a=1, b=2, c=3, ...), is there a way I can perform this hash on a string without having to first convert it to a c-string to look at each individual char? Is there a more efficient way of hashing strings?

Accepted answer by Alex Martelli

Re the first question, sure, e.g., something like:

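unsigned int hash = 0;       // unsigned, so the left shift cannot overflow a signed int
const int offset = 'a' - 1;  // maps 'a' to 1, 'b' to 2, ...
for (std::string::const_iterator it = s.begin(); it != s.end(); ++it) {
  hash = (hash << 1) | (*it - offset);
}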

Regarding the second, there are many better ways to hash strings. E.g., see here for a few C examples (easily translatable to C++ along the lines of the snippet above).

Answer by Tim Cooper

From personal experience I know that this works and produces good distributions. (Plagiarised from http://www.cse.yorku.ca/~oz/hash.html):

djb2

this algorithm (k=33) was first reported by dan bernstein many years ago in comp.lang.c. another version of this algorithm (now favored by bernstein) uses xor: hash(i) = hash(i - 1) * 33 ^ str[i]; the magic of number 33 (why it works better than many other constants, prime or not) has never been adequately explained.

unsigned long hash(unsigned char *str) {
    unsigned long hash = 5381;
    int c;

    while (c = *str++) {
        hash = ((hash << 5) + hash) + c; /* hash * 33 + c */
    }

    return hash;
}
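
The quoted description also mentions an XOR variant, hash(i) = hash(i - 1) * 33 ^ str[i], without showing code for it. A minimal sketch of that variant for std::string might look like the following; the function name djb2_xor is made up for this example, and the starting value 5381 is simply carried over from the function above.

```cpp
#include <string>

// Sketch of the XOR variant: hash(i) = hash(i - 1) * 33 ^ str[i].
// djb2_xor is a hypothetical name, not something from the linked page.
unsigned long djb2_xor(const std::string& s) {
    unsigned long hash = 5381;  // same starting value as the function above
    for (unsigned char c : s) { // treat each char as an unsigned byte
        hash = (hash * 33) ^ c; // unsigned arithmetic wraps on overflow
    }
    return hash;
}
```

Because it takes a std::string, this version also handles strings that contain embedded '\0' characters, which the pointer-based loop would stop at.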

Answer by wheaties

You can examine each individual char of a std::string using the [] operator. However, you can look at Boost::Functional/Hash for guidance on a better hashing scheme. There is also a list of hashing functions in C located here.

Answer by Wren

Here's a C (++) hash function that I found in Stroustrup's book:

int hash(const char *str)
{
    int h = 0;
    while (*str)
       h = h << 1 ^ *str++;
    return h;
}

If you're using it for a hash table (which Stroustrup does), then you can instead return the absolute value of the hash modulo a prime number. So instead, use

    return (h > 0 ? h : -h) % N_BUCKETS;

for the last line.
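
One caveat with taking the absolute value: negating the most negative int overflows, and left-shifting a signed int can overflow on long strings. A possible workaround, sketched here under the assumption that an unsigned accumulator is acceptable, is to keep the same shift-and-XOR loop but do the arithmetic in unsigned int. The value 101 for N_BUCKETS and the name bucket_for are placeholders chosen for this example.

```cpp
#include <cstddef>

// Hypothetical table size for this sketch; any prime would do.
const std::size_t N_BUCKETS = 101;

// Same shift-and-XOR loop as above, but accumulated in an unsigned type so
// that overflow wraps and no absolute value is needed before the modulo.
std::size_t bucket_for(const char* str) {
    unsigned int h = 0;
    while (*str) {
        h = (h << 1) ^ static_cast<unsigned char>(*str++);
    }
    return h % N_BUCKETS;
}
```

The bucket numbers will not always match those of the signed version, so this is an alternative rather than a drop-in replacement.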

Answer by alfC

C++11 ships with a standard hashing function for strings.

https://en.cppreference.com/w/cpp/string/basic_string/hash

#include <string>
#include <functional> // std::hash
int main() {
    std::string s = "Hello";
    std::size_t hash = std::hash<std::string>{}(s);
}
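
Since std::hash returns a std::size_t rather than an int, a common follow-up step is to reduce the value to a smaller range. The helper below is only a sketch of that idea; bucket_of and its bucket_count parameter are names invented for this example.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper: map a string to a bucket index in [0, bucket_count).
// bucket_count must be non-zero.
std::size_t bucket_of(const std::string& s, std::size_t bucket_count) {
    return std::hash<std::string>{}(s) % bucket_count;
}
```

The actual hash values are not specified by the standard, so they may differ between standard library implementations and should not be stored or compared across programs.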

Answer by Stephen

XOR the characters together, four at a time.
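
The answer gives no code, so the following is only one possible reading of it: pack up to four characters into a 32-bit word and XOR the words together. The name xor4 and the byte order used for packing (first character in the lowest byte) are assumptions made for this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical sketch: XOR the string together in 4-byte groups.
std::uint32_t xor4(const std::string& s) {
    std::uint32_t hash = 0;
    for (std::size_t i = 0; i < s.size(); i += 4) {
        std::uint32_t word = 0;
        // Pack up to four chars into one word, first char in the lowest byte.
        for (std::size_t j = 0; j < 4 && i + j < s.size(); ++j) {
            const unsigned char byte = static_cast<unsigned char>(s[i + j]);
            word |= static_cast<std::uint32_t>(byte) << (8 * j);
        }
        hash ^= word;
    }
    return hash;
}
```

This is fast but weak: any 4-character group that appears twice cancels itself out, so "abcdabcd" hashes to the same value as the empty string.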

Answer by LUCAS

Another way for small strings:

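#include <cstring>  // std::strlen

unsigned int hash(const char* str) {
    unsigned int hash = 0;
    const std::size_t len = std::strlen(str);  // compute the length once

    for (std::size_t c = 0; c < len; ++c) {
        // Shift each character by the next one (the terminating '\0' for the
        // last character). The count is masked to 0..31 because shifting by
        // the width of the type or more is undefined behaviour.
        hash += static_cast<unsigned int>(static_cast<unsigned char>(str[c]))
                << (static_cast<unsigned char>(str[c + 1]) & 31);
    }
    return hash;
}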

Answer by wilhelmtell

#include <iostream>
#include <string>
#include <algorithm>

// a variation on Dan Bernstein's algorithm
// [http://www.cse.yorku.ca/~oz/hash.html]
// Note: no "using namespace std;" here, because this hash template would
// then be ambiguous with std::hash.
template<typename Int>
struct hash {
    hash() : acc(5381) { }
    template<typename Ch>
    void operator()(Ch ch) { acc = ((acc << 5) + acc) ^ ch; }
    operator Int() const { return acc; }
    Int acc;
};

int main()
{
    std::string s("Hello, world");
    // std::for_each returns the function object, which converts to Int for printing
    std::cout << std::hex << std::showbase
        << std::for_each(s.begin(), s.end(), hash<unsigned long long>()) << '\n';
    return 0;
}

Answer by codaddict

You can use the member functions operator[] or at of the string class, or iterators, to access the individual chars of a string object without converting it to a C-style char array.

To hash a string object to an integer, you'll have to access each individual char of the string object, which you can do as:

for (std::string::size_type i = 0; i < str.length(); i++) {
    // use str[i] or str.at(i) to access the i-th element.
}
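
To make the loop concrete, here is a small sketch that applies the mapping from the question (a=1, b=2, c=3, ...) inside it and sums the results. The name letter_sum, the choice to add the values together, and the decision to skip anything outside 'a'..'z' are all assumptions made for this example; it also assumes a character set where the lowercase letters are contiguous, as in ASCII.

```cpp
#include <string>

// Hypothetical sketch: sum the letter values a=1, b=2, ..., z=26.
// Characters outside 'a'..'z' are skipped.
int letter_sum(const std::string& str) {
    int sum = 0;
    for (std::string::size_type i = 0; i < str.length(); i++) {
        const char ch = str[i];  // str.at(i) would add bounds checking
        if (ch >= 'a' && ch <= 'z') {
            sum += ch - 'a' + 1;
        }
    }
    return sum;
}
```

A plain sum ignores character order, so anagrams such as "abc" and "cab" collide; the shift-based and multiply-based functions in the other answers avoid that.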