C++ 一个很好的向量散列函数

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/20511347/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-27 23:11:02  来源:igfitidea点击:

A good hash function for a vector

c++c++11hash

提问by Martin Kristiansen

I have some vector of integer that I would like to store efficiently in a unordered_map in c++11 my question is this:

我有一些整数向量,我想将它们有效地存储在 c++11 中的 unordered_map 中,我的问题是:

How do I best store these and optimize for .findqueries?

我如何最好地存储这些并针对.find查询进行优化?

I came up with the following hasher:

我想出了以下哈希器:

class uint32_vector_hasher {
public:
  std::size_t operator()(std::vector<uint32_t> const& vec) const {
    std::size_t ret = 0;
    for(auto& i : vec) {
      ret ^= std::hash<uint32_t>()(i);
    }
    return ret;
  }
};

and then store the objects in an unordered_mapI do however have a couple of questions

然后将对象存储在一个unordered_map我确实有几个问题

  1. how often does the hash get calculated, only one, some random number or times?
  2. Would it make sense to create a wrapper object with ==and hash functions to make memorize the hash and avoid it being calculated more than once?
  1. 多久计算一次散列,只有一个,某个随机数或次数?
  2. 创建一个带有==散列函数的包装对象来记住散列并避免它被多次计算是否有意义?

When profiling I've noticed that a rather large amount of my cpu time is spend doing lookups on the unordered maps, this is not exactly optimal :(

在分析时,我注意到我的大量 CPU 时间花在了无序映射上的查找上,这并不是最佳的 :(

回答by HolKann

So, when not wanting to use boost, Michael Blurr's comment led to the following hash function implementation:

因此,当不想使用 boost 时,Michael Blurr 的评论导致了以下哈希函数实现:

std::size_t operator()(std::vector<uint32_t> const& vec) const {
  std::size_t seed = vec.size();
  for(auto& i : vec) {
    seed ^= i + 0x9e3779b9 + (seed << 6) + (seed >> 2);
  }
  return seed;
}

Seems to work.

似乎工作。