C++ 使用自定义哈希函数插入 unordered_set

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/15869066/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-27 19:50:31  来源:igfitidea点击:

Inserting into an unordered_set with custom hash function

c++c++11unordered-set

提问by user2052561

I have the following code to make an unordered_set<Interval>. This compiles fine.

我有以下代码来制作unordered_set<Interval>. 这编译得很好。

struct Interval {
  unsigned int begin;
  unsigned int end;
  bool updated;   //true if concat.  initially false
  int patternIndex;  //pattern index. valid for single pattern
  int proteinIndex;   //protein index.  for retrieving the pattern
};

struct Hash {
  size_t operator()(const Interval &interval);
};

size_t Hash::operator()(const Interval &interval){
  string temp = to_string(interval.begin) + to_string(interval.end) + to_string(interval.proteinIndex);
  return hash<string>()(temp);
}

unordered_set<Interval, string, Hash> test;

However, I cannot compile when I try to insert using this code:

但是,当我尝试使用以下代码插入时无法编译:

for(list<Interval>::iterator i = concat.begin(); i != concat.end(); ++i){
  test.insert((*i));
}

Moreover, I cannot determine what the problem is from the error messages, for example:

此外,我无法从错误消息中确定问题出在哪里,例如:

note: candidate is:
note: size_t Hash::operator()(const Interval&)
note:   candidate expects 1 argument, 2 provided  

I thought I only provided 1 argument...

我以为我只提供了 1 个参数...

What is the problem with my insertion code?

我的插入代码有什么问题?



Here's the new instantiation code: unordered_set<Interval, Hash> test;However, I'm still receiving a slew of error messages, for example:

这是新的实例化代码:unordered_set<Interval, Hash> test;但是,我仍然收到大量错误消息,例如:

note: candidate is:
note: size_t Hash::operator()(const Interval&) <near match>
note:   no known conversion for implicit ‘this' parameter from ‘const Hash*' to ‘Hash*'

回答by Andy Prowl

First problem:

第一个问题:

You are passing stringas the second template argument for your instantiation of the unordered_set<>class template. The second argument should be the type of your hasher functor, and std::stringis not a callable object.

您将string作为unordered_set<>类模板实例化的第二个模板参数传递。第二个参数应该是你的 hasher functor 的类型,而std::string不是一个可调用的对象。

Perhaps meant to write:

也许是想写:

unordered_set<Interval, /* string */ Hash> test;
//                      ^^^^^^^^^^^^
//                      Why this?

Also, I would suggest using names other than beginand endfor your (member) variables, since those are names of algorithms of the C++ Standard Library.

此外,我会建议使用比其他的名字beginend你(会员)变量,这是因为它们的C ++标准库的算法名称。

Second problem:

第二个问题:

You should keep in mind, that the hasher function should be qualified as const, so your functor should be:

你应该记住,散列函数应该被限定为const,所以你的函子应该是:

struct Hash {
   size_t operator() (const Interval &interval) const {
   //                                           ^^^^^
   //                                           Don't forget this!
     string temp = to_string(interval.b) + 
                   to_string(interval.e) + 
                   to_string(interval.proteinIndex);
     return (temp.length());
   }
};

Third problem:

第三个问题:

Finally, if you want std::unordered_setto be able to work with objects of type Interval, you need to define an equality operator consistent with your hash function. By default, if you do not specify any type argument as the third parameter of the std::unordered_setclass template, operator ==will be used.

最后,如果您希望std::unordered_set能够处理类型为 的对象Interval,您需要定义一个与您的哈希函数一致的相等运算符。默认情况下,如果没有指定任何类型参数作为std::unordered_set类模板的第三个参数 ,operator ==将使用。

You currently do not have any overload of operator ==for your class Interval, so you should provide one. For example:

您当前operator ==的 class没有任何重载Interval,因此您应该提供一个。例如:

inline bool operator == (Interval const& lhs, Interval const& rhs)
{
    return (lhs.b == rhs.b) && 
           (lhs.e == rhs.e) && 
           (lhs.proteinIndex == rhs.proteinIndex); 
}

Conclusion:

结论:

After all the above modifications, you can see your code compiling in this live example.

完成上述所有修改后,您可以在此实时示例中看到您的代码正在编译。

回答by honk

I think, Andy Prowl perfectly fixed the problemswith your code. However, I would add the following member function to your Interval, which describes what makes two intervals identical:

我认为,Andy Prowl 完美地解决了您的代码问题。但是,我会将以下成员函数添加到您的Interval,它描述了使两个间隔相同的原因:

std::string getID() const { return std::to_string(b) + " " + std::to_string(e) + " " + std::to_string(proteinIndex); }

Please note, that I also followed Andy Prowl's suggestion and renamed the members beginto band endto e. Next, you can easily define the hash and comparison functions by using lambda expressions. As a result, you can define your unordered_setas follows:

请注意,我也跟着安迪四处寻觅的建议,并改名为成员begin,以bende。接下来,您可以使用lambda 表达式轻松定义散列和比较函数。因此,您可以定义unordered_set如下:

auto hash = [](const Interval& i){ return std::hash<std::string>()(i.getID()); };
auto equal = [](const Interval& i1, const Interval& i2){ return i1.getID() == i2.getID(); };
std::unordered_set<Interval, decltype(hash), decltype(equal)> test(8, hash, equal);

Finally, for reasons of readability, I converted your forloop into a range-based forloop:

最后,出于可读性的原因,我将您的for循环转换为基于范围的for循环:

std::list<Interval> concat {{1, 2, false, 3, 4}, {2, 3, false, 4, 5}, {1, 2, true, 7, 4}};

for (auto const &i : concat)
    test.insert(i);

for (auto const &i : test)
    std::cout << i.b << ", " << i.e << ", " << i.updated << std::endl;

Output (I just printed first three members of each Interval):

输出(我只打印了每个的前三个成员Interval):

2, 3, 0
1, 2, 0

2, 3, 0
1, 2, 0

As you can see, there are only two intervals printed. The third one ({1, 2, true, 7, 4}) was not inserted to concat, because its b, e, and proteinIndexare equal to that of the first interval ({1, 2, false, 3, 4}).

如您所见,只打印了两个间隔。第三个 ( {1, 2, true, 7, 4}) 没有插入到 中concat,因为它的be、 和proteinIndex等于第一个间隔 ( {1, 2, false, 3, 4}) 的。

Code on Ideone

Ideone 上的代码