C#: Why is it important to override GetHashCode when the Equals method is overridden?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/371328/

Date: 2020-08-04 00:54:21  Source: igfitidea

Tags: c#, overriding, hashcode

Asked by David Basarab

Given the following class

public class Foo
{
    public int FooId { get; set; }
    public string FooName { get; set; }

    public override bool Equals(object obj)
    {
        Foo fooItem = obj as Foo;

        if (fooItem == null) 
        {
           return false;
        }

        return fooItem.FooId == this.FooId;
    }

    public override int GetHashCode()
    {
        // Which is preferred?

        return base.GetHashCode();

        //return this.FooId.GetHashCode();
    }
}

I have overridden the Equals method because Foo represents a row in the Foos table. Which is the preferred method for overriding GetHashCode?

Why is it important to override GetHashCode?

Accepted answer by Marc Gravell

Yes, it is important if your item will be used as a key in a dictionary, or in a HashSet<T>, etc., since the hash code is used (in the absence of a custom IEqualityComparer<T>) to group items into buckets. If the hash codes for two items do not match, they may never be considered equal (Equals will simply never be called).
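
To make the failure mode concrete, here is a minimal sketch. The types BrokenKey and ConsistentKey are hypothetical and not part of the original question; they differ only in GetHashCode. With the inherited hash code, a dictionary lookup using an equal-but-distinct instance will almost always miss, because the two instances land in different buckets and Equals is never consulted.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: Equals compares Id, but GetHashCode is the inherited,
// per-instance value, so it is inconsistent with Equals.
public class BrokenKey
{
    public int Id { get; }
    public BrokenKey(int id) { Id = id; }

    public override bool Equals(object obj)
    {
        var other = obj as BrokenKey;
        return !ReferenceEquals(other, null) && other.Id == Id;
    }

    public override int GetHashCode() { return base.GetHashCode(); }
}

// Hypothetical type: same Equals, hash code derived from the same field.
public class ConsistentKey
{
    public int Id { get; }
    public ConsistentKey(int id) { Id = id; }

    public override bool Equals(object obj)
    {
        var other = obj as ConsistentKey;
        return !ReferenceEquals(other, null) && other.Id == Id;
    }

    public override int GetHashCode() { return Id.GetHashCode(); }
}

public static class LookupDemo
{
    public static void Main()
    {
        var a = new BrokenKey(1);
        var b = new BrokenKey(1);
        Console.WriteLine(a.Equals(b)); // True: Equals itself works

        var broken = new Dictionary<BrokenKey, string> { { a, "row 1" } };
        // Almost certainly False: b hashes differently from a,
        // so the dictionary never gets as far as calling Equals.
        Console.WriteLine(broken.ContainsKey(b));

        var ok = new Dictionary<ConsistentKey, string> { { new ConsistentKey(1), "row 1" } };
        Console.WriteLine(ok.ContainsKey(new ConsistentKey(1))); // True
    }
}
```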

The GetHashCode() method should reflect the Equals logic; the rules are:

  • if two things are equal (Equals(...) == true) then they must return the same value for GetHashCode()
  • if the GetHashCode() values are equal, it is not necessary for the objects to be equal; this is a collision, and Equals will be called to see whether it is a real equality or not.
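
The second rule can be illustrated with a deliberately coarse hash. This is only a sketch; the Bucketed type is hypothetical. Ids 1, 3 and 5 all produce the same hash code, yet the set still tells them apart because it falls back to Equals within the bucket.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type with an intentionally poor hash: all odd ids collide.
public class Bucketed
{
    public int Id { get; }
    public Bucketed(int id) { Id = id; }

    public override bool Equals(object obj)
    {
        var other = obj as Bucketed;
        return !ReferenceEquals(other, null) && other.Id == Id;
    }

    // Legal but coarse: equal objects still get equal hash codes.
    public override int GetHashCode() { return Id % 2; }
}

public static class CollisionDemo
{
    public static void Main()
    {
        var set = new HashSet<Bucketed> { new Bucketed(1), new Bucketed(3) };

        Console.WriteLine(set.Count);                     // 2: same hash, but Equals says they differ
        Console.WriteLine(set.Contains(new Bucketed(3))); // True: hash matches, Equals confirms
        Console.WriteLine(set.Contains(new Bucketed(5))); // False: hash matches, Equals rejects
    }
}
```

A collision costs extra Equals calls but never produces a wrong answer; only a violation of the first rule does.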

In this case, it looks like "return FooId;" is a suitable GetHashCode() implementation. If you are testing multiple properties, it is common to combine them using code like that below, to reduce diagonal collisions (i.e. so that new Foo(3,5) has a different hash code from new Foo(5,3)):

unchecked // only needed if you're compiling with arithmetic checks enabled
{ // (the default compiler behaviour is *disabled*, so most folks won't need this)
    int hash = 13;
    hash = (hash * 7) + field1.GetHashCode();
    hash = (hash * 7) + field2.GetHashCode();
    ...
    return hash;
}
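
As a worked instance of the fragment above, here is a sketch with a hypothetical two-field Pair class (the name and its members are assumptions for illustration). Because Int32.GetHashCode() returns the integer itself, the arithmetic can be followed by hand: (3,5) gives (13*7 + 3)*7 + 5 = 663 and (5,3) gives (13*7 + 5)*7 + 3 = 675.

```csharp
using System;

// Hypothetical class; field1/field2 from the fragment above become First/Second.
public sealed class Pair
{
    public int First { get; }
    public int Second { get; }

    public Pair(int first, int second)
    {
        First = first;
        Second = second;
    }

    public override bool Equals(object obj)
    {
        var other = obj as Pair;
        return !ReferenceEquals(other, null)
            && other.First == First
            && other.Second == Second;
    }

    public override int GetHashCode()
    {
        unchecked // overflow wraps silently even if arithmetic checks are enabled
        {
            int hash = 13;
            hash = (hash * 7) + First.GetHashCode();
            hash = (hash * 7) + Second.GetHashCode();
            return hash;
        }
    }
}

public static class PairDemo
{
    public static void Main()
    {
        Console.WriteLine(new Pair(3, 5).GetHashCode()); // 663
        Console.WriteLine(new Pair(5, 3).GetHashCode()); // 675
    }
}
```

On newer .NET versions, System.HashCode.Combine is a built-in alternative for combining several values, though the values it returns differ from the scheme shown here.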

Oh - for convenience, you might also consider providing == and != operators when overriding Equals and GetHashCode.
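
One possible shape for those operators is sketched below, using a hypothetical Row class. The null checks use ReferenceEquals so that they do not re-enter the overloaded operators.

```csharp
using System;

// Hypothetical class showing == and != alongside Equals and GetHashCode.
public sealed class Row
{
    public int Id { get; }
    public Row(int id) { Id = id; }

    public override bool Equals(object obj)
    {
        var other = obj as Row;
        return !ReferenceEquals(other, null) && other.Id == Id;
    }

    public override int GetHashCode() { return Id.GetHashCode(); }

    public static bool operator ==(Row left, Row right)
    {
        // Two nulls are equal; a null and a non-null are not.
        if (ReferenceEquals(left, null)) return ReferenceEquals(right, null);
        return left.Equals(right);
    }

    public static bool operator !=(Row left, Row right)
    {
        return !(left == right);
    }
}

public static class OperatorDemo
{
    public static void Main()
    {
        Row none = null;
        Console.WriteLine(new Row(1) == new Row(1)); // True
        Console.WriteLine(new Row(1) != new Row(2)); // True
        Console.WriteLine(none == null);             // True
        Console.WriteLine(new Row(1) == none);       // False
    }
}
```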

A demonstration of what happens when you get this wrong is here.

Answer by kemiller2002

It is because the framework requires that two objects that are the same must have the same hash code. If you override the Equals method to do a special comparison of two objects and the two objects are considered the same by the method, then the hash code of the two objects must also be the same. (Dictionaries and Hashtables rely on this principle.)

Answer by Trap

By overriding Equals you're basically stating that you are the one who knows better how to compare two instances of a given type, so you're likely to be the best candidate to provide the best hash code.

This is an example of how ReSharper writes a GetHashCode() function for you:

public override int GetHashCode()
{
    unchecked
    {
        var result = 0;
        result = (result * 397) ^ m_someVar1;
        result = (result * 397) ^ m_someVar2;
        result = (result * 397) ^ m_someVar3;
        result = (result * 397) ^ m_someVar4;
        return result;
    }
}

As you can see it just tries to guess a good hash code based on all the fields in the class, but since you know your object's domain or value ranges you could still provide a better one.

Answer by Albic

It's actually very hard to implement GetHashCode() correctly because, in addition to the rules Marc already mentioned, the hash code should not change during the lifetime of an object. Therefore the fields which are used to calculate the hash code must be immutable.

I finally found a solution to this problem when I was working with NHibernate. My approach is to calculate the hash code from the ID of the object. The ID can only be set through the constructor, so if you want to change the ID, which is very unlikely, you have to create a new object which has a new ID and therefore a new hash code. This approach works best with GUIDs because you can provide a parameterless constructor which randomly generates an ID.
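
A minimal sketch of that approach follows. The Entity class, its Name property and both constructors are assumptions made for illustration, not code from the original answer. The Id is get-only, so the hash code cannot change after construction, while other properties stay freely mutable.

```csharp
using System;

// Hypothetical entity: identity (and therefore the hash code) comes only from Id.
public class Entity
{
    public Guid Id { get; }          // assigned once, in a constructor
    public string Name { get; set; } // mutable, deliberately not part of the hash

    // Parameterless constructor generates a fresh random ID.
    public Entity() : this(Guid.NewGuid()) { }

    // Used when the ID is already known, e.g. when loading an existing row.
    public Entity(Guid id) { Id = id; }

    public override bool Equals(object obj)
    {
        var other = obj as Entity;
        return !ReferenceEquals(other, null) && other.Id == Id;
    }

    public override int GetHashCode() { return Id.GetHashCode(); }
}

public static class EntityDemo
{
    public static void Main()
    {
        var entity = new Entity { Name = "before" };
        int hashBefore = entity.GetHashCode();

        entity.Name = "after"; // changing a non-key property

        Console.WriteLine(hashBefore == entity.GetHashCode());  // True: hash is stable
        Console.WriteLine(entity.Equals(new Entity(entity.Id))); // True: same ID, equal
    }
}
```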

Answer by Ludmil Tinkov

How about:

public override int GetHashCode()
{
    return string.Format("{0}_{1}_{2}", prop1, prop2, prop3).GetHashCode();
}

Assuming performance is not an issue :)

Answer by ILoveFortran

It's not necessarily important; it depends on the size of your collections and your performance requirements and whether your class will be used in a library where you may not know the performance requirements. I frequently know my collection sizes are not very large and my time is more valuable than a few microseconds of performance gained by creating a perfect hash code; so (to get rid of the annoying warning by the compiler) I simply use:

   public override int GetHashCode()
   {
      return base.GetHashCode();
   }

(Of course I could use a #pragma to turn off the warning as well but I prefer this way.)

Of course, when you are in a position where you do need the performance, all of the issues mentioned by others here apply. Most important (otherwise you will get wrong results when retrieving items from a hash set or dictionary): the hash code must not vary during the lifetime of an object (more accurately, during the time the hash code is needed, such as while the object is a key in a dictionary). For example, the following is wrong because Value is public and so can be changed from outside the class during the lifetime of the instance, so you must not use it as the basis for the hash code:

   class A
   {
      public int Value;

      public override int GetHashCode()
      {
         return Value.GetHashCode(); //WRONG! Value is not constant during the instance's life time
      }
   }    

On the other hand, if Value can't be changed it's ok to use:

   class A
   {
      public readonly int Value;

      public override int GetHashCode()
      {
         return Value.GetHashCode(); //OK  Value is read-only and can't be changed during the instance's life time
      }
   }
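
The effect of the mutable version can be seen in a short sketch. MutableKey is a hypothetical stand-in for the first class A above, with an Equals override added so the example is complete. Once the field changes while the object sits in a HashSet, the set can no longer find it, because it was stored under the old hash code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the mutable class A: the hash depends on a public field.
public class MutableKey
{
    public int Value;

    public override bool Equals(object obj)
    {
        var other = obj as MutableKey;
        return !ReferenceEquals(other, null) && other.Value == Value;
    }

    public override int GetHashCode() { return Value.GetHashCode(); }
}

public static class MutationDemo
{
    public static void Main()
    {
        var key = new MutableKey { Value = 1 };
        var set = new HashSet<MutableKey> { key };

        Console.WriteLine(set.Contains(key)); // True

        key.Value = 2; // mutate while the object is inside the set

        // False: the set recorded hash 1 for this entry, but the lookup now uses hash 2.
        Console.WriteLine(set.Contains(key));
        Console.WriteLine(set.Count);         // 1: the object is still in there, just unreachable
    }
}
```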

Answer by huha

Please don't forget to check the obj parameter against null when overriding Equals(), and also compare the type.

public override bool Equals(object obj)
{
    Foo fooItem = obj as Foo;

    if (fooItem == null)
    {
       return false;
    }

    return fooItem.FooId == this.FooId;
}

The reason for this is that Equals must return false on comparison to null. See also http://msdn.microsoft.com/en-us/library/bsc2ak47.aspx
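
The "as" cast above accepts any subclass of Foo. If an exact type match is wanted instead, one variation is to compare GetType() results. This is a sketch under that assumption; StrictFoo and DerivedStrictFoo are hypothetical names.

```csharp
using System;

// Hypothetical variation: Equals requires the exact same runtime type.
public class StrictFoo
{
    public int FooId { get; }
    public StrictFoo(int fooId) { FooId = fooId; }

    public override bool Equals(object obj)
    {
        // Rejects null and any object whose runtime type differs from this one.
        if (obj == null || obj.GetType() != GetType())
        {
            return false;
        }

        return ((StrictFoo)obj).FooId == FooId;
    }

    public override int GetHashCode() { return FooId.GetHashCode(); }
}

public class DerivedStrictFoo : StrictFoo
{
    public DerivedStrictFoo(int fooId) : base(fooId) { }
}

public static class TypeCheckDemo
{
    public static void Main()
    {
        var foo = new StrictFoo(1);

        Console.WriteLine(foo.Equals(null));                    // False
        Console.WriteLine(foo.Equals("1"));                     // False: different type
        Console.WriteLine(foo.Equals(new DerivedStrictFoo(1))); // False: subclass is not an exact match
        Console.WriteLine(foo.Equals(new StrictFoo(1)));        // True
    }
}
```

Whether subclasses should be allowed to compare equal is a design choice; the exact-type check keeps Equals symmetric when a subclass adds its own fields.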

Answer by Maciej

Hash codes are used by hash-based collections such as Dictionary, Hashtable and HashSet. The purpose of the hash code is to pre-sort an object very quickly by putting it into a specific group (bucket). This pre-sorting helps tremendously when you need to retrieve the object from the collection, because the code has to search for it in just one bucket instead of among all the objects the collection contains. The better the distribution of hash codes (the closer to unique), the faster the retrieval. In the ideal situation where each object has a unique hash code, finding it is an O(1) operation. In most cases it approaches O(1).
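
One rough way to observe the effect of distribution is to count how often Equals gets called. The sketch below uses a hypothetical Counted type that can be switched between a distinct hash per id and a constant hash; the exact counts depend on the collection's implementation, so only the relative difference matters.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type that counts Equals calls and can use a good or a constant hash.
public class Counted
{
    public static int EqualsCalls;

    private readonly int id;
    private readonly bool constantHash;

    public Counted(int id, bool constantHash)
    {
        this.id = id;
        this.constantHash = constantHash;
    }

    public override bool Equals(object obj)
    {
        EqualsCalls++;
        var other = obj as Counted;
        return !ReferenceEquals(other, null) && other.id == id;
    }

    // constantHash == true puts every object into the same bucket.
    public override int GetHashCode() { return constantHash ? 0 : id; }
}

public static class BucketDemo
{
    // Adds 100 items, looks one up, and reports how many Equals calls that took.
    public static int CountEqualsCalls(bool constantHash)
    {
        Counted.EqualsCalls = 0;
        var set = new HashSet<Counted>();
        for (int i = 0; i < 100; i++)
        {
            set.Add(new Counted(i, constantHash));
        }
        set.Contains(new Counted(99, constantHash));
        return Counted.EqualsCalls;
    }

    public static void Main()
    {
        Console.WriteLine("distinct hashes: " + CountEqualsCalls(false));
        Console.WriteLine("constant hash:   " + CountEqualsCalls(true)); // far larger
    }
}
```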

Answer by user2855602

It's my understanding that the original GetHashCode() returns the memory address of the object, so it's essential to override it if you wish to compare two different objects.

EDITED: That was incorrect; the original GetHashCode() method cannot assure the equality of two values, although objects that are equal return the same hash code.

Answer by Ian Ringrose

We have two problems to cope with.

  1. You cannot provide a sensible GetHashCode() if any field in the object can be changed. Also, often an object will NEVER be used in a collection that depends on GetHashCode(). So the cost of implementing GetHashCode() is often not worth it, or it is not possible.

  2. If someone puts your object in a collection that calls GetHashCode() and you have overridden Equals() without also making GetHashCode() behave correctly, that person may spend days tracking down the problem.

Therefore, by default I do this:

public class Foo
{
    public int FooId { get; set; }
    public string FooName { get; set; }

    public override bool Equals(object obj)
    {
        Foo fooItem = obj as Foo;

        if (fooItem == null)
        {
           return false;
        }

        return fooItem.FooId == this.FooId;
    }

    public override int GetHashCode()
    {
        // Some comment to explain if there is a real problem with providing GetHashCode() 
        // or if I just don't see a need for it for the given class
        throw new Exception("Sorry I don't know what GetHashCode should do for this class");
    }
}