Original question: http://stackoverflow.com/questions/638761/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
GetHashCode override of object containing generic array
Asked by Svish
I have a class that contains the following two properties:
public int Id { get; private set; }
public T[] Values { get; private set; }
I have made it IEquatable<T> and overridden object.Equals like this:
public override bool Equals(object obj)
{
    return Equals(obj as SimpleTableRow<T>);
}

public bool Equals(SimpleTableRow<T> other)
{
    // Check for null
    if (ReferenceEquals(other, null))
        return false;

    // Check for same reference
    if (ReferenceEquals(this, other))
        return true;

    // Check for same Id and same Values
    return Id == other.Id && Values.SequenceEqual(other.Values);
}
Having overridden object.Equals, I must of course also override GetHashCode. But what code should I implement? How do I create a hash code out of a generic array? And how do I combine it with the Id integer?
public override int GetHashCode()
{
    return // What?
}
Accepted answer by Marc Gravell
Because of the problems raised in this thread, I'm posting another reply showing what happens if you get it wrong... mainly, that you can't use the array's GetHashCode(); the correct behaviour is that no warnings are printed when you run it... switch the comments to fix it:
using System;
using System.Collections.Generic;
using System.Linq;

static class Program
{
    static void Main()
    {
        // first and second are logically equivalent
        SimpleTableRow<int> first = new SimpleTableRow<int>(1, 2, 3, 4, 5, 6),
            second = new SimpleTableRow<int>(1, 2, 3, 4, 5, 6);

        if (first.Equals(second) && first.GetHashCode() != second.GetHashCode())
        { // proven Equals, but GetHashCode() disagrees
            Console.WriteLine("We have a problem");
        }

        HashSet<SimpleTableRow<int>> set = new HashSet<SimpleTableRow<int>>();
        set.Add(first);
        set.Add(second);
        // which confuses anything that uses hash algorithms
        if (set.Count != 1) Console.WriteLine("Yup, very bad indeed");
    }
}

class SimpleTableRow<T> : IEquatable<SimpleTableRow<T>>
{
    public SimpleTableRow(int id, params T[] values) {
        this.Id = id;
        this.Values = values;
    }

    public int Id { get; private set; }
    public T[] Values { get; private set; }

    public override int GetHashCode() // wrong
    {
        return Id.GetHashCode() ^ Values.GetHashCode();
    }

    /*
    public override int GetHashCode() // right
    {
        int hash = Id;
        if (Values != null)
        {
            hash = (hash * 17) + Values.Length;
            foreach (T t in Values)
            {
                hash *= 17;
                if (t != null) hash = hash + t.GetHashCode();
            }
        }
        return hash;
    }
    */

    public override bool Equals(object obj)
    {
        return Equals(obj as SimpleTableRow<T>);
    }

    public bool Equals(SimpleTableRow<T> other)
    {
        // Check for null
        if (ReferenceEquals(other, null))
            return false;

        // Check for same reference
        if (ReferenceEquals(this, other))
            return true;

        // Check for same Id and same Values
        return Id == other.Id && Values.SequenceEqual(other.Values);
    }
}
Answer by John Saunders
public override int GetHashCode() {
    return Id.GetHashCode() ^ Values.GetHashCode();
}
There are several good points in the comments and other answers. The OP should consider whether the Values would be used as part of the "key" if the object were used as a key in a dictionary. If so, then they should be part of the hash code, otherwise, not.
On the other hand, I'm not sure why the GetHashCode method should mirror SequenceEqual. It's meant to compute an index into a hash table, not to be the complete determinant of equality. If there are many hash table collisions using the algorithm above, and if they differ in the sequence of the Values, then an algorithm should be chosen that takes sequence into account. If sequence doesn't really matter, save the time and don't take it into account.
Answer by Grzenio
I would do it this way:
// int rather than long: GetHashCode() must return an int
int result = Id.GetHashCode();
foreach (T val in Values)
    result ^= val.GetHashCode();
return result;
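One property of an XOR-only combination like the one above is that it ignores element order, which matters if equality is defined with SequenceEqual. The following is only a sketch to make that visible; HashDemo, XorHash and SequenceHash are hypothetical names, and int arrays are assumed for simplicity.

```csharp
using System;

// Hypothetical helper, not part of any answer in this thread.
static class HashDemo
{
    // Order-insensitive: XOR of the element hashes.
    public static int XorHash(int[] values)
    {
        int result = 0;
        foreach (int v in values) result ^= v.GetHashCode();
        return result;
    }

    // Order-sensitive: multiply-and-add, so an element's position affects the result.
    public static int SequenceHash(int[] values)
    {
        unchecked // let the multiplication wrap instead of throwing on overflow
        {
            int hash = values.Length;
            foreach (int v in values) hash = hash * 17 + v.GetHashCode();
            return hash;
        }
    }
}
```

XorHash gives the same value for { 1, 2, 3 } and { 3, 2, 1 }, which is still legal (unequal objects may share a hash code) but produces extra collisions when rows often contain the same values in a different order.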
Answer by Dustin Campbell
Provided that Id and Values will never change, and Values is not null...
public override int GetHashCode()
{
    return Id ^ Values.GetHashCode();
}
Note that your class is not immutable, since anyone can modify the contents of Values because it is an array. Given that, I wouldn't try to generate a hashcode using its contents.
Answer by Marc Gravell
How about something like:
public override int GetHashCode()
{
    int hash = Id;
    if (Values != null)
    {
        hash = (hash * 17) + Values.Length;
        foreach (T t in Values)
        {
            hash *= 17;
            if (t != null) hash = hash + t.GetHashCode();
        }
    }
    return hash;
}
This should be compatible with SequenceEqual, rather than doing a reference comparison on the array.
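On newer runtimes the same content-based idea can be written with the System.HashCode struct instead of a hand-rolled multiplier. This is a sketch under assumptions, not something from the original answers: it assumes .NET Core 2.1 or later (or another target where System.HashCode is available), and HashedRow<T> is a hypothetical stand-in for the question's SimpleTableRow<T>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's SimpleTableRow<T>.
sealed class HashedRow<T> : IEquatable<HashedRow<T>>
{
    public HashedRow(int id, params T[] values) { Id = id; Values = values; }

    public int Id { get; }
    public T[] Values { get; }

    public override int GetHashCode()
    {
        var hash = new HashCode();
        hash.Add(Id);
        if (Values != null)
        {
            hash.Add(Values.Length);
            foreach (T t in Values) hash.Add(t); // Add<T> accepts null elements
        }
        return hash.ToHashCode();
    }

    public override bool Equals(object obj) => Equals(obj as HashedRow<T>);

    public bool Equals(HashedRow<T> other)
    {
        if (other is null) return false;
        if (ReferenceEquals(this, other)) return true;
        if (Id != other.Id) return false;
        // Treat two null arrays as equal; a null and a non-null array as different.
        if (Values == null || other.Values == null) return Values == other.Values;
        return Values.SequenceEqual(other.Values);
    }
}
```

System.HashCode is seeded per process, so the resulting values are consistent within one run but should not be persisted or compared across runs.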
Answer by Dustin Campbell
FWIW, it's very dangerous to use the contents of Values in your hash code. You should only do this if you can guarantee that they will never change. However, since the array is exposed, I don't think that guarantee is possible. The hash code of an object should never change; otherwise, it loses its value as a key in a Hashtable or Dictionary. Consider the hard-to-find bug of using an object as a key in a Hashtable: its hash code changes because of an outside influence, and you can no longer find it in the Hashtable!
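The lost-key bug described here can be reproduced in a few lines. The sketch below uses a hypothetical MutableKey type with a content-based hash and a HashSet rather than a Hashtable; the mechanism is the same.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical key type whose hash code depends on mutable array contents.
sealed class MutableKey
{
    public MutableKey(params int[] values) { Values = values; }

    public int[] Values { get; }

    public override bool Equals(object obj) =>
        obj is MutableKey other && Values.SequenceEqual(other.Values);

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = Values.Length;
            foreach (int v in Values) hash = hash * 17 + v;
            return hash;
        }
    }
}

static class MutationDemo
{
    // Returns true when the set still holds the key but can no longer find it.
    public static bool KeyGetsLost()
    {
        var key = new MutableKey(1, 2, 3);
        var set = new HashSet<MutableKey> { key };

        key.Values[0] = 99; // outside code changes the array after insertion

        // The set stored the key under its old hash code, so a lookup with the
        // new hash code misses it even though the very same object is in the set.
        return set.Count == 1 && !set.Contains(key);
    }
}
```

Whether this risk is acceptable depends on whether callers can be trusted not to modify Values while the object is used as a key; copying the array in the constructor and exposing it read-only is one way to reduce the exposure.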
Answer by Jhonny D. Cano -Leftware-
Since the hash code is kind of a key for storing the object (like in a hashtable), I would use just Id.GetHashCode().
Answer by D. Patrick
I know this thread is pretty old, but I wrote this method to allow me to calculate hashcodes of multiple objects. It's been very helpful for this very case. It's not perfect, but it does meet my needs and most likely yours too.
I can't really take any credit for it. I got the concept from some of the .NET GetHashCode implementations. I'm using 419 (after all, it's my favorite large prime), but you can choose just about any reasonable prime (not too small... not too large).
So, here's how I get my hashcodes:
using System.Collections.Generic;
using System.Linq;

public static class HashCodeCalculator
{
    public static int CalculateHashCode(params object[] args)
    {
        return args.CalculateHashCode();
    }

    public static int CalculateHashCode(this IEnumerable<object> args)
    {
        if (args == null)
            return new object().GetHashCode();

        unchecked
        {
            return args.Aggregate(0, (current, next) => (current*419) ^ (next ?? new object()).GetHashCode());
        }
    }
}
Answer by Allon Guralnek
I just had to add another answer because one of the more obvious (and easiest to implement) solutions was not mentioned: not including the collection in your GetHashCode calculation!
The main thing that seems to have been forgotten here is that uniqueness of the result of GetHashCode isn't required (or in many cases even possible). Unequal objects don't have to return unequal hash codes; the only requirement is that equal objects return equal hash codes. So by that definition, the following implementation of GetHashCode is correct for all objects (assuming there's a correct Equals implementation):
public override int GetHashCode()
{
    return 42;
}
Of course this would yield the worst possible performance in hashtable lookup, O(n) instead of O(1), but it is still functionally correct.
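To see that a constant hash code is slow but still functionally correct, here is a small sketch. ConstantHashRow and ConstantHashDemo are hypothetical names, and int values are assumed to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type that always returns the same hash code.
sealed class ConstantHashRow : IEquatable<ConstantHashRow>
{
    public ConstantHashRow(int id, params int[] values) { Id = id; Values = values; }

    public int Id { get; }
    public int[] Values { get; }

    public override int GetHashCode() => 42; // legal, but every lookup degrades to a scan

    public override bool Equals(object obj) => Equals(obj as ConstantHashRow);

    public bool Equals(ConstantHashRow other) =>
        other != null && Id == other.Id && Values.SequenceEqual(other.Values);
}

static class ConstantHashDemo
{
    // Counts distinct rows; since every hash collides, Equals alone decides.
    public static int DistinctCount(params ConstantHashRow[] rows) =>
        new HashSet<ConstantHashRow>(rows).Count;
}
```

Two equal rows still collapse to one set entry and two different rows stay separate; only the lookup cost changes.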
With that in mind, my general recommendation when implementing GetHashCode for an object that happens to have any kind of collection as one or more of its members is to simply ignore them and calculate GetHashCode solely based on the other scalar members. This works pretty well unless you put into a hash table a huge number of objects whose scalar members all have identical values, resulting in identical hash codes.
Ignoring collection members when calculating the hash code can also yield a performance improvement, despite the poorer distribution of the hash code values. Remember that a hash code is supposed to improve performance in a hash table by not requiring Equals to be called N times; instead it requires only one call to GetHashCode and a quick hash table lookup. If each object has an inner array with 10,000 items that all participate in the calculation of the hash code, any benefit gained from the good distribution would probably be lost. It would be better to have a marginally less well-distributed hash code if generating it is considerably less costly.
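Applied to the class from the question, this recommendation might look like the sketch below. IdHashedRow<T> is a hypothetical name; equality still compares the array contents, while the hash code looks only at Id.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical variant of the question's class that hashes only the scalar Id.
sealed class IdHashedRow<T> : IEquatable<IdHashedRow<T>>
{
    public IdHashedRow(int id, params T[] values) { Id = id; Values = values; }

    public int Id { get; }
    public T[] Values { get; }

    // Constant-time and unaffected by later changes to the array contents.
    public override int GetHashCode() => Id;

    public override bool Equals(object obj) => Equals(obj as IdHashedRow<T>);

    public bool Equals(IdHashedRow<T> other)
    {
        if (other is null) return false;
        if (ReferenceEquals(this, other)) return true;
        return Id == other.Id && Values.SequenceEqual(other.Values);
    }
}
```

Rows that share an Id but differ in Values collide and are separated only by Equals, so this fits best when Id values are mostly distinct.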