How does C# figure out the hash code for an object?
Original question: http://stackoverflow.com/questions/102690/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by Max Galkin
This question comes out of the discussion on tuples.
I started thinking about the hash code that a tuple should have. What if we accept the KeyValuePair class as a tuple? It doesn't override the GetHashCode() method, so it probably won't be aware of the hash codes of its "children"... So the runtime will call Object.GetHashCode(), which is not aware of the real object structure.
Then we can make two instances of some reference type that are actually equal because of the overridden GetHashCode() and Equals(), and use them as "children" in tuples to "cheat" the dictionary.
But it doesn't work! The runtime somehow figures out the structure of our tuple and calls the overridden GetHashCode() of our class!
How does it work? What analysis does Object.GetHashCode() perform?
Can it hurt performance in some bad scenario where we use complicated keys? (Probably an impossible scenario... but still.)
Consider this code as an example:
using System.Collections.Generic;

namespace csharp_tricks
{
    class Program
    {
        class MyClass
        {
            int keyValue;
            int someInfo;

            public MyClass(int key, int info)
            {
                keyValue = key;
                someInfo = info;
            }

            // equality and hash code depend on keyValue only; someInfo is ignored
            public override bool Equals(object obj)
            {
                MyClass other = obj as MyClass;
                if (other == null) return false;
                return keyValue.Equals(other.keyValue);
            }

            public override int GetHashCode()
            {
                return keyValue.GetHashCode();
            }
        }

        static void Main(string[] args)
        {
            Dictionary<object, object> dict = new Dictionary<object, object>();
            dict.Add(new KeyValuePair<MyClass, object>(new MyClass(1, 1), 1), 1);
            //here we get the exception -- an item with the same key was already added
            //but how did it figure out the hash code?
            dict.Add(new KeyValuePair<MyClass, object>(new MyClass(1, 2), 1), 1);
            return;
        }
    }
}
Update: I think I've found an explanation for this, as stated below in my answer. The main takeaways are:
- Be careful with your keys and their hash codes :-)
- For complicated dictionary keys you must override Equals() and GetHashCode() correctly.
Accepted answer by Max Galkin
It seems that I have a clue now.
I thought KeyValuePair was a reference type, but it is not; it is a struct, so it uses the ValueType.GetHashCode() method. The MSDN documentation for it says: "One or more fields of the derived type is used to calculate the return value".
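A quick way to check this for yourself, as a minimal sketch: the class name StructCheck and its method names are made up for illustration, and the code relies only on typeof(...).IsValueType and the Equals()/GetHashCode() that KeyValuePair inherits from ValueType.

```csharp
using System;
using System.Collections.Generic;

static class StructCheck
{
    // KeyValuePair<TKey, TValue> is declared as a struct, so this returns true.
    public static bool PairIsValueType()
    {
        return typeof(KeyValuePair<string, int>).IsValueType;
    }

    // Two separately created pairs with equal fields, boxed to object.
    // ValueType.Equals compares the fields, so the boxes are equal even though
    // they are different objects, and equal objects report the same hash code.
    public static bool BoxedPairsMatch()
    {
        object a = new KeyValuePair<string, int>("a", 1);
        object b = new KeyValuePair<string, int>("a", 1);
        return a.Equals(b) && a.GetHashCode() == b.GetHashCode();
    }

    static void Main()
    {
        Console.WriteLine(PairIsValueType());
        Console.WriteLine(BoxedPairsMatch());
    }
}
```

This is the same situation as in the question: the dictionary sees two boxed structs whose fields compare equal, so it treats them as the same key.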
If you take a real reference type as a "tuple-provider", you'll cheat the dictionary (or yourself...).
using System.Collections.Generic;

namespace csharp_tricks
{
    class Program
    {
        class MyClass
        {
            int keyValue;
            int someInfo;

            public MyClass(int key, int info)
            {
                keyValue = key;
                someInfo = info;
            }

            public override bool Equals(object obj)
            {
                MyClass other = obj as MyClass;
                if (other == null) return false;
                return keyValue.Equals(other.keyValue);
            }

            public override int GetHashCode()
            {
                return keyValue.GetHashCode();
            }
        }

        class Pair<T, R>
        {
            public T First { get; set; }
            public R Second { get; set; }
        }

        static void Main(string[] args)
        {
            var dict = new Dictionary<Pair<int, MyClass>, object>();
            dict.Add(new Pair<int, MyClass>() { First = 1, Second = new MyClass(1, 2) }, 1);
            //this is a pair of the same values as previous! but... no exception this time...
            dict.Add(new Pair<int, MyClass>() { First = 1, Second = new MyClass(1, 3) }, 1);
            return;
        }
    }
}
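For comparison, here is a rough sketch of what a reference-type pair might look like once Equals() and GetHashCode() are overridden to look at its components. ValuePair and ValuePairDemo are hypothetical names, not part of the original code, and the multiplier 397 is just one common way of mixing hash codes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical variant of Pair with value-based equality.
// It is immutable so that its hash code cannot change while it is used as a key.
class ValuePair<T, R>
{
    public T First { get; }
    public R Second { get; }

    public ValuePair(T first, R second)
    {
        First = first;
        Second = second;
    }

    public override bool Equals(object obj)
    {
        var other = obj as ValuePair<T, R>;
        if (other == null) return false;
        // EqualityComparer<T>.Default handles null components and calls the
        // components' own Equals implementations.
        return EqualityComparer<T>.Default.Equals(First, other.First)
            && EqualityComparer<R>.Default.Equals(Second, other.Second);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = EqualityComparer<T>.Default.GetHashCode(First);
            return (hash * 397) ^ EqualityComparer<R>.Default.GetHashCode(Second);
        }
    }
}

static class ValuePairDemo
{
    // Adds two separately created but equal pairs; returns true if the
    // second Add is rejected as a duplicate key.
    public static bool SecondAddThrows()
    {
        var dict = new Dictionary<ValuePair<int, string>, object>();
        dict.Add(new ValuePair<int, string>(1, "a"), 1);
        try
        {
            dict.Add(new ValuePair<int, string>(1, "a"), 2);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(SecondAddThrows());
    }
}
```

With these overrides the dictionary can no longer be "cheated": the second Add fails just as it did with KeyValuePair.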
Answer by Dan Blair
I don't have the book reference anymore, and I'll have to find it to confirm, but I thought the default base hash just hashed together all of the members of your object. It got access to them because of the way the CLR worked, so it wasn't something that you could write as well as they had.
That is completely from memory of something I briefly read, so take it for what it's worth.
Edit: The book was Inside C# from MS Press, the one with the saw blade on the cover. The author spent a good deal of time explaining how things were implemented in the CLR, how the language translated down to MSIL, etc. If you can find the book, it's not a bad read.
Edit: From the link provided, it looks like:
"Object.GetHashCode() uses an internal field in the System.Object class to generate the hash value. Each object created is assigned a unique object key, stored as an integer, when it is created. These keys start at 1 and increment every time a new object of any type gets created."
Hmm, I guess I need to write a few of my own hash codes if I expect to use objects as hash keys.
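One way to see that the default hash of a class is tied to the object itself rather than to its members is a small experiment like the one below. This is only a sketch: PlainPoint and DefaultHashDemo are made-up names, and it does not assume anything about how the runtime picks the actual number, only that RuntimeHelpers.GetHashCode returns the same identity-based value that Object.GetHashCode uses when it is not overridden.

```csharp
using System;
using System.Runtime.CompilerServices;

// A class that does NOT override Equals or GetHashCode.
class PlainPoint
{
    public int X;
    public int Y;
}

static class DefaultHashDemo
{
    // The default hash code does not change when the fields change.
    public static bool HashSurvivesMutation()
    {
        var p = new PlainPoint { X = 1, Y = 2 };
        int before = p.GetHashCode();
        p.X = 99;
        return before == p.GetHashCode();
    }

    // Two distinct instances with identical fields are not equal by default,
    // because Object.Equals compares references for classes.
    public static bool SameFieldsCompareEqual()
    {
        return new PlainPoint { X = 1, Y = 2 }.Equals(new PlainPoint { X = 1, Y = 2 });
    }

    // For a type without an override, GetHashCode matches the identity hash.
    public static bool MatchesIdentityHash()
    {
        var p = new PlainPoint { X = 1, Y = 2 };
        return p.GetHashCode() == RuntimeHelpers.GetHashCode(p);
    }

    static void Main()
    {
        Console.WriteLine(HashSurvivesMutation());   // True
        Console.WriteLine(SameFieldsCompareEqual()); // False
        Console.WriteLine(MatchesIdentityHash());    // True
    }
}
```

So for plain classes the members play no part, which is why you have to write your own overrides if two objects with the same contents should count as the same key.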
Answer by Pop Catalin
Don't override GetHashCode() and Equals() on mutable classes; only override them on immutable classes or structures. Otherwise, if you modify an object that is used as a key, the hash table won't function properly anymore (you won't be able to retrieve the value associated with the key after the key object has been modified).
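A small sketch of that failure mode, with a made-up MutableKey class whose equality and hash code both depend on a public field (all names here are hypothetical):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable key: Equals and GetHashCode depend on a settable field.
class MutableKey
{
    public int Id;

    public override bool Equals(object obj)
    {
        var other = obj as MutableKey;
        return other != null && other.Id == Id;
    }

    public override int GetHashCode()
    {
        return Id;
    }
}

static class MutableKeyDemo
{
    // Lookup with the very same key object, before it is modified.
    public static bool FoundBeforeMutation()
    {
        var key = new MutableKey { Id = 1 };
        var dict = new Dictionary<MutableKey, string>();
        dict.Add(key, "value");
        return dict.ContainsKey(key);
    }

    // Lookup with the very same key object, after it is modified.
    // The dictionary filed the entry under the old hash code, so the
    // entry is still in the dictionary but can no longer be found.
    public static bool FoundAfterMutation()
    {
        var key = new MutableKey { Id = 1 };
        var dict = new Dictionary<MutableKey, string>();
        dict.Add(key, "value");
        key.Id = 2;
        return dict.ContainsKey(key);
    }

    static void Main()
    {
        Console.WriteLine(FoundBeforeMutation()); // True
        Console.WriteLine(FoundAfterMutation());  // False
    }
}
```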
Also, hash tables don't use hash codes to identify objects; they use the key objects themselves as identifiers. It is not required that all keys used to add entries to a hash table return different hash codes, but it is recommended that they do, or else performance suffers greatly.
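To illustrate, here is a sketch with a deliberately bad key type whose GetHashCode() always returns the same number. CollidingKey and CollisionDemo are hypothetical names. The dictionary still returns the right values because it falls back on Equals(); it just has to compare against every colliding entry, which is where the performance goes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type where every instance has the same hash code.
class CollidingKey
{
    public readonly string Name;

    public CollidingKey(string name)
    {
        Name = name;
    }

    public override bool Equals(object obj)
    {
        var other = obj as CollidingKey;
        return other != null && other.Name == Name;
    }

    public override int GetHashCode()
    {
        return 42; // legal, but every key lands in the same bucket
    }
}

static class CollisionDemo
{
    // Builds a small dictionary of colliding keys and looks one up by name.
    public static string Lookup(string name)
    {
        var dict = new Dictionary<CollidingKey, string>();
        dict.Add(new CollidingKey("a"), "first");
        dict.Add(new CollidingKey("b"), "second");
        dict.Add(new CollidingKey("c"), "third");
        return dict[new CollidingKey(name)];
    }

    static void Main()
    {
        Console.WriteLine(Lookup("b")); // second
    }
}
```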
Answer by Scott Dorman
Check out this post by Brad Abrams and also the comment by Brian Grunkemeyer for some more information on how object.GetHashCode works. Also, take a look at the first comment on Ayende's blog post. I don't know if the current releases of the Framework still follow these rules or if they have actually changed it like Brad implied.
Answer by Cory R. King
"so probably it won't be aware of the hash codes of its "children"."
Your example seems to prove otherwise :-) The hash code for the key MyClass and the value 1 is the same for both KeyValuePairs. The KeyValuePair implementation must be using both its Key and Value for its own hash code.
Moving up, the dictionary class wants unique keys. It is using the hash code provided by each key to figure things out. Remember that the runtime isn't calling Object.GetHashCode(), but it is calling the GetHashCode() implementation provided by the instance you give it.
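A tiny sketch of that last point, using a made-up Loud class that counts how often its own GetHashCode() runs (Loud and DispatchDemo are hypothetical names):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class that records calls to its GetHashCode override.
class Loud
{
    public static int Calls;

    public override bool Equals(object obj)
    {
        return obj is Loud;
    }

    public override int GetHashCode()
    {
        Calls++;
        return 7;
    }
}

static class DispatchDemo
{
    // The variable is typed as object, but GetHashCode is virtual,
    // so the call ends up in Loud.GetHashCode.
    public static int HashThroughObjectReference()
    {
        object o = new Loud();
        return o.GetHashCode();
    }

    // A Dictionary<object, object> knows nothing about Loud, yet adding
    // a Loud key still runs the override at least once.
    public static bool DictionaryUsesOverride()
    {
        Loud.Calls = 0;
        var dict = new Dictionary<object, object>();
        dict.Add(new Loud(), 1);
        return Loud.Calls > 0;
    }

    static void Main()
    {
        Console.WriteLine(HashThroughObjectReference()); // 7
        Console.WriteLine(DictionaryUsesOverride());     // True
    }
}
```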
Consider a more complex case:
using System.Collections.Generic;
using System.Diagnostics;

public class HappyClass
{
    enum TheUnit
    {
        Points,
        Picas,
        Inches
    }

    class MyDistanceClass
    {
        int distance;
        TheUnit units;

        public MyDistanceClass(int theDistance, TheUnit unit)
        {
            distance = theDistance;
            units = unit;
        }

        // points per unit: 1 pica = 12 points, 1 inch = 72 points
        static int PointsPer(TheUnit unit)
        {
            switch (unit)
            {
                case TheUnit.Picas: return 12;
                case TheUnit.Inches: return 72;
                default: return 1;
            }
        }

        public static int ConvertDistance(int oldDistance, TheUnit oldUnit, TheUnit newUnit)
        {
            // illustrative integer conversion via points; insert real unit conversion code here :-)
            return oldDistance * PointsPer(oldUnit) / PointsPer(newUnit);
        }

        /// <summary>
        /// Figure out if we are equal distance, converting into the same units of measurement if we have to
        /// </summary>
        /// <param name="obj">the other guy</param>
        /// <returns>true if we are the same distance</returns>
        public override bool Equals(object obj)
        {
            MyDistanceClass other = obj as MyDistanceClass;
            if (other == null) return false;
            if (other.units != this.units)
            {
                int newDistance = MyDistanceClass.ConvertDistance(other.distance, other.units, this.units);
                return distance.Equals(newDistance);
            }
            else
            {
                return distance.Equals(other.distance);
            }
        }

        public override int GetHashCode()
        {
            // even if the distance is equal in spite of the different units, the objects are not
            return distance.GetHashCode() * units.GetHashCode();
        }
    }

    static void Main(string[] args)
    {
        // these are the same distance... 72 points = 1 inch
        MyDistanceClass distPoint = new MyDistanceClass(72, TheUnit.Points);
        MyDistanceClass distInch = new MyDistanceClass(1, TheUnit.Inches);
        Debug.Assert(distPoint.Equals(distInch), "these should be equal!");
        Debug.Assert(distPoint.GetHashCode() != distInch.GetHashCode(), "But yet they are fundamentally different values");
        Dictionary<object, object> dict = new Dictionary<object, object>();
        dict.Add(new KeyValuePair<MyDistanceClass, object>(distPoint, 1), 1);
        //this should not barf
        dict.Add(new KeyValuePair<MyDistanceClass, object>(distInch, 1), 1);
        return;
    }
}
Basically... in the case of my example, you'd want two objects that are the same distance to return "true" for Equals, yet return different hash codes.
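One caveat worth weighing against this design: the .NET documentation for Object.GetHashCode says that two objects that compare as equal must return the same hash code, and hash-based collections rely on that. If equal distances should also behave as a single key, a possible variant is to normalize to one unit before both comparing and hashing. The sketch below assumes that reading; Distance, Unit and ToPoints are made-up names, and the conversion factors (12 points per pica, 72 points per inch) are the only units it knows about.

```csharp
using System;
using System.Collections.Generic;

enum Unit { Points, Picas, Inches }

// Hypothetical distance type whose Equals and GetHashCode agree with each other.
sealed class Distance
{
    private readonly int amount;
    private readonly Unit unit;

    public Distance(int amount, Unit unit)
    {
        this.amount = amount;
        this.unit = unit;
    }

    // Normalize to points: 1 pica = 12 points, 1 inch = 72 points.
    private int ToPoints()
    {
        switch (unit)
        {
            case Unit.Picas: return amount * 12;
            case Unit.Inches: return amount * 72;
            default: return amount;
        }
    }

    public override bool Equals(object obj)
    {
        var other = obj as Distance;
        return other != null && ToPoints() == other.ToPoints();
    }

    // Hash the normalized value, so equal distances share a hash code.
    public override int GetHashCode()
    {
        return ToPoints().GetHashCode();
    }
}

static class DistanceDemo
{
    public static bool EqualAndSameHash()
    {
        var a = new Distance(72, Unit.Points);
        var b = new Distance(1, Unit.Inches);
        return a.Equals(b) && a.GetHashCode() == b.GetHashCode();
    }

    // A set treats the two equal distances as one element.
    public static int CountAfterAddingBoth()
    {
        var set = new HashSet<Distance>();
        set.Add(new Distance(72, Unit.Points));
        set.Add(new Distance(1, Unit.Inches));
        return set.Count;
    }

    static void Main()
    {
        Console.WriteLine(EqualAndSameHash());     // True
        Console.WriteLine(CountAfterAddingBoth()); // 1
    }
}
```

If the two measurements really should be separate keys, the other consistent option is to make Equals() return false for them as well.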
Answer by Rinat Abdullin
Here are the proper hash and equality implementations for the Quad tuple (it contains 4 tuple components). This code ensures proper usage of this specific tuple in HashSets and dictionaries.
More on the subject (including the source code) here.
Note the usage of the unchecked keyword (to avoid overflow exceptions) and throwing NullReferenceException if obj is null (as required by the base method).
public override bool Equals(object obj)
{
    if (ReferenceEquals(null, obj))
        throw new NullReferenceException("obj is null");
    if (ReferenceEquals(this, obj)) return true;
    if (obj.GetType() != typeof (Quad<T1, T2, T3, T4>)) return false;
    return Equals((Quad<T1, T2, T3, T4>) obj);
}

public bool Equals(Quad<T1, T2, T3, T4> obj)
{
    if (ReferenceEquals(null, obj)) return false;
    if (ReferenceEquals(this, obj)) return true;
    return Equals(obj.Item1, Item1)
        && Equals(obj.Item2, Item2)
        && Equals(obj.Item3, Item3)
        && Equals(obj.Item4, Item4);
}

public override int GetHashCode()
{
    unchecked
    {
        int result = Item1.GetHashCode();
        result = (result*397) ^ Item2.GetHashCode();
        result = (result*397) ^ Item3.GetHashCode();
        result = (result*397) ^ Item4.GetHashCode();
        return result;
    }
}

public static bool operator ==(Quad<T1, T2, T3, T4> left, Quad<T1, T2, T3, T4> right)
{
    return Equals(left, right);
}

public static bool operator !=(Quad<T1, T2, T3, T4> left, Quad<T1, T2, T3, T4> right)
{
    return !Equals(left, right);
}
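Two details in the methods above may be worth a second look. The documentation for Object.Equals states that x.Equals(null) returns false, and GetHashCode() as written would throw if any item is null. Since the rest of the Quad class is not shown here, the following is only a sketch of a null-tolerant variant: the class body, constructor and QuadDemo are assumptions made for illustration, and the hash mixing is kept the same as above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal Quad; sealed so that a type check via "as" is exact.
sealed class Quad<T1, T2, T3, T4> : IEquatable<Quad<T1, T2, T3, T4>>
{
    public T1 Item1 { get; }
    public T2 Item2 { get; }
    public T3 Item3 { get; }
    public T4 Item4 { get; }

    public Quad(T1 item1, T2 item2, T3 item3, T4 item4)
    {
        Item1 = item1;
        Item2 = item2;
        Item3 = item3;
        Item4 = item4;
    }

    // Returns false for null and for objects of any other type.
    public override bool Equals(object obj)
    {
        return Equals(obj as Quad<T1, T2, T3, T4>);
    }

    public bool Equals(Quad<T1, T2, T3, T4> other)
    {
        if (ReferenceEquals(null, other)) return false;
        if (ReferenceEquals(this, other)) return true;
        return EqualityComparer<T1>.Default.Equals(Item1, other.Item1)
            && EqualityComparer<T2>.Default.Equals(Item2, other.Item2)
            && EqualityComparer<T3>.Default.Equals(Item3, other.Item3)
            && EqualityComparer<T4>.Default.Equals(Item4, other.Item4);
    }

    // EqualityComparer<T>.Default.GetHashCode returns 0 for a null item
    // instead of throwing.
    public override int GetHashCode()
    {
        unchecked
        {
            int result = EqualityComparer<T1>.Default.GetHashCode(Item1);
            result = (result * 397) ^ EqualityComparer<T2>.Default.GetHashCode(Item2);
            result = (result * 397) ^ EqualityComparer<T3>.Default.GetHashCode(Item3);
            result = (result * 397) ^ EqualityComparer<T4>.Default.GetHashCode(Item4);
            return result;
        }
    }
}

static class QuadDemo
{
    // Two equal quads (including a null component) collapse to one set element.
    public static int DistinctCount()
    {
        var set = new HashSet<Quad<int, string, string, int>>();
        set.Add(new Quad<int, string, string, int>(1, null, "x", 2));
        set.Add(new Quad<int, string, string, int>(1, null, "x", 2));
        return set.Count;
    }

    public static bool EqualsNull()
    {
        return new Quad<int, string, string, int>(1, "a", "b", 2).Equals((object)null);
    }

    static void Main()
    {
        Console.WriteLine(DistinctCount()); // 1
        Console.WriteLine(EqualsNull());    // False
    }
}
```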