.NET: create a hash code of two numbers
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/892618/
Create a hashcode of two numbers
Asked by JDunkerley
I am trying to create a quick hashcode function for a complex number class (a + b) in C#.
I have repeatedly seen the a.GetHashCode() ^ b.GetHashCode() method.
But this will give the same hashcode for (a,b) and (b,a).
Is there a standard algorithm to do this, and are there any functions in the .NET Framework to help?
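To make the problem concrete, here is a minimal sketch of the XOR combination described above. The class and method names (XorHashDemo, XorHash) are hypothetical, and int components are assumed only to keep the arithmetic easy to follow.

```csharp
public static class XorHashDemo
{
    // The XOR combination from the question, applied to two int components.
    // XOR is commutative, so swapping the arguments cannot change the result,
    // and XOR-ing a value with itself always yields 0.
    public static int XorHash(int a, int b)
    {
        return a.GetHashCode() ^ b.GetHashCode();
    }
}
```

Since Int32.GetHashCode() returns the value itself, XorHash(1, 2) and XorHash(2, 1) both evaluate to 3, and XorHash(5, 5) and XorHash(9, 9) both evaluate to 0.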
Answered by Jon Skeet
My normal way of creating a hashcode for an arbitrary set of hashable items:
int hash = 23;
hash = hash * 31 + item1Hash;
hash = hash * 31 + item2Hash;
hash = hash * 31 + item3Hash;
hash = hash * 31 + item4Hash;
hash = hash * 31 + item5Hash;
// etc
In your case item1Hash could just be a, and item2Hash could just be b.
The values of 23 and 31 are relatively unimportant, so long as they're primes (or at least coprime).
Obviously there will still be collisions, but you don't run into the normal nasty problems of:
hash(a, a) == hash(b, b)
hash(a, b) == hash(b, a)
If you know more about what the real values of a and b are likely to be, you can probably do better, but this is a good initial implementation which is easy to remember and implement. Note that if there's any chance that you'll build the assembly with "check for arithmetic overflow/underflow" ticked, you should put it all in an unchecked block. (Overflow is fine for this algorithm.)
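As a rough sketch, the approach above could be applied to a complex number type like this. The type name ComplexNumber and its double fields Real and Imaginary are assumptions for illustration; the thread does not show the asker's class.

```csharp
using System;

// Hypothetical complex-number type; the name and the double fields are assumptions.
public struct ComplexNumber : IEquatable<ComplexNumber>
{
    public readonly double Real;
    public readonly double Imaginary;

    public ComplexNumber(double real, double imaginary)
    {
        Real = real;
        Imaginary = imaginary;
    }

    public bool Equals(ComplexNumber other)
    {
        return Real.Equals(other.Real) && Imaginary.Equals(other.Imaginary);
    }

    public override bool Equals(object obj)
    {
        return obj is ComplexNumber && Equals((ComplexNumber)obj);
    }

    public override int GetHashCode()
    {
        // unchecked: overflow is expected and harmless here, even when the
        // assembly is built with arithmetic overflow checking enabled.
        unchecked
        {
            int hash = 23;
            hash = hash * 31 + Real.GetHashCode();
            hash = hash * 31 + Imaginary.GetHashCode();
            return hash;
        }
    }
}
```

Because each step multiplies the running value before adding the next component, the order of the components affects the result, so swapping Real and Imaginary normally produces a different hash code.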
Answered by Noldorin
Here's a possible approach that takes into account order. (The second method is defined as an extension method.)
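public override int GetHashCode()
{
    // Rotate b's hash by 16 bits so that (a, b) and (b, a) hash differently.
    uint rotated = ((uint)b.GetHashCode()).RotateLeft(16);
    return a.GetHashCode() ^ unchecked((int)rotated);
}

// Extension method: it must be declared inside a static class.
public static uint RotateLeft(this uint value, int count)
{
    return (value << count) | (value >> (32 - count));
}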
It would certainly be interesting to see how the Complex class of .NET 4.0 does it.
Answered by Lasse V. Karlsen
One standard way is this:
hashcode = 23
hashcode = (hashcode * 37) + v1
hashcode = (hashcode * 37) + v2
23 and 37 are coprime, but you can use other numbers as well.
Answered by Welbog
What about this:
(a.GetHashCode() + b).GetHashCode()
This gives you a different code for (a,b) and (b,a), plus it's not really that fancy.
Answered by Stephen Swensen
@JonSkeet gives a fair, general-purpose algorithm for computing a hash code from n hash codes, but it assumes you already know which members of an object need to be hashed and what to do about null members, and it omits an implementation for n arbitrary items. So we expand upon his answer:
- Only public, immutable properties and fields should contribute to an object's hash code. They should be public (or isomorphic to the public surface) because two objects with the same visible surface should have the same hash code (this hints at the relationship between object equality and hash code equality), and they should be immutable because an object's hash code should never change during its lifetime (otherwise you might end up with an object in the wrong slot of a hash table!).
- null members should hash as a constant, such as 0.
- @JonSkeet's algorithm is a textbook example of applying the functional-programming higher-order function usually called fold (Aggregate in C# LINQ), where 23 is our seed and <hash accumulator> * 31 + <current item hash> is our folding function:
In F#
let computeHashCode items =
    items
    |> Seq.map (fun item -> if item = null then 0 else item.GetHashCode())
    |> Seq.fold (fun hash itemHash -> hash * 31 + itemHash) 23
In C#
Func<IEnumerable<Object>, int> computeHashCode = items =>
    items
        .Select(item => item == null ? 0 : item.GetHashCode())
        .Aggregate(23, (hash, itemHash) => hash * 31 + itemHash);
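As a usage sketch, the fold above could be wrapped in a helper and called from a class whose public, immutable members feed the hash. HashHelper, ComputeHashCode, and Person are hypothetical names that do not appear in the thread, and the unchecked wrapper is an addition based on the overflow note earlier in the discussion.

```csharp
using System;
using System.Linq;

public static class HashHelper
{
    // Folds the item hashes with seed 23 and multiplier 31; null hashes as 0.
    public static int ComputeHashCode(params object[] items)
    {
        return items
            .Select(item => item == null ? 0 : item.GetHashCode())
            .Aggregate(23, (hash, itemHash) => unchecked(hash * 31 + itemHash));
    }
}

// Hypothetical immutable type: only its public, read-only members are hashed.
public sealed class Person
{
    public string Name { get; }
    public int Age { get; }

    public Person(string name, int age)
    {
        Name = name;
        Age = age;
    }

    public override bool Equals(object obj)
    {
        return obj is Person other && other.Name == Name && other.Age == Age;
    }

    public override int GetHashCode()
    {
        return HashHelper.ComputeHashCode(Name, Age);
    }
}
```

Note that passing value types such as Age through params object[] boxes them, so this form trades a little speed for convenience.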
Answered by nawfal
It all depends on what you're trying to achieve. If the hashes are meant for hash-based structures like Dictionary, you have to balance collision rate against hashing speed. A perfect hash with no collisions at all will take more time to compute; conversely, the fastest hashing algorithm will have relatively more collisions. Finding the right balance is the key here. You should also take into consideration how large your effective hash can be, and whether hashing should be reversible! Noldorin's approach gives you a perfect hash (read: no collisions) if the real and imaginary parts of your complex number are always positive. It will do even for negative numbers if you're OK with the rare collisions. But I'm concerned about the range of values it can yield, which is quite big for my taste.
If you're after perfect hashes (out of some academic/research interest) that should work even for negative numbers, you can see this solution (and an array of other solutions in the same thread). In my tests, it is faster and utilizes space better than any other I have seen.
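To illustrate what a collision-free ("perfect") and reversible hash can look like in the simplest case, here is a sketch that packs two components into one 32-bit value. It assumes both components are integers in the range 0 to 65535, which the thread does not state, and it is not the linked solution; PackedHash, Pack, and Unpack are hypothetical names.

```csharp
public static class PackedHash
{
    // Places a in the upper 16 bits and b in the lower 16 bits.
    // Every distinct (a, b) pair of 16-bit values maps to a distinct int.
    public static int Pack(ushort a, ushort b)
    {
        return (a << 16) | b;
    }

    // Recovers the original pair, showing that the mapping is reversible.
    public static void Unpack(int packed, out ushort a, out ushort b)
    {
        a = (ushort)((uint)packed >> 16);
        b = (ushort)(packed & 0xFFFF);
    }
}
```

For example, Pack(1, 2) is 65538 while Pack(2, 1) is 131073. The price of having no collisions is that the results spread over the whole 32-bit range, which is the concern about the range of values raised above.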

