C#: Generate hash of object consistently

Notice: this page reproduces a popular StackOverflow question, originally presented as a Chinese-English parallel translation. It is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/12393467/

Time: 2020-08-09 23:06:48  Source: igfitidea

Generate hash of object consistently

c# .net hash

Asked by Alex

I'm trying to get a hash (md5 or sha) of an object.

I've implemented this: http://alexmg.com/post/2009/04/16/Compute-any-hash-for-any-object-in-C.aspx

I'm using nHibernate to retrieve my POCOs from a database.
When running GetHash on this, it's different each time it's selected and hydrated from the database. I guess this is expected, as the underlying proxies will change.

Anyway,

Is there a way to get a hash of all the properties on an object, consistently each time?

I've toyed with the idea of using a StringBuilder over this.GetType().GetProperties..... and creating a hash on that, but that seems inefficient?

As a side note, this is for change-tracking these entities from one database (RDBMS) to a NoSQL store (comparing hash values to see whether objects changed between the RDBMS and the NoSQL store).

Accepted answer by Peter Ritchie

If you're not overriding GetHashCode, you just inherit Object.GetHashCode. For a reference type, Object.GetHashCode basically returns a value tied to the identity of the instance (often described as its memory address), not to its contents. Of course, each time an object is loaded it is a new instance, likely in a different part of memory, and thus results in a different hash code.

It's debatable whether that's the correct thing to do; but that's what was implemented "back in the day" so it can't change now.

If you want something consistent then you have to override GetHashCode and create a code based on the "value" of the object (i.e. the properties and/or fields). This can be as simple as a distributed merging of the hash codes of all the properties/fields. Or, it could be as complicated as you need it to be. If all you're looking for is something to differentiate two different objects, then using a unique key on the object might work for you. If you're looking for change tracking, using the unique key for the hash probably isn't going to work.

I simply use all the hash codes of the fields to create a reasonably distributed hash code for the parent object. For example:

public override int GetHashCode()
{
    unchecked
    {
        int result = (Name != null ? Name.GetHashCode() : 0);
        result = (result*397) ^ (Street != null ? Street.GetHashCode() : 0);
        result = (result*397) ^ Age;
        return result;
    }
}

The prime number 397 is used as a multiplier so that each value is mixed into the running result, which helps distribute the resulting hash codes better. See http://computinglife.wordpress.com/2008/11/20/why-do-hash-functions-use-prime-numbers/ for more details on the use of primes in hash code calculations.
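
One caveat for the change-tracking scenario in the question: string.GetHashCode is not guaranteed to return the same value across processes or runtime versions, and on .NET Core it is randomized per process by default. The following is a minimal sketch, not part of the original answer, that keeps the same multiply-and-xor merge but replaces string.GetHashCode with a 32-bit FNV-1a hash over UTF-8 bytes. The names StableHash, Person and GetStableHash are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Text;

// Hypothetical helper (an assumption for illustration, not from the answer).
public static class StableHash
{
    private const uint FnvOffsetBasis = 2166136261;
    private const uint FnvPrime = 16777619;

    // 32-bit FNV-1a over the UTF-8 bytes of the string; null is hashed like "".
    public static int OfString(string value)
    {
        unchecked
        {
            uint hash = FnvOffsetBasis;
            foreach (byte b in Encoding.UTF8.GetBytes(value ?? string.Empty))
            {
                hash ^= b;
                hash *= FnvPrime;
            }
            return (int)hash;
        }
    }

    // Same multiply-by-397-then-xor merge as in the example above.
    public static int Combine(int current, int next)
    {
        unchecked { return (current * 397) ^ next; }
    }
}

// Hypothetical entity mirroring the Name/Street/Age members used above.
public sealed class Person
{
    public string Name { get; set; }
    public string Street { get; set; }
    public int Age { get; set; }

    // Deliberately a separate method: GetHashCode is left untouched here.
    public int GetStableHash()
    {
        int result = StableHash.OfString(Name);
        result = StableHash.Combine(result, StableHash.OfString(Street));
        result = StableHash.Combine(result, Age);
        return result;
    }
}
```

A 32-bit value like this can still collide, so it is only a cheap first check; it is not a substitute for a cryptographic digest if collisions matter.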

You could, of course, use reflection to get at all the properties to do this, but that would be slower. Alternatively, you could use the CodeDOM to generate code dynamically that computes the hash based on reflecting over the properties, and cache that code (i.e. generate it once and reload it next time). But this, of course, is very complex and might not be worth the effort.
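
A middle ground between plain reflection and code generation is to reflect once per type and cache the result. The sketch below is an illustration under assumptions, not the answer's method: PropertyCache, For and ReadAll are hypothetical names, and properties are sorted by name so the order is fixed. If the objects are ORM proxies, entity.GetType() may return the proxy type, in which case passing the entity type to For explicitly may be preferable.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Reflection;

// Hypothetical helper: caches readable public instance properties per type.
public static class PropertyCache
{
    private static readonly ConcurrentDictionary<Type, PropertyInfo[]> Cache =
        new ConcurrentDictionary<Type, PropertyInfo[]>();

    // Reflection runs only the first time a type is requested.
    public static PropertyInfo[] For(Type type)
    {
        return Cache.GetOrAdd(type, t => t
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.CanRead && p.GetIndexParameters().Length == 0)
            .OrderBy(p => p.Name, StringComparer.Ordinal) // fixed, culture-independent order
            .ToArray());
    }

    // Reads every cached property of the object as (name, value) pairs.
    public static (string Name, object Value)[] ReadAll(object entity)
    {
        if (entity == null) throw new ArgumentNullException(nameof(entity));
        return For(entity.GetType())
            .Select(p => (p.Name, p.GetValue(entity)))
            .ToArray();
    }
}
```

The pairs returned by ReadAll can then be fed into whichever hash combination is chosen; PropertyInfo.GetValue is still slower than a compiled getter, so this only removes the repeated lookup cost.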

An MD5 or SHA hash or CRC is generally based on a block of data. If you want that, then using the hash code of each property doesn't make sense. Possibly serializing the data to memory and calculating the hash that way would be more applicable, as Henk describes.
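
As a rough illustration of the serialize-then-hash idea (a minimal sketch, not Henk's code, which is not reproduced on this page), the example below writes the fields of an entity to a memory stream in a fixed order and computes a SHA-256 digest of the bytes. Customer, ContentHash and Sha256Of are hypothetical names; the field list is an assumption modelled on the Name/Street/Age example above.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

// Hypothetical entity used only for this sketch.
public sealed class Customer
{
    public string Name { get; set; }
    public string Street { get; set; }
    public int Age { get; set; }
}

public static class ContentHash
{
    // Serializes the fields in a fixed order and returns a lowercase hex SHA-256.
    public static string Sha256Of(Customer c)
    {
        if (c == null) throw new ArgumentNullException(nameof(c));

        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream, Encoding.UTF8))
        {
            WriteString(writer, c.Name);
            WriteString(writer, c.Street);
            writer.Write(c.Age);
            writer.Flush();

            using (var sha = SHA256.Create())
            {
                byte[] digest = sha.ComputeHash(stream.ToArray());
                return BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant();
            }
        }
    }

    // A presence flag keeps null and "" distinct; BinaryWriter length-prefixes
    // the string, so adjacent fields cannot run into each other.
    private static void WriteString(BinaryWriter writer, string value)
    {
        writer.Write(value != null);
        if (value != null) writer.Write(value);
    }
}
```

Because the digest depends only on the written bytes, two objects loaded from different stores should produce the same string as long as both sides serialize the same fields in the same order and with the same encoding.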

Answer by Rich O'Kelly

If this 'hash' is solely used to determine whether entities have changed then the following algorithm may help (NB it is untested and assumes that the same runtime will be used when generating hashes (otherwise the reliance on GetHashCode for 'simple' types is incorrect)):

public static byte[] Hash<T>(T entity) 
{
  var seen = new HashSet<object>();
  var properties = GetAllSimpleProperties(entity, seen);
  return properties.Select(p => BitConverter.GetBytes(p.GetHashCode()).AsEnumerable()).Aggregate((ag, next) => ag.Concat(next)).ToArray();
}

private static IEnumerable<object> GetAllSimpleProperties<T>(T entity, HashSet<object> seen)
{
  foreach (var property in PropertiesOf<T>.All(entity))
  {
    if (property is int || property is long || property is string ...) yield return property;
    else if (seen.Add(property)) // Handle cyclic references
    {
      foreach (var simple in GetAllSimpleProperties(property, seen)) yield return simple;
    }
  }
}

private static class PropertiesOf<T>
{
  private static readonly List<Func<T, dynamic>> Properties = new List<Func<T, dynamic>>();

  static PropertiesOf()
  {
    foreach (var property in typeof(T).GetProperties())
    {
      var getMethod = property.GetGetMethod();
      var function = (Func<T, dynamic>)Delegate.CreateDelegate(typeof(Func<T, dynamic>), getMethod);
      Properties.Add(function);
    }
  }

  public static IEnumerable<dynamic> All(T entity) 
  {
    return Properties.Select(p => p(entity)).Where(v => v != null);
  }
} 

This would then be useable like so:

var entity1 = LoadEntityFromRdbms();
var entity2 = LoadEntityFromNoSql();
var hash1 = Hash(entity1);
var hash2 = Hash(entity2);
Assert.IsTrue(hash1.SequenceEqual(hash2));

Answer by paparazzo

GetHashCode() returns an Int32 (not an MD5).

If you create two objects with all the same property values, they will generally not have the same hash code if you rely on the base (system) GetHashCode().

String is an object too, but it is an exception: strings with the same characters compare equal and return the same hash code.

string s1 = "john";
string s2 = "john";
// (s1 == s2) is true, and s1.GetHashCode() == s2.GetHashCode()

If you want to control equality comparison of two objects then you should override both GetHashCode and Equals.

If two objects are equal then they must also have the same GetHashCode(). But two objects with the same GetHashCode() are not necessarily equal. A hash-based comparison will first test GetHashCode(), and if it gets a match there it will test Equals. OK, there are some comparisons that go straight to Equals, but you should still override both and make sure two identical objects produce the same GetHashCode().

I use this for syncing a client with the server. You could use all the properties, or you could have any property change update the VerID. The advantage of the latter is a simpler, quicker GetHashCode(). In my case I was already resetting the VerID on any property change.

    public override bool Equals(Object obj)
    {
        //Check for null and compare run-time types.
        if (obj == null || !(obj is FTSdocWord)) return false;
        FTSdocWord item = (FTSdocWord)obj;
        return (ObjID == item.ObjID && VerID == item.VerID);
    }
    public override int GetHashCode()
    {
        return ObjID ^ VerID;
    }
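
The "any property change updates the VerID" idea can be sketched as follows. This is an assumption-laden illustration rather than the answerer's actual class: VersionedDoc and Title are hypothetical, while ObjID and VerID follow the names used above.

```csharp
using System;

// Hypothetical entity; only ObjID and VerID come from the answer's naming.
public sealed class VersionedDoc
{
    private string _title;

    public VersionedDoc(int objId) { ObjID = objId; }

    public int ObjID { get; }
    public int VerID { get; private set; }

    public string Title
    {
        get { return _title; }
        set
        {
            if (value == _title) return; // unchanged value: keep the version
            _title = value;
            VerID++;                     // any real change bumps the version
        }
    }

    public override bool Equals(object obj)
    {
        var other = obj as VersionedDoc;
        return other != null && ObjID == other.ObjID && VerID == other.VerID;
    }

    public override int GetHashCode()
    {
        return ObjID ^ VerID;
    }
}
```

Note that the hash code here changes whenever the object is modified, so such an instance should not be mutated while it is stored as a key in a Dictionary or in a HashSet.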

I ended up using ObjID alone for equality, so I could do the following:

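// Note: == only follows the overridden Equals if operator == is overloaded as well;
// otherwise call myClientObj.Equals(myServerObj) instead.
if (myClientObj == myServerObj && myClientObj.VerID != myServerObj.VerID)
{
   // need to sync
}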

Object.GetHashCode Method

Two objects with the same property values. Are they equal? Do they produce the same GetHashCode()?

            personDefault pd1 = new personDefault("John");
            personDefault pd2 = new personDefault("John");
            System.Diagnostics.Debug.WriteLine(pd1.GetHashCode().ToString());
            System.Diagnostics.Debug.WriteLine(pd2.GetHashCode().ToString());
            // different GetHashCode
            if (pd1.Equals(pd2))  // returns false
            {
                System.Diagnostics.Debug.WriteLine("pd1 == pd2");
            }
            List<personDefault> personsDefault = new List<personDefault>();
            personsDefault.Add(pd1);
            if (personsDefault.Contains(pd2))  // returns false
            {
                System.Diagnostics.Debug.WriteLine("Contains(pd2)");
            }

            personOverRide po1 = new personOverRide("John");
            personOverRide po2 = new personOverRide("John");
            System.Diagnostics.Debug.WriteLine(po1.GetHashCode().ToString());
            System.Diagnostics.Debug.WriteLine(po2.GetHashCode().ToString());  
            // same hash
            if (po1.Equals(po2))  // returns true
            {
                System.Diagnostics.Debug.WriteLine("po1 == po2");
            }
            List<personOverRide> personsOverRide = new List<personOverRide>();
            personsOverRide.Add(po1);
            if (personsOverRide.Contains(po2))  // returns true
            {
                System.Diagnostics.Debug.WriteLine("Contains(po2)");
            }
        }



        public class personDefault
        {
            public string Name { get; private set; }
            public personDefault(string name) { Name = name; }
        }

        public class personOverRide: Object
        {
            public string Name { get; private set; }
            public personOverRide(string name) { Name = name; }

            public override bool Equals(Object obj)
            {
                //Check for null and compare run-time types.
                if (obj == null || !(obj is personOverRide)) return false;
                personOverRide item = (personOverRide)obj;
                return (Name == item.Name);
            }
            public override int GetHashCode()
            {
                return Name.GetHashCode();
            }
        }