How to implement a good __hash__ function in Python
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/4005318/
Asked by abahgat
When implementing a class with multiple properties (like in the toy example below), what is the best way to handle hashing?
I guess that __eq__ and __hash__ should be consistent, but how do I implement a proper hash function that is capable of handling all the properties?
    class AClass:
        def __init__(self):
            self.a = None
            self.b = None

        def __eq__(self, other):
            return other and self.a == other.a and self.b == other.b

        def __ne__(self, other):
            return not self.__eq__(other)

        def __hash__(self):
            return hash((self.a, self.b))
I read in this question that tuples are hashable, so I was wondering if something like the example above is sensible. Is it?
Accepted answer by adw
__hash__ should return the same value for objects that are equal. It also shouldn't change over the lifetime of the object; generally you only implement it for immutable objects.
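To see why the hash should not change over an object's lifetime, here is a minimal sketch. MutablePoint is a hypothetical class invented for this illustration (it is not from the question); it hashes attributes that can still be reassigned after the object has been placed in a set.

```python
class MutablePoint:
    """Hypothetical class whose hash depends on attributes that can change."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, MutablePoint):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))


p = MutablePoint(1, 2)
seen = {p}
found_before = p in seen  # looked up with the hash the set stored

p.x = 99                  # mutating p changes its hash while it is in the set
found_after = p in seen   # looked up with the new hash, so the entry is missed

# The object is still physically stored in the set; it just cannot be found.
still_stored = any(item is p for item in seen)
```

After the mutation the set still holds the object, but a membership test no longer finds it, which is the kind of silent breakage the advice about immutable objects is meant to prevent.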
A trivial implementation would be to just return 0. This is always correct, but performs badly, because every object then has the same hash and they all collide.
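The cost of a constant hash can be made visible by counting how often __eq__ runs while a set is built. The sketch below is only an illustration: Key, its constant_hash flag, and count_eq_calls are hypothetical names, and the exact counts depend on the interpreter's set implementation.

```python
class Key:
    """Hypothetical key type that counts how often __eq__ is invoked."""

    eq_calls = 0  # class-level counter, reset before each measurement

    def __init__(self, value, constant_hash):
        self.value = value
        self.constant_hash = constant_hash

    def __eq__(self, other):
        Key.eq_calls += 1
        return isinstance(other, Key) and self.value == other.value

    def __hash__(self):
        # Returning 0 is valid (equal objects share a hash) but collides always.
        return 0 if self.constant_hash else hash(self.value)


def count_eq_calls(constant_hash, n=200):
    """Build a set of n distinct keys and report how many comparisons ran."""
    Key.eq_calls = 0
    keys = {Key(i, constant_hash) for i in range(n)}
    assert len(keys) == n  # both variants are correct, only speed differs
    return Key.eq_calls


calls_constant = count_eq_calls(constant_hash=True)
calls_varied = count_eq_calls(constant_hash=False)
```

With the constant hash, each insertion has to be compared against the keys already stored, so the number of comparisons grows much faster than the number of keys.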
Your solution, returning the hash of a tuple of properties, is good. But note that you don't need to list in the tuple every property that you compare in __eq__. If some property usually has the same value for unequal objects, just leave it out. Don't make the hash computation any more expensive than it needs to be.
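As a rough sketch of hashing fewer properties than __eq__ compares, consider the hypothetical Employee class below (the class and its field names are assumptions made for illustration). The country field is assumed to be the same for most instances, so it is left out of the hash.

```python
class Employee:
    """Hypothetical example: __eq__ uses three fields, __hash__ only two."""

    def __init__(self, employee_id, name, country):
        self.employee_id = employee_id
        self.name = name
        self.country = country

    def __eq__(self, other):
        if not isinstance(other, Employee):
            return NotImplemented
        return (self.employee_id, self.name, self.country) == (
            other.employee_id, other.name, other.country)

    def __hash__(self):
        # country rarely differs between unequal objects, so it is skipped.
        # Equal objects still hash equally, because they agree on these fields.
        return hash((self.employee_id, self.name))


e1 = Employee(7, "Ada", "UK")
e2 = Employee(7, "Ada", "UK")
e3 = Employee(7, "Ada", "FR")  # differs only in the field left out of the hash

staff = {e1, e2, e3}  # e1 and e2 are equal, e3 collides but is kept separately
```

Objects that differ only in the omitted field collide, but __eq__ still tells them apart, so the result stays correct.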
Edit: I would recommend against using xor to mix hashes in general. When two different properties have the same value, they will have the same hash, and with xor these will cancel each other out. Tuples use a more complex calculation to mix hashes; see tuplehash in tupleobject.c.
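The cancellation is easy to reproduce. In this small sketch, xor_hash is a hypothetical helper that mixes two component hashes with xor, compared against hashing a tuple of the same components.

```python
def xor_hash(a, b):
    """Hypothetical helper: mix two component hashes with exclusive or."""
    return hash(a) ^ hash(b)


# Whenever both components are equal, their hashes cancel out to 0.
xor_hashes = {xor_hash(n, n) for n in range(100)}

# xor is also symmetric, so swapping the components gives the same hash.
swapped_same = xor_hash(1, 2) == xor_hash(2, 1)

# Hashing a tuple mixes the components in an order-dependent way instead.
tuple_hashes = {hash((n, n)) for n in range(100)}
```

Here all one hundred xor-based hashes collapse to a single value, while the tuple-based hashes do not.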
Answer by S.Lott
Documentation for object.__hash__(self):
The only required property is that objects which compare equal have the same hash value; it is advised to somehow mix together (e.g. using exclusive or) the hash values for the components of the object that also play a part in comparison of objects.
    def __hash__(self):
        return hash(self.a) ^ hash(self.b)
Answer by max
It's dangerous to write:
    def __eq__(self, other):
        return other and self.a == other.a and self.b == other.b
because if your right-hand side (i.e., other) object evaluates to boolean False, it will never compare as equal to anything!
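A minimal sketch of that failure mode follows. It reuses the question's AClass, with constructor arguments added as an assumption to keep the example short, plus a hypothetical subclass FalsyClass whose instances evaluate to False.

```python
class AClass:
    # Constructor arguments are an assumption added for this illustration.
    def __init__(self, a=None, b=None):
        self.a = a
        self.b = b

    def __eq__(self, other):
        # The pattern under discussion, kept on purpose to show the problem.
        return other and self.a == other.a and self.b == other.b

    def __hash__(self):
        return hash((self.a, self.b))


class FalsyClass(AClass):
    """Hypothetical subclass whose instances are falsy."""

    def __bool__(self):
        return False


x = FalsyClass(1, 2)
y = FalsyClass(1, 2)

# `other and ...` short-circuits on the falsy y and returns y itself,
# so x == y is falsy even though every attribute matches.
result = (x == y)

# Comparing with None returns None rather than the boolean False.
none_result = (AClass(1, 2) == None)
```

Besides the wrong answer for falsy objects, the comparison does not return a real boolean in these cases, which can surprise callers that test the result with `is False`.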
In addition, you might want to double check that other belongs to the class AClass or one of its subclasses. If it doesn't, you'll get either an AttributeError exception or a false positive (if the other class happens to have same-named attributes with matching values). So I would recommend rewriting __eq__ as:
    def __eq__(self, other):
        return isinstance(other, self.__class__) and self.a == other.a and self.b == other.b
If by any chance you want an unusually flexible comparison, which compares across unrelated classes as long as attributes match by name, you'd still want to at least avoid AttributeError and check that other doesn't have any additional attributes. How you do it depends on the situation (since there's no standard way to find all attributes of an object).
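One possible way to sketch such a flexible comparison is shown below. It assumes that all relevant attributes live in the instance __dict__ (so it ignores __slots__, properties, and class attributes); FlexibleEq and Unrelated are hypothetical names used only for illustration.

```python
class FlexibleEq:
    """Hypothetical class that compares by attribute names and values."""

    def __init__(self, a=None, b=None):
        self.a = a
        self.b = b

    def __eq__(self, other):
        try:
            # vars() raises TypeError for objects without a __dict__.
            other_attrs = vars(other)
        except TypeError:
            return NotImplemented
        # Dict equality requires the same names and equal values, so extra
        # attributes on either side make the objects unequal.
        return vars(self) == other_attrs

    def __hash__(self):
        return hash((self.a, self.b))


class Unrelated:
    """Hypothetical unrelated class with same-named attributes."""

    def __init__(self, a, b):
        self.a = a
        self.b = b


same = FlexibleEq(1, 2) == Unrelated(1, 2)

extra = Unrelated(1, 2)
extra.c = 3  # an additional attribute breaks the match
with_extra = FlexibleEq(1, 2) == extra

with_int = FlexibleEq(1, 2) == 5  # no __dict__, no AttributeError either
```

Note that for such objects to behave consistently as dictionary keys or set members, the unrelated class would also need a compatible __hash__, which this sketch does not address.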

