objective-c 覆盖 isEqual: 和 hash 的最佳实践

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/254281/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-03 20:52:40  来源:igfitidea点击:

Best practices for overriding isEqual: and hash

objective-cequality

提问by Dave Dribin

How do you properly override isEqual:in Objective-C? The "catch" seems to be that if two objects are equal (as determined by the isEqual:method), they must have the same hash value.

你如何isEqual:在 Objective-C 中正确覆盖?“捕获”似乎是如果两个对象相等(由isEqual:方法确定),则它们必须具有相同的哈希值。

The Introspectionsection of the Cocoa Fundamentals Guidedoes have an example on how to override isEqual:, copied as follows, for a class named MyWidget:

Cocoa Fundamentals GuideIntrospection部分确实有一个关于如何覆盖 的示例,复制如下,对于名为 的类:isEqual:MyWidget

- (BOOL)isEqual:(id)other {
    if (other == self)
        return YES;
    if (!other || ![other isKindOfClass:[self class]])
        return NO;
    return [self isEqualToWidget:other];
}

- (BOOL)isEqualToWidget:(MyWidget *)aWidget {
    if (self == aWidget)
        return YES;
    if (![(id)[self name] isEqual:[aWidget name]])
        return NO;
    if (![[self data] isEqualToData:[aWidget data]])
        return NO;
    return YES;
}

It checks pointer equality, then class equality, and finally compares the objects using isEqualToWidget:, which only checks the nameand dataproperties. What the example doesn'tshow is how to override hash.

它先检查指针相等性,然后是类相等性,最后使用 比较对象isEqualToWidget:,它只检查namedata属性。该示例显示的是如何覆盖hash.

Let's assume there are other properties that do not affect equality, say age. Shouldn't the hashmethod be overridden such that only nameand dataaffect the hash? And if so, how would you do that? Just add the hashes of nameand data? For example:

让我们假设还有其他不影响平等的属性,比如age。如果没有hash方法被覆盖,使得只有namedata影响哈希?如果是这样,你会怎么做?只需添加的哈希namedata?例如:

- (NSUInteger)hash {
    NSUInteger hash = 0;
    hash += [[self name] hash];
    hash += [[self data] hash];
    return hash;
}

Is that sufficient? Is there a better technique? What if you have primitives, like int? Convert them to NSNumberto get their hash? Or structs like NSRect?

够了吗?有没有更好的技术?如果你有原语,比如int? 将它们转换NSNumber为获取它们的哈希值?或者像这样的结构NSRect

(Brain fart: Originally wrote "bitwise OR" them together with |=. Meant add.)

脑放屁:最初将它们与 一起写成“按位或” |=。意思是添加。)

采纳答案by tcurdt

Start with

从...开始

 NSUInteger prime = 31;
 NSUInteger result = 1;

Then for every primitive you do

然后对于你做的每一个原语

 result = prime * result + var

For objects you use 0 for nil and otherwise their hashcode.

对于对象,您使用 0 表示 nil,否则使用它们的哈希码。

 result = prime * result + [var hash];

For booleans you use two different values

对于布尔值,您使用两个不同的值

 result = prime * result + ((var)?1231:1237);


Explanation and Attribution

解释和归属

This is not tcurdt's work, and comments were asking for more explanation, so I believe an edit for attribution is fair.

这不是 tcurdt 的工作,评论要求更多解释,所以我相信对归属的编辑是公平的。

This algorithm was popularized in the book "Effective Java", and the relevant chapter can currently be found online here. That book popularized the algorithm, which is now a default in a number of Java applications (including Eclipse). It derived, however, from an even older implementation which is variously attributed to Dan Bernstein or Chris Torek. That older algorithm originally floated around on Usenet, and certain attribution is difficult. For example, there is some interesting commentary in this Apache code(search for their names) that references the original source.

该算法在《Effective Java》一书中得到了普及,目前可以在此处在线找到相关章节。那本书普及了该算法,该算法现在是许多 Java 应用程序(包括 Eclipse)的默认设置。然而,它源自一个更旧的实现,该实现被不同地归因于 Dan Bernstein 或 Chris Torek。那个较旧的算法最初在 Usenet 上流传,并且某些归因是困难的。例如,在这个 Apache 代码(搜索他们的名字)中有一些有趣的评论引用了原始来源。

Bottom line is, this is a very old, simple hashing algorithm. It is not the most performant, and it is not even proven mathematically to be a "good" algorithm. But it is simple, and a lot of people have used it for a long time with good results, so it has a lot of historical support.

底线是,这是一个非常古老、简单的散列算法。它不是性能最好的,甚至在数学上也没有证明它是一个“好”的算法。但是它很简单,而且很多人长期使用,效果很好,所以有很多历史支持。

回答by Brian B.

I'm just picking up Objective-C myself, so I can't speak for that language specifically, but in the other languages I use if two instances are "Equal" they must return the same hash - otherwise you are going to have all sorts of problems when trying to use them as keys in a hashtable (or any dictionary-type collections).

我自己只是在学习 Objective-C,所以我不能专门用这种语言说话,但是在我使用的其他语言中,如果两个实例“相等”,它们必须返回相同的哈希值 - 否则你将拥有所有尝试将它们用作哈希表(或任何字典类型的集合)中的键时会出现各种问题。

On the other hand, if 2 instances are not equal, they may or may not have the same hash - it is best if they don't. This is the difference between an O(1) search on a hash table and an O(N) search - if all your hashes collide you may find that searching your table is no better than searching a list.

另一方面,如果 2 个实例不相等,则它们可能具有或可能不具有相同的哈希值 - 最好不要。这是哈希表上的 O(1) 搜索和 O(N) 搜索之间的区别 - 如果所有哈希冲突,您可能会发现搜索表并不比搜索列表好。

In terms of best practices your hash should return a random distribution of values for its input. This means that, for example, if you have a double, but the majority of your values tend to cluster between 0 and 100, you need to make sure that the hashes returned by those values are evenly distributed across the entire range of possible hash values. This will significantly improve your performance.

就最佳实践而言,您的哈希应该为其输入返回值的随机分布。这意味着,例如,如果您有一个双精度值,但您的大多数值往往集中在 0 到 100 之间,您需要确保这些值返回的哈希值均匀分布在整个可能的哈希值范围内. 这将显着提高你的表现。

There are a number of hashing algorithms out there, including several listed here. I try to avoid creating new hash algorithms as it can have large performance implications, so using the existing hash methods and doing a bitwise combination of some sort as you do in your example is a good way to avoid it.

有许多散列算法,包括此处列出的几种。我尽量避免创建新的哈希算法,因为它可能会对性能产生很大的影响,因此使用现有的哈希方法并像您在示例中所做的那样进行某种按位组合是避免它的好方法。

回答by Yariv Nissim

A simple XOR over the hash values of critical properties is sufficient 99% of the time.

在 99% 的情况下,对关键属性的哈希值进行简单的异或就足够了。

For example:

例如:

- (NSUInteger)hash
{
    return [self.name hash] ^ [self.data hash];
}

Solution found at http://nshipster.com/equality/by Mattt Thompson (which also referred this question in his post!)

Mattt Thompson在http://nshipster.com/equality/ 上找到的解决方案(他也在他的帖子中提到了这个问题!)

回答by LavaSlider

I found this thread extremely helpful supplying everything I needed to get my isEqual:and hashmethods implemented with one catch. When testing object instance variables in isEqual:the example code uses:

我发现我需要让我的这个主题非常有帮助提供一切isEqual:hash一个抓落实的方法。在isEqual:示例代码中测试对象实例变量时使用:

if (![(id)[self name] isEqual:[aWidget name]])
    return NO;

This repeatedly failed (i.e., returned NO) without and error, when I knewthe objects were identical in my unit testing. The reason was, one of the NSStringinstance variables was nilso the above statement was:

当我知道对象在我的单元测试中是相同的时,这反复失败(返回NO)而没有错误。原因是,其中一个实例变量为零,所以上面的语句是:NSString

if (![nil isEqual: nil])
    return NO;

and since nilwill respond to any method, this is perfectly legal but

由于nil会响应任何方法,这是完全合法的,但是

[nil isEqual: nil]

returns nil, which is NO, so when both the object and the one being tested had a nilobject they would be considered not equal (i.e., isEqual:would return NO).

返回nil,它是NO,所以当对象和被测试的对象都有一个nil对象时,它们将被视为不相等(isEqual:将返回NO)。

This simple fix was to change the if statement to:

这个简单的修复是将 if 语句更改为:

if ([self name] != [aWidget name] && ![(id)[self name] isEqual:[aWidget name]])
    return NO;

This way, if their addresses are the same it skips the method call no matter if they are both nilor both pointing to the same object but if either is not nilor they point to different objects then the comparator is appropriately called.

这样,如果它们的地址相同,则无论它们都是nil还是都指向同一个对象,它都会跳过方法调用,但是如果它们不是nil或者它们指向不同的对象,则比较器会被适当地调用。

I hope this saves someone a few minutes of head scratching.

我希望这可以为某人节省几分钟的时间。

回答by Paul Solt

The hash function should create a semi-unique value that is not likely to collide or match another object's hash value.

散列函数应该创建一个不太可能与另一个对象的散列值冲突或匹配的半唯一值。

Here is the full hash function, which can be adapted to your classes instance variables. It uses NSUInteger's rather than int for compatibility on 64/32bit applications.

这是完整的哈希函数,它可以适应您的类实例变量。它使用 NSUInteger 而不是 int 来兼容 64/32 位应用程序。

If the result becomes 0 for different objects, you run the risk of colliding hashes. Colliding hashes can result in unexpected program behavior when working with some of the collection classes that depend on the hash function. Make sure to test your hash function prior to use.

如果不同对象的结果变为 0,则存在哈希冲突的风险。在使用某些依赖于散列函数的集合类时,碰撞散列可能会导致意外的程序行为。确保在使用前测试您的哈希函数。

-(NSUInteger)hash {
    NSUInteger result = 1;
    NSUInteger prime = 31;
    NSUInteger yesPrime = 1231;
    NSUInteger noPrime = 1237;

    // Add any object that already has a hash function (NSString)
    result = prime * result + [self.myObject hash];

    // Add primitive variables (int)
    result = prime * result + self.primitiveVariable; 

    // Boolean values (BOOL)
    result = prime * result + (self.isSelected?yesPrime:noPrime);

    return result;
}

回答by Jens Ayton

The easy but inefficient way is to return the same -hashvalue for every instance. Otherwise, yes, you must implement hash based only on objects which affect equality. This is tricky if you use lax comparisons in -isEqual:(e.g. case-insensitive string comparisons). For ints, you can generally use the int itself, unless you'll be comparing with NSNumbers.

简单但效率低下的方法是-hash为每个实例返回相同的值。否则,是的,您必须仅基于影响相等性的对象实现哈希。如果您在-isEqual:(例如不区分大小写的字符串比较)中使用松散的比较,这会很棘手。对于整数,您通常可以使用整数本身,除非您要与 NSNumbers 进行比较。

Don't use |=, though, it will saturate. Use ^= instead.

不要使用 |=,不过,它会饱和。使用 ^= 代替。

Random fun fact: [[NSNumber numberWithInt:0] isEqual:[NSNumber numberWithBool:NO]], but [[NSNumber numberWithInt:0] hash] != [[NSNumber numberWithBool:NO] hash]. (rdar://4538282, open since 05-May-2006)

随机有趣的事实:[[NSNumber numberWithInt:0] isEqual:[NSNumber numberWithBool:NO]],但是[[NSNumber numberWithInt:0] hash] != [[NSNumber numberWithBool:NO] hash]。(rdar://4538282, 自 2006 年 5 月 5 日起开放)

回答by user4951

Remember that you only need to provide hash that's equal when isEqualis true. When isEqualis false, the hash doesn't have to be unequal though presumably it is. Hence:

请记住,您只需要提供等于isEqualtrue时的哈希值。当isEqual为 false 时,散列不必是不相等的,尽管可能是这样。因此:

Keep hash simple. Pick a member (or few members) variable that is the most distinctive.

保持哈希简单。选择最独特的成员(或少数成员)变量。

For example, for CLPlacemark, the name only is enough. Yes there are 2 or 3 distincts CLPlacemark with the exact same name but those are rare. Use that hash.

例如,对于 CLPlacemark,仅名称就足够了。是的,有 2 或 3 个不同的 CLPlacemark 具有完全相同的名称,但很少见。使用该哈希。

@interface CLPlacemark (equal)
- (BOOL)isEqual:(CLPlacemark*)other;
@end

@implementation CLPlacemark (equal)

...

...

-(NSUInteger) hash
{
    return self.name.hash;
}


@end

Notice I do not bother specifying the city, the country, etc. The name is enough. Perhaps the name and CLLocation.

请注意,我不费心指定城市、国家等。名称就足够了。也许是名称和 CLLocation。

Hash should be evenly distributed. So you can combine several members variable using the caret ^ (xor sign)

哈希应该均匀分布。因此,您可以使用插入符号 ^(xor 符号)组合多个成员变量

So it's something like

所以它就像

hash = self.member1.hash ^ self.member2.hash ^ self.member3.hash

That way the hash will be evenly distributed.

这样散列将均匀分布。

Hash must be O(1), and not O(n)

So what to do in array?

那么在数组中做什么呢?

Again, simple. You do not have to hash all members of the array. Enough to hash the first element, last element, the count, maybe some middle elements, and that's it.

再次,简单。您不必散列数组的所有成员。足以散列第一个元素,最后一个元素,计数,可能还有一些中间元素,就是这样。

回答by Jonathan Ellis

Hold on, surely a far easier way to do this is to first override - (NSString )descriptionand provide a string representation of your object state (you must represent the entire state of your object in this string).

- (NSString )description等等,当然更简单的方法是首先覆盖并提供对象状态的字符串表示(您必须在此字符串中表示对象的整个状态)。

Then, just provide the following implementation of hash:

然后,只需提供以下实现hash

- (NSUInteger)hash {
    return [[self description] hash];
}

This is based on the principle that "if two string objects are equal (as determined by the isEqualToString: method), they must have the same hash value."

这是基于“如果两个字符串对象相等(由 isEqualToString: 方法确定),它们必须具有相同的哈希值”的原则。

Source: NSString Class Reference

来源:NSString 类参考

回答by jedwidz

The equals and hash contracts are well specified and thoroughly researched in the Java world (see @mipardi's answer), but all the same considerations should apply to Objective-C.

在 Java 世界中对 equals 和 hash 契约进行了详细说明和彻底研究(请参阅@mipardi 的回答),但所有相同的考虑因素都应适用于 Objective-C。

Eclipse does a reliable job of generating these methods in Java, so here's an Eclipse example ported by hand to Objective-C:

Eclipse 在用 Java 生成这些方法方面做得很可靠,所以这里有一个手工移植到 Objective-C 的 Eclipse 示例:

- (BOOL)isEqual:(id)object {
    if (self == object)
        return true;
    if ([self class] != [object class])
        return false;
    MyWidget *other = (MyWidget *)object;
    if (_name == nil) {
        if (other->_name != nil)
            return false;
    }
    else if (![_name isEqual:other->_name])
        return false;
    if (_data == nil) {
        if (other->_data != nil)
            return false;
    }
    else if (![_data isEqual:other->_data])
        return false;
    return true;
}

- (NSUInteger)hash {
    const NSUInteger prime = 31;
    NSUInteger result = 1;
    result = prime * result + [_name hash];
    result = prime * result + [_data hash];
    return result;
}

And for a subclass YourWidgetwhich adds a property serialNo:

对于YourWidget添加属性的子类serialNo

- (BOOL)isEqual:(id)object {
    if (self == object)
        return true;
    if (![super isEqual:object])
        return false;
    if ([self class] != [object class])
        return false;
    YourWidget *other = (YourWidget *)object;
    if (_serialNo == nil) {
        if (other->_serialNo != nil)
            return false;
    }
    else if (![_serialNo isEqual:other->_serialNo])
        return false;
    return true;
}

- (NSUInteger)hash {
    const NSUInteger prime = 31;
    NSUInteger result = [super hash];
    result = prime * result + [_serialNo hash];
    return result;
}

This implementation avoids some subclassing pitfalls in the sample isEqual:from Apple:

此实现避免了isEqual:Apple示例中的一些子类化陷阱:

  • Apple's class test other isKindOfClass:[self class]is asymmetric for two different subclasses of MyWidget. Equality needs to be symmetric: a=b if and only if b=a. This could easily be fixed by changing the test to other isKindOfClass:[MyWidget class], then all MyWidgetsubclasses would be mutually comparable.
  • Using an isKindOfClass:subclass test prevents subclasses from overriding isEqual:with a refined equality test. This is because equality needs to be transitive: if a=b and a=c then b=c. If a MyWidgetinstance compares equal to two YourWidgetinstances, then those YourWidgetinstances must compare equal to each other, even if their serialNodiffers.
  • Apple 的类测试other isKindOfClass:[self class]对于 的两个不同子类是不对称的MyWidget。相等必须是对称的:a=b 当且仅当 b=a。这可以通过将测试更改为 轻松解决other isKindOfClass:[MyWidget class],然后所有MyWidget子类将相互比较。
  • 使用isKindOfClass:子类测试可以防止子类isEqual:被改进的相等性测试覆盖。这是因为等式必须是可传递的:如果 a=b 且 a=c,则 b=c。如果一个MyWidget实例比较等于两个YourWidget实例,那么这些YourWidget实例必须相互比较,即使它们serialNo不同。

The second issue can be fixed by only considering objects to be equal if they belong to the exact same class, hence the [self class] != [object class]test here. For typical application classes, this seems to be the best approach.

如果对象属于完全相同的类,则可以通过仅将对象视为相等来解决第二个问题,因此[self class] != [object class]这里进行测试。对于典型的应用程序类,这似乎是最好的方法。

However, there certainly are cases where the isKindOfClass:test is preferable. This is more typical of framework classesthan application classes. For example, any NSStringshould compare equal to any other other NSStringwith the same underlying character sequence, regardless of the NSString/NSMutableStringdistinction, and also regardless of what private classes in the NSStringclass cluster are involved.

然而,肯定有isKindOfClass:测试更可取的情况。框架类比应用程序类更典型。例如, anyNSString应该NSString与具有相同底层字符序列的任何其他比较相等,无论NSString/NSMutableString区别如何,也无论NSString涉及类簇中的哪些私有类。

In such cases, isEqual:should have well-defined, well-documented behaviour, and it should be made clear that subclasses can't override this. In Java, the 'no override' restriction can be enforced by flagging the equals and hashcode methods as final, but Objective-C has no equivalent.

在这种情况下,isEqual:应该有明确定义、有据可查的行为,并且应该明确子类不能覆盖它。在 Java 中,可以通过将 equals 和 hashcode 方法标记为 来强制执行“无覆盖”限制final,但 Objective-C 没有等效项。

回答by mipadi

I've found this pageto be a helpful guide in override equals- and hash-type methods. It includes a decent algorithm for calculating hash codes. The page is geared towards Java, but it's pretty easy to adapt it to Objective-C/Cocoa.

我发现这个页面是重写 equals 和 hash 类型方法的有用指南。它包括一个用于计算哈希码的体面算法。该页面面向 Java,但很容易使其适应 Objective-C/Cocoa。