java 如果我们只覆盖一个类中的 hashCode() 并在一个 Set 中使用它会发生什么?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/26302169/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-11-02 09:40:26  来源:igfitidea点击:

What happens if we override only hashCode() in a class and use it in a Set?

javacollectionsset

提问by kumar

This may not be the real world scenario but just curious to know what happens, below is the code.

这可能不是真实世界的场景,只是想知道会发生什么,下面是代码。

I am creating a set of object of class UsingSet. According to hashing concept in Java, when I first add object which contains "a", it will create a bucket with hashcode 97 and put the object inside it. Again when it encounters an object with "a", it will call the overridden hashcode method in the class UsingSet and it will get hashcode 97 so what is next?

我正在创建一组 class 对象UsingSet。根据Java中的散列概念,当我第一次添加包含“a”的对象时,它会创建一个哈希码为97的存储桶并将对象放入其中。同样,当它遇到一个带有“a”的对象时,它会调用类 UsingSet 中重写的 hashcode 方法,它会得到 hashcode 97 那么接下来是什么?

As I have not overridden equals method, the default implementation will return false. So where will be the Object with value "a" be kept, in the same bucket where the previous object with hashcode 97 kept? or will it create new bucket? anybody know how it will be stored internally?

由于我没有覆盖 equals 方法,默认实现将返回 false。那么具有值“a”的对象将被保存在哪个存储桶中,而前一个对象的哈希码为 97?还是会创建新的存储桶?有人知道它将如何在内部存储吗?

/* package whatever; // don't place package name! */

import java.util.*;
import java.lang.*;
import java.io.*;

class UsingSet {  

  String value;  

  public UsingSet(String value){  
    this.value = value;  
  }  

  public String toString() {  
    return value;  
  }  

  public int hashCode() {  
    int hash = value.hashCode();  
    System.out.println("hashcode called" + hash);  
    return hash;  
  }  

  public static void main(String args[]) {  

    java.util.Set s = new java.util.HashSet();  

    s.add(new UsingSet("A"));  
    s.add(new UsingSet("b"));  
    s.add(new UsingSet("a"));  
    s.add(new UsingSet("b"));   
    s.add(new UsingSet("a"));  

    s.add(new Integer(1));  
    s.add(new Integer(1));  

    System.out.println("s = " + s); 

  }  
}  

output is:

输出是:

hashcode called65
hashcode called98
hashcode called97
hashcode called98
hashcode called97
s = [1, b, b, A, a, a]

回答by searchengine27

James Large answer is incorrect, or rather misleading (and part incorrect as well). I will explain.

James Large 的回答是不正确的,或者说是误导(部分也不正确)。我会解释。

If two objects are equal according to their equals() method, they must also have the same hash code. If two objects have the same hash code, they do NOT have to be equal too.

如果两个对象根据它们的 equals() 方法相等,则它们也必须具有相同的哈希码。如果两个对象具有相同的哈希码,则它们也不必相等。

Here is the actual wording from the java.util.Object documentation:

这是 java.util.Object 文档中的实际措辞:

  • If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
  • It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
  • 如果根据 equals(Object) 方法两个对象相等,则对两个对象中的每一个调用 hashCode 方法必须产生相同的整数结果。
  • 如果根据 equals(java.lang.Object) 方法两个对象不相等,则不需要对两个对象中的每一个调用 hashCode 方法必须产生不同的整数结果。但是,程序员应该意识到为不相等的对象生成不同的整数结果可能会提高哈希表的性能。

It is true, that if two objects don't have the same hash then they are not equal. However, hashing is not a way to check equality - so it is wildly incorrect to say that it is a faster way to check equality.

确实,如果两个对象没有相同的哈希值,那么它们就不相等。然而,散列不是一种检查相等性的方法——所以说它是一种检查相等性的更快方法是非常不正确的。

Also, it is also wildly incorrect to say the hashCode function is an efficient way to do anything. This is all up to implementation, but the default implementation for hashCode of a string is very inefficient as the String gets large. It will perform a calculation based on each char of the String, so if you are using large Strings as keys, then this becomes very inefficient; moreso if you have a large number of buckets.

此外,说 hashCode 函数是做任何事情的有效方式也是非常不正确的。这完全取决于实现,但是随着字符串变大,字符串的 hashCode 的默认实现非常低效。它将根据字符串的每个字符执行计算,因此如果您使用大字符串作为键,那么这将变得非常低效;如果您有大量存储桶,则更是如此。

In a Map (HashSet uses a HashMap internally), there are buckets and in each bucket is a linked list. Java uses the hashCode() function to find out which bucket it belongs in (it actually will modify the hash, depending on how many buckets exist). Since two objects may share the same hash, it will iterate through the linked list sequentially next, checking the equals() method to see if the object is a duplicate. Per the java.util.Set documenation:

在 Map(HashSet 内部使用 HashMap)中,有多个桶,每个桶中有一个链表。Java 使用 hashCode() 函数找出它属于哪个桶(它实际上会修改散列,取决于存在多少个桶)。由于两个对象可能共享相同的哈希值,因此接下来将依次遍历链表,检查 equals() 方法以查看对象是否重复。根据 java.util.Set 文档:

A collection that contains no duplicate elements.

不包含重复元素的集合。

So, if its hashCode() leads it to a bucket, in which that bucket contains an Object where the .equals() evaluates to true, then the previous Object is overwritten with the new Object. You can probably view here for more information: How does a Java HashMap handle different objects with the same hash code?

因此,如果它的 hashCode() 将它引导到一个存储桶,其中该存储桶包含一个对象,其中 .equals() 的计算结果为 true,那么前一个对象将被新对象覆盖。您可以在此处查看更多信息: Java HashMap 如何处理具有相同哈希码的不同对象?

Generally speaking though, it is good practice that if you overwrite the hashCode function, you also overwrite the equals function (if I'm not mistaken, this breaks the contract if you choose not to).

不过,一般来说,如果你覆盖 hashCode 函数,你也会覆盖 equals 函数(如果我没记错的话,如果你选择不这样做,这违反了合同)是一种很好的做法。

回答by Yingjie Tang

HashCode & Equals methods

HashCode 和 Equals 方法

  1. Only Override HashCode, Use the default Equals:Only the references to the same object will return true. In other words, those objects you expected to be equal will not be equal by calling the equals method.
  2. Only Override Equals, Use the default HashCode:There might be duplicates in the HashMap or HashSet. We write the equals method and expect{"abc", "ABC"}to be equals. However, when using a HashMap, they might appear in different buckets, thus the contains()method will not detect them each other.
  1. Only Override HashCode, Use the default Equals:只有对同一个对象的引用才会返回true。换句话说,通过调用 equals 方法,您期望相等的那些对象将不相等。
  2. 仅覆盖等于,使用默认的 HashCode:HashMap 或 HashSet 中可能存在重复项。我们编写了equals方法并期望{"abc", "ABC"}是equals。但是,在使用 HashMap 时,它们可能会出现在不同的桶中,因此该contains()方法不会相互检测它们。

回答by Solomon Slow

Without looking at your code...

不看你的代码...

The whole point of hash codes is to speed up the process of testing two objects for equality. It can be costly to test whether two large, complex objects are equal, but it is trivially easy to compare their hash codes, and hash codes can be pre-computed.

哈希码的全部意义在于加快测试两个对象是否相等的过程。测试两个大而复杂的对象是否相等可能成本很高,但比较它们的哈希码非常容易,并且可以预先计算哈希码。

The rule is: If two objects don't have the same hash code, that meansthey are not equal. No need to do the expensive equality test.

规则是:如果两个对象没有相同的哈希码,则意味着它们不相等。无需进行昂贵的平等测试。

So, the answer to the question in your title: If you define an equals() method that says object A is equal to object B, and you define a hashCode() method that says object A is notequal to object B (i.e., it says they have different hash codes), and then you hand those two objects to some library that cares whether they are equal or not (e.g., if you put them in a hash table), then the behavior of the library is going to be undefined (i.e., probably wrong).

因此,标题中问题的答案是:如果您定义了一个 equals() 方法,表示对象 A 等于对象 B,并且您定义了一个 hashCode() 方法,表示对象 A等于对象 B(即,它说它们有不同的哈希码),然后你把这两个对象交给某个关心它们是否相等的库(例如,如果你把它们放在一个哈希表中),那么库的行为将是未定义(即,可能是错误的)。



Added information: Wow! I really missed seeing the forest for the trees here---thinking about the purpose of hashCode() without putting it in the context of HashMap. If m is a Map with N entries, and k is a key; what is the purpose of calling m.get(k)? The purpose, obviously, is to search the map for an entry whose key is equalto k.

补充资料:哇!我真的很想念这里只见树木不见森林——思考 hashCode() 的目的,而没有把它放在 HashMap 的上下文中。如果 m 是一个有 N 个条目的 Map,而 k 是一个键;打电话的目的是m.get(k)什么?显然,目的是在映射中搜索键等于k的条目。

What if hash codes and hash maps had not been invented? Well the best you could do, assuming that the keys have a natural, total order, is to search a TreeMap, comparing the given key for equality with O(log(N)) other keys. In the worst case, where the keys have no order, you would have to compare the given key for equality with every key in the map until you either find a match or tested them all. In other words, the complexity of m.get(k)would be O(N).

如果没有发明散列码和散列图会怎样?假设键具有自然的总顺序,您可以做的最好的事情就是搜索 TreeMap,将给定的键与 O(log(N)) 其他键的相等性进行比较。在最坏的情况下,键没有顺序,您必须将给定的键与映射中的每个键进行相等性比较,直到找到匹配项或对它们全部进行测试。换句话说, 的复杂度m.get(k)将是 O(N)。

When m is a HashMap, the complexity of m.get(k)is O(1), whether the keys can be ordered or not.

当 m 是一个 HashMap 时,其复杂度m.get(k)是 O(1),键是否可以排序。

So, I messed up by saying that the point of hash codes was to speed up the process of testing twoobjects for equality. It's really about testing an object for equality with a whole collectionof other objects. That's where comparing hash codes doesn't just help a little; It helps by orders of magnitude...

所以,我说散列码的目的是加快测试两个对象是否相等的过程,我搞砸了。这实际上是关于测试一个对象与其他对象的整个集合是否相等。这就是比较哈希码不仅有一点帮助的地方;它有几个数量级的帮助......

...Ifthe k.hashCode()and k.equals(o)methods obey the rule: j.hashCode()!=k.hashCode()implies !j.equals(k).

...如果k.hashCode()k.equals(o)方法服从规则: j.hashCode()!=k.hashCode()暗示!j.equals(k)

回答by Siva Kumar

Set will behave differently.

Set 的行为会有所不同。

Uniqueness wont happen. Because unique will be achieved by both hashcode and equals methods.
output will be liked this s = [A, a, b, 1] instead of early one.

独特性不会发生。因为唯一性将通过 hashcode 和 equals 方法实现。
输出会喜欢这个 s = [A, a, b, 1] 而不是早期的。

Apart that remove and contains all wont work.

除了删除和包含所有内容都行不通。

回答by Hasnain Ali Bohra

Simply you can Assume hashcode and equals methods as a 2D search like:-

简单地,您可以将 hashcode 和 equals 方法假设为 2D 搜索,例如:-

Where Hashcode is the Rows and the object list is the Column. Consider the following class structure.

其中哈希码是行,对象列表是列。考虑以下类结构。

public class obj
  {
    int Id;
    String name;
    public obj(String name,int id)
     {
         this.id=id;
         this.name=name;
     }
  }

now if you create the objects like this:-

现在,如果您创建这样的对象:-

obj obj1=new obj("Hassu",1);
obj obj2=new obj("Hoor",2);
obj obj3=new obj("Heniel",3);
obj obj4=new obj("Hameed",4);
obj obj5=new obj("Hassu",1);

and you place this objects in map like this :-

你把这个对象放在地图上是这样的:-

    HashMap hMap=new HashMap();
   1. hMap.put(obj1,"value1");
   2. hMap.put(obj2,"value2");
   3. hMap.put(obj3,"value3");
   4. hMap.put(obj4,"value4");
   5. hMap.put(obj5,"value5");

now if you have not override the hashcode and equals then after putting all the objects till line 5 if you put obj5 in the map as By Default HashCode you get different hashCode so the row(Bucket will be different). So in runtime memory it will be stored like this.

现在,如果您没有覆盖哈希码并等于,那么在将所有对象放置到第 5 行之后,如果您将 obj5 作为默认哈希码放在地图中,您将获得不同的哈希码,因此行(存储桶将不同)。所以在运行时内存中,它将像这样存储。

|hashcode   | Objects   
|-----------| --------- 
|000562     | obj1 
|000552     | obj2 
|000588     | obj3
|000546     | obj4
|000501     | obj5

Now if you create the same object Like :- obj obj6 = new obj("hassu",1); And if you search for this value in the map.like

现在,如果您创建相同的对象,例如:- obj obj6 = new obj("hassu",1); 如果你在 map.like 中搜索这个值

if(hMap.conaints(obj6)) 
or 
hMpa.get(obj 6);

though the key(obj1) with the same content is available you will get false and null respectively. Now if you override only equals method. and perform the same content search key will also get the Null as the HashCode for obj6 is different and in that hashcode you wont find any key. Now if you override only hashCode method.

尽管具有相同内容的 key(obj1) 可用,但您将分别获得 false 和 null。现在,如果您只覆盖 equals 方法。并执行相同的内容搜索键也将获得 Null,因为 obj6 的 HashCode 不同,并且在该哈希码中您找不到任何键。现在,如果您只覆盖 hashCode 方法。

You will get the same bucket (HashCode row) but the content cant be checked and it will take the reference checked implementation by Super Object Class. SO here if you search for the key hMap.get(obj6) you will get the correct hashcode:- 000562 but as the reference for both obj1 and obj6 is different you will get null.

您将获得相同的存储桶(HashCode 行),但无法检查内容,它将采用 Super Object Class 的引用检查实现。所以在这里,如果您搜索键 hMap.get(obj6) 您将获得正确的哈希码:- 000562 但由于 obj1 和 obj6 的引用不同,您将获得空值。