java Java是如何实现哈希表的?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/1647221/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
How does Java implement hash tables?
提问by escalon
Does anyone know how Java implements its hash tables (HashSet or HashMap)? Given the various types of objects that one may want to put in a hash table, it seems very difficult to come up with a hash function that would work well for all cases.
有谁知道 Java 如何实现其哈希表(HashSet 或 HashMap)?考虑到人们可能想要放入哈希表的各种类型的对象,想出一个适用于所有情况的哈希函数似乎非常困难。
回答by sinuhepop
HashMap and HashSet are very similar. In fact, the second contains an instance of the first.
HashMap 和 HashSet 非常相似。事实上,第二个包含第一个的实例。
A HashMap contains an array of buckets in order to contain its entries. Array size is always powers of 2. If you don't specify another value, initially there are 16buckets.
HashMap 包含一个桶数组以包含其条目。数组大小始终是 2 的幂。如果您不指定其他值,最初有16 个存储桶。
When you put an entry (key and value) in it, it decides the bucket where the entry will be inserted calculating it from its key's hashcode (hashcode is not its memory address, and the the hash is not a modulus). Different entries can collide in the same bucket, so they'll be put in a list.
当您将条目(键和值)放入其中时,它会根据键的哈希码(哈希码不是其内存地址,并且哈希不是模数)来确定将插入条目的存储桶。不同的条目可以在同一个存储桶中发生冲突,因此它们将被放入一个列表中。
Entries will be inserted until they reach the load factor. This factor is 0.75by default, and is not recommended to change it if you are not very sure of what you're doing. 0.75 as load factor means that a HashMap of 16buckets can only contain 12entries (16*0.75). Then, an array of buckets will be created, doubling the size of the previous. All entries will be put again in the new array. This process is known as rehashing, and can be expensive.
条目将被插入,直到它们达到负载因子。默认情况下,此系数为0.75,如果您不太确定自己在做什么,则不建议更改它。0.75 作为负载因子意味着16 个桶的 HashMap只能包含12 个条目(16*0.75)。然后,将创建一个桶数组,将前一个桶的大小加倍。所有条目将再次放入新数组中。这个过程被称为rehashing,并且可能很昂贵。
Therefore, a best practice, if you know how many entries will be inserted, is to construct a HashMap specifying its final size:
因此,如果您知道将插入多少条目,最佳实践是构造一个指定其最终大小的 HashMap:
new HashMap(finalSize);
回答by Jim Garrison
Java depends on each class' implementation of the hashCode() method to distribute the objects evenly. Obviously, a bad hashCode() method will result in performance problems for large hash tables. If a class does not provide a hashCode() method, the default in the current implementation is to return some function (i.e. a hash) of the the object's address in memory. Quoting from the API doc:
Java 依赖于每个类的 hashCode() 方法的实现来均匀分布对象。显然,一个糟糕的 hashCode() 方法会导致大型哈希表的性能问题。如果一个类不提供 hashCode() 方法,当前实现中的默认值是返回内存中对象地址的某个函数(即散列)。引用 API 文档:
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the JavaTM programming language.)
就合理实用而言,类 Object 定义的 hashCode 方法确实为不同的对象返回不同的整数。(这通常是通过将对象的内部地址转换为整数来实现的,但是 JavaTM 编程语言不需要这种实现技术。)
回答by brianegge
There are two general ways to implement a HashMap. The difference is how one deals with collisions.
有两种实现 HashMap 的通用方法。不同之处在于处理冲突的方式。
The first method, which is the one Java users, makes every bucket in a the HashMap contain a singly linked list. To accomplish this, each bucket contains an Entrytype, which caches the hashCode, has a pointer to the key, pointer to the value, and a pointer to the next entry. When a collision occurs in Java, another entry is added to the list.
第一种方法是 Java 用户,它使 HashMap 中的每个桶都包含一个单向链表。为了实现这一点,每个存储桶都包含一个Entry类型,它缓存 hashCode,有一个指向键的指针、指向值的指针和指向下一个条目的指针。当 Java 中发生冲突时,另一个条目会添加到列表中。
The other method for handling collisions, is to simply put the item into the next empty bucket. The advantage of this method is it requires less space, however, it complicates removals, as if the bucket following the removed item is not empty, one has to check to see if that item is in the right or wrong bucket, and shift the item if it originally has collided with the item being removed.
处理碰撞的另一种方法是简单地将项目放入下一个空桶中。这种方法的优点是占用空间小,但是移除起来比较复杂,就好像被移除项后面的桶不是空的,必须检查该项目是在正确的还是错误的桶中,然后移动该项目如果它最初与被移除的物品发生碰撞。
回答by OscarRyz
In my own words:
用我自己的话:
An Entryobject is created to hold the reference of the Key and Value.
Entry创建一个对象来保存键和值的引用。
The HashMap has an array of Entry's.
HashMap 有一个Entry's数组。
The index for the given entry is the hash returned by key.hashCode()
给定条目的索引是由返回的哈希 key.hashCode()
If there is a collision ( two keys gave the same index ) , the entry is stored in the .nextattribute of the existing entry.
如果发生冲突(两个键给出了相同的索引),则条目存储在.next现有条目的属性中。
That's how two objects with the same hash could be stored into the collection.
这就是具有相同散列的两个对象可以存储到集合中的方式。
From thisanswer we get:
从这个答案我们得到:
public V get(Object key) {
if (key == null)
return getForNullKey();
int hash = hash(key.hashCode());
for (Entry<K,V> e = table[indexFor(hash, table.length)];
e != null;
e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return e.value;
}
return null;
}
Let me know if I got something wrong.
如果我有什么问题,请告诉我。

