java中hashmap数据结构的实现
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/22215353/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Implementation of hashmap data structure in java
提问by astack
I am having one technical test , problem statement is given below, i am not getting what exactly i have to do, please help me for the same by providing some sample code.
我正在进行一项技术测试,问题陈述如下,我没有得到我必须做的事情,请通过提供一些示例代码来帮助我。
Implement a hash-map data structure from scratch , where both key and value are of string data type.
The details of how a hash works can be found in Chapter 11 of the book "Introduction to
Algorithms" by Cormen, Leiserson, Rivest. You should also refer to Section 3.7 of "The
Algorithm Design Manual" by Steven Skiena. For the required hash-map implementation,
the following conditions hold true:
1. The key is made up of lower-case english alphabets only (a,b,c...z). It can be of any
length.
2. Values are of string data type.
3. The hash function to be used is the one given in Section 3.7 of Skiena.
4. Choose a suitable size of the hash-map, that is, the number of buckets. It should be
greater than 100.
4. Collisions will be resolved using Chaining. Doubly linked lists will be used to store
colliding entries.
You will have to implement the following operations on the hash-map:
a. Create an empty hash-map.
b. Insert a (key, value) pair into the hash-map.
c. Delete a (key) from the hash-map (if present).
d. Search for a (key) in the hash-map, and if present return its value. Else return null.
Thanks in advance.
回答by Solace
I believe that it's going to be more helpful if I explain HashMaps in English.
我相信如果我用英语解释 HashMaps 会更有帮助。
What is a HashMap?
什么是哈希映射?
A HashMap is a data structure that is able to map certain keys to certain values. The keys and values could be anything. For example, if I were making a game, I might link every username to a friends list, represented by a List of Strings.
HashMap 是一种能够将某些键映射到某些值的数据结构。键和值可以是任何东西。例如,如果我正在制作一个游戏,我可能会将每个用户名链接到一个朋友列表,由一个字符串列表表示。
Why use a HashMap?
为什么要使用 HashMap?
HashMaps are much faster for retreiving data than arrays and linked lists. A sorted array could find a particular value in O(log n) with binary search. However, a HashMap can check if it contains a particular key in O(1). All keys must be unique.
HashMap 在检索数据方面比数组和链表快得多。排序的数组可以通过二进制搜索在 O(log n) 中找到特定值。但是,HashMap 可以检查它是否包含 O(1) 中的特定键。所有键都必须是唯一的。
How do HashMaps work?
HashMap 是如何工作的?
HashMaps use an array in the background. Each element in the array is another data structure (usually a linked list or binary search tree). The HashMap uses a function on the key to determine where to place the key's value in the array. For example, if my HashMap accepts Strings...possible hash functions can be:
HashMaps 在后台使用一个数组。数组中的每个元素都是另一种数据结构(通常是链表或二叉搜索树)。HashMap 使用键上的函数来确定将键的值放置在数组中的位置。例如,如果我的 HashMap 接受字符串......可能的哈希函数可以是:
A. Return the ASCII value of the first letter.
B. Return the sum of the ASCII values of every character in the String.
C. Return the ASCII value of the last character in the String.
The value returned will determine the index in which the value goes into the array.
返回的值将决定该值进入数组的索引。
But Wait! There's a Problem!
可是等等!有问题!
It is possible that the returned value will be outside of the array's bounds. Therefore, we are supposed to mod the returned value by the arrays length.
返回值可能会超出数组的边界。因此,我们应该通过数组长度对返回值进行修改。
return Math.abs(number%hashMapArray.length);
Collisions:
碰撞:
Isn't it possible that multiple keys will make the hash function generate the same index? Yes. For example, if we used the first hash function shown above in a hash map of strings...any two Strings that start with the same letter will be given the same array index.
是不是有可能多个键会使散列函数生成相同的索引?是的。例如,如果我们在字符串的哈希映射中使用上面显示的第一个哈希函数……以相同字母开头的任何两个字符串都将被赋予相同的数组索引。
This is called a collision.
这称为碰撞。
How do we handle collisions?
我们如何处理碰撞?
One collision handling technique is called Chaining. Since every element in the array is a linked list (or similar data structure), multiple keys that have the same hash value will be placed in the same linked list or "bucket". Afterwards, the hash map is able to retrieve values by calculating the hash code with the hash function, and searching the particular linked list to see if it contains a value with the same key.
一种碰撞处理技术称为链接。由于数组中的每个元素都是一个链表(或类似的数据结构),具有相同哈希值的多个键将被放置在同一个链表或“桶”中。之后,哈希映射能够通过使用哈希函数计算哈希码并搜索特定链表以查看它是否包含具有相同键的值来检索值。
A good hash function must be written to avoid collisions.
必须编写一个好的散列函数以避免冲突。
Advantages to Chaining:
链式的优点:
-Array cannot overflow
-数组不能溢出
-Data can be easily removed
- 数据可以轻松删除
Disadvantages to Chaining:
链式的缺点:
-May suffer a performance hit if buckets contain very long linked lists.
- 如果桶包含很长的链表,可能会影响性能。
The total number of entries to the number of buckets is called the load factor. If the load factor is too low, a lot of space is wasted. If the load factor is too high, the advantage of hashing is lost. A good compromise on load factor is .75
条目总数与桶数之比称为负载因子。如果负载因子太低,则会浪费大量空间。如果负载因子太高,散列的优势就失去了。负载系数的一个很好的折衷是 0.75