java 从头开始创建哈希表?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/7919532/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-10-30 22:01:30  来源:igfitidea点击:

Creating a Hashtable from scratch?

javahashtable

提问by Bill

I am writing a program that will hash words from a document along with their frequency of use and line numbers. I thought that I had finished it when I was told that you have to create a hash table from scratch. I do not know where to begin. Any suggestions as to where and how to start would be appreciated.

我正在编写一个程序,它将对文档中的单词及其使用频率和行号进行哈希处理。当我被告知必须从头开始创建哈希表时,我以为我已经完成了。我不知道从哪里开始。任何关于从哪里和如何开始的建议将不胜感激。

回答by Hyman Edmonds

A hash table is an important and fundamental data structure. You can read more about it at Wikipedia's Hash table article. Fortunately, they are fairly easy to implement.

哈希表是一种重要且基本的数据结构。您可以在Wikipedia 的哈希表文章 中阅读有关它的更多信息。幸运的是,它们很容易实现。

Basically a hash table is a data structure that takes a key and returns or stores a value for that key.

哈希表基本上是一种数据结构,它接受一个键并返回或存储该键的值。

At their core, they are usually implemented using an array, which we shall call arrsuch that key-value pairs are stored at arr[key.hashCode()%arr.length]. Notice that since your array is not of infinite length, and that hashCodeis not guaranteed to produce unique values you will eventually end up with keys that map to the same index of the array. This is called a collision.

在它们的核心中,它们通常使用一个数组来实现,我们将称其arr为键值对存储在arr[key.hashCode()%arr.length]。请注意,由于您的数组不是无限长的,并且hashCode不能保证生成唯一值,因此您最终会得到映射到数组相同索引的键。这称为碰撞

One way of resolving these collisions is to store a linked list for each member of arr. Then the definition of arrwould look like this

解决这些冲突的一种方法是为 的每个成员存储一个链表arr。那么的定义arr看起来像这样

LinkedList<Object> arr[];

All objects which map to arr[key.hashCode()%arr.length]will be added to the list at that position. When you want to retrieve an object from the hashtable, jump to the linked list found at arr[key.hashCode()%arr.length]and iterate through each member of the linked list until you find a key-value pair where the keys are .equals.

映射到的所有对象arr[key.hashCode()%arr.length]都将添加到该位置的列表中。当你想从哈希表中检索一个对象时,跳转到找到的链表arr[key.hashCode()%arr.length]并遍历链表的每个成员,直到找到键为 的键值对.equals

A good hash table implementation might do things like resize arronce it gets too full.

一个好的哈希表实现可能会在arr它变得太满时执行诸如调整大小之类的事情。

回答by ewok

To create a hashtable, You need to keep in mind the structure of what it is that you want to build. A hashtable has 2 elements, a key set and a value set. Think about how you would represent these and how you would make sure that the keys always map to the appropriate value.

要创建哈希表,您需要记住要构建的结构。哈希表有 2 个元素,一个键集和一个值集。考虑如何表示这些以及如何确保键始终映射到适当的值。

Next, think about your hash function. Remember that There is no "correct" hash function. You just need to come up with one that works well for you.

接下来,考虑您的哈希函数。请记住,没有“正确”的哈希函数。你只需要想出一个适合你的。

Once these two things are taken care of, the rest is simple. Writing methods to work on the two sets are basically just a matter of cleverly using the hash function.

这两件事搞定了,剩下的就简单了。编写在这两个集合上工作的方法基本上只是巧妙地使用哈希函数的问题。

Good Luck!

祝你好运!

Extra hint: Remember that the value set of the hashtable is not necessarily the same size as the key set. Remember the concept of buckets. That is very important.

额外提示:请记住,哈希表的值集不一定与键集的大小相同。记住桶的概念。这是非常重要的。

回答by Steve J

Start with studying the HashMapJava class. Notice that it is an extension of the AbstractMap, and that AbstractMap provides a lot of basic capabilities you want. Not all of them, however. The most important method, put(), always throws UnsupportedOperationException. That is by design. To make it a MyHashMap, you need to override this method. Inside of put() will be where you will do all your hashy stuff referred to in Hyman's great answer.

从学习HashMapJava 类开始。请注意,它是AbstractMap的扩展,并且 AbstractMap 提供了许多您想要的基本功能。然而,并非全部。最重要的方法 put() 总是抛出 UnsupportedOperationException。那是设计使然。要使其成为 MyHashMap,您需要覆盖此方法。在 put() 内部,您将执行Hyman's great answer 中提到的所有杂乱无章的事情。

public class MyHashMap<K,V> extends AbstractMap<K,V> implements Map<K,V> {

  // your implementation goes in here, including the override of put()

}

By doing it this way, you can test your implementation more easily. Run a series of tests in which you provide a HashMap as a parameter. Then run the same code with an object of your MyHashMap. Since both are implementations of Map, you can substitute one for the other without changing any code, and you'll be confident about what your tests are telling you.

通过这样做,您可以更轻松地测试您的实现。运行一系列测试,在其中提供 HashMap 作为参数。然后使用 MyHashMap 的对象运行相同的代码。由于两者都是 Map 的实现,您可以在不更改任何代码的情况下替换另一个,并且您将对测试告诉您的内容充满信心。

回答by blackcompe

Just use an array for your table. You'll need a hash function, and a load factor. The load factor will determine when you resize the table to keep it efficient. A simple hash function for a string is to sum up each character's ascii value multiplied by 2^(character index) and mod that sum by the table length. e.g.

只需为您的表使用一个数组。您将需要一个哈希函数和一个负载因子。负载因子将决定您何时调整表的大小以保持其高效。一个简单的字符串散列函数是将每个字符的 ascii 值乘以 2^(字符索引)和 mod 相加,然后再乘以表长度。例如

static int hash(String s) {
    int len = 30;
    int sum = 0;
    for(int i = 0; i < s.length(); i++) {
        sum += ((int)s.charAt(i)) * (1<<i);
    }
    return sum%len;
}

You could use String.hashCodetoo. To resolve collisions the simplest method to use is linear probing, but clustering will occur quickly. You can implement double hashing for better performance.

你也可以用String.hashCode。要解决冲突,最简单的方法是线性探测,但聚类会很快发生。您可以实现双重散列以获得更好的性能。