C/C++中的哈希表实现
C/C++(关联数组)中的哈希表是一种将键映射到值的数据结构。
这使用哈希函数来计算键的索引。
基于哈希表索引,我们可以将值存储在适当的位置。
如果两个不同的键获得相同的索引,则需要使用其他数据结构(存储桶)来解决这些冲突。
使用哈希表的全部好处在于访问时间非常快。
虽然可能会发生冲突,但是如果我们选择一个很好的哈希函数,则该机会几乎为零。
因此,平均而言,时间复杂度是常数O(1)访问时间。
这称为摊销时间复杂度。
C++ STL(标准模板库)具有" std :: unordered_map()"数据结构,该数据结构实现了所有这些哈希表功能。
但是,了解如何从头开始构建哈希表是一项至关重要的技能,而这正是我们旨在向您展示的内容。
任何哈希表实现都具有以下三个组成部分:
良好的哈希函数,可将键映射到值
支持插入,搜索和删除操作的哈希表数据结构。
解释键冲突的数据结构
选择哈希函数
第一步是选择冲突可能性较低的合理的哈希函数。
但是,由于这只是为了说明,所以我会做相反的事情!逆向心理学,是吗?
在本文中,我们将仅处理字符串(或者C中的字符数组)。
我将使用一个非常简单的哈希函数,该函数简单地将字符串的ASCII值求和。
我正在使用它来向您展示我们将如何处理冲突案例。
#define CAPACITY 50000 //Size of the Hash Table
unsigned long hash_function(char* str) {
unsigned long i = 0;
for (int j=0; str[j]; j++)
i += str[j];
return i % CAPACITY;
}
您可以针对不同的字符串进行测试,并检查它们是否冲突。
例如,字符串" Hel"和" Cau"将冲突,因为它们具有相同的ASCII值。
注意:我们必须返回哈希表容量之内的数字。
否则,我们可能会访问未绑定的内存位置,从而导致错误。
定义哈希表数据结构
哈希表是一组项目,它们本身是{key:value}对。
让我们先定义项目结构。
typedef struct Ht_item Ht_item;
//Define the Hash Table Item here
struct Ht_item {
char* key;
char* value;
};
现在,哈希表具有一个指向" Ht_item"的指针数组,因此它是一个双指针。
除此之外,我们还将使用count来跟踪Hash表中的元素数量,并将表的大小存储在size中。
typedef struct HashTable HashTable;
//Define the Hash Table here
struct HashTable {
//Contains an array of pointers
//to items
Ht_item** items;
int size;
int count;
};
创建哈希表及其项
我们需要函数来在内存中创建一个新的哈希表,并创建其项。
首先创建一个项目。
这非常简单,因为我们只需要为其键和值分配内存并返回指向该项的指针。
Ht_item* create_item(char* key, char* value) {
//Creates a pointer to a new hash table item
Ht_item* item = (Ht_item*) malloc (sizeof(Ht_item));
item->key = (char*) malloc (strlen(key) + 1);
item->value = (char*) malloc (strlen(value) + 1);
strcpy(item->key, key);
strcpy(item->value, value);
return item;
}
现在,让我们编写用于创建表格的代码。
这将为包装器结构" HashTable"分配内存,并将其所有项目设置为" NULL"(因为未使用它们)。
HashTable* create_table(int size) {
//Creates a new HashTable
HashTable* table = (HashTable*) malloc (sizeof(HashTable));
table->size = size;
table->count = 0;
table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*));
for (int i=0; i<table->size; i++)
table->items[i] = NULL;
return table;
}
现在,我们几乎完成了这部分。
作为C/C++程序员,您有责任使用malloc(),calloc()释放已在堆上分配的内存。
因此,让我们编写一些函数,这些函数可以释放表项以及整个表。
void free_item(Ht_item* item) {
//Frees an item
free(item->key);
free(item->value);
free(item);
}
void free_table(HashTable* table) {
//Frees the table
for (int i=0; i<table->size; i++) {
Ht_item* item = table->items[i];
if (item != NULL)
free_item(item);
}
free(table->items);
free(table);
}
现在,我们已经完成了构建功能哈希表的基础工作。
现在让我们开始编写我们的insert(),search()和delete()方法。
插入哈希表
我们将创建一个函数ht_insert()来为我们执行此任务。
这需要一个"哈希表"指针,一个"键"和一个"值"作为参数。
void ht_insert(HashTable* table, char* key, char* value);
现在,插入功能涉及某些步骤。
- 根据{key:value}对创建项目
- 基于哈希函数计算索引
- 通过比较键检查索引是否已被占用,如果没有被占用。
我们可以直接将其插入index - 否则,这是一次碰撞,我们需要处理它
我将说明创建初始模型后如何处理碰撞,因此让我们稍等一下。
第一步很简单。
我们直接调用create_item(key,value)。
int index = hash_function(key);
第二步和第三步使用" hash_function(key)"来获取索引。
如果是第一次插入密钥,则该项必须为" NULL"。
否则,确切的key:value对已经存在,或者是冲突。
在这种情况下,我们将定义另一个函数handle_collision(),顾名思义,它将为我们处理这种潜在的碰撞。
//Create the item
Ht_item* item = create_item(key, value);
//Compute the index
int index = hash_function(key);
Ht_item* current_item = table->items[index];
if (current_item == NULL) {
//Key does not exist.
if (table->count == table->size) {
//Hash Table Full
printf("Insert Error: Hash Table is full\n");
free_item(item);
return;
}
//Insert directly
table->items[index] = item;
table->count++;
}
让我们考虑第一种情况,其中存在key:value对(即,之前已插入相同的项目)。
在这种情况下,我们必须仅将项目值更新为新值。
if (current_item == NULL) {
....
....
}
else {
//Scenario 1: We only need to update value
if (strcmp(current_item->key, key) == 0) {
strcpy(table->items[index]->value, value);
return;
}
else {
//Scenario 2: Collision
//We will handle case this a bit later
handle_collision(table, item);
return;
}
}
好的,所以我们的插入函数(没有冲突)现在看起来像这样:
void handle_collision(HashTable* table, Ht_item* item) {
}
void ht_insert(HashTable* table, char* key, char* value) {
//Create the item
Ht_item* item = create_item(key, value);
Ht_item* current_item = table->items[index];
if (current_item == NULL) {
//Key does not exist.
if (table->count == table->size) {
//Hash Table Full
printf("Insert Error: Hash Table is full\n");
return;
}
//Insert directly
table->items[index] = item;
table->count++;
}
else {
//Scenario 1: We only need to update value
if (strcmp(current_item->key, key) == 0) {
strcpy(table->items[index]->value, value);
return;
}
else {
//Scenario 2: Collision
//We will handle case this a bit later
handle_collision(table, item);
return;
}
}
}
在哈希表中搜索项目
如果要检查插入是否正确完成,则还需要定义一个搜索功能,该功能检查键是否存在,如果存在则返回相应的值。
char* ht_search(HastTable* table, char* key);
逻辑很简单。
它只是移至非NULL项并比较键。
否则,我们将返回NULL
char* ht_search(HashTable* table, char* key) {
//Searches the key in the hashtable
//and returns NULL if it doesn't exist
int index = hash_function(key);
Ht_item* item = table->items[index];
//Ensure that we move to a non NULL item
if (item != NULL) {
if (strcmp(item->key, key) == 0)
return item->value;
}
return NULL;
}
测试我们的基本模型
让我们通过编写一个" main()"驱动程序来检查我们编写的内容是否正确。
我们将添加另一个函数print_table()来打印哈希表,以供说明。
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define CAPACITY 50000 //Size of the Hash Table
unsigned long hash_function(char* str) {
unsigned long i = 0;
for (int j=0; str[j]; j++)
i += str[j];
return i % CAPACITY;
}
typedef struct Ht_item Ht_item;
//Define the Hash Table Item here
struct Ht_item {
char* key;
char* value;
};
typedef struct HashTable HashTable;
//Define the Hash Table here
struct HashTable {
//Contains an array of pointers
//to items
Ht_item** items;
int size;
int count;
};
Ht_item* create_item(char* key, char* value) {
//Creates a pointer to a new hash table item
Ht_item* item = (Ht_item*) malloc (sizeof(Ht_item));
item->key = (char*) malloc (strlen(key) + 1);
item->value = (char*) malloc (strlen(value) + 1);
strcpy(item->key, key);
strcpy(item->value, value);
return item;
}
HashTable* create_table(int size) {
//Creates a new HashTable
HashTable* table = (HashTable*) malloc (sizeof(HashTable));
table->size = size;
table->count = 0;
table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*));
for (int i=0; i<table->size; i++)
table->items[i] = NULL;
return table;
}
void free_item(Ht_item* item) {
//Frees an item
free(item->key);
free(item->value);
free(item);
}
void free_table(HashTable* table) {
//Frees the table
for (int i=0; i<table->size; i++) {
Ht_item* item = table->items[i];
if (item != NULL)
free_item(item);
}
free(table->items);
free(table);
}
void handle_collision(HashTable* table, unsigned long index, Ht_item* item) {
}
void ht_insert(HashTable* table, char* key, char* value) {
//Create the item
Ht_item* item = create_item(key, value);
//Compute the index
unsigned long index = hash_function(key);
Ht_item* current_item = table->items[index];
if (current_item == NULL) {
//Key does not exist.
if (table->count == table->size) {
//Hash Table Full
printf("Insert Error: Hash Table is full\n");
//Remove the create item
free_item(item);
return;
}
//Insert directly
table->items[index] = item;
table->count++;
}
else {
//Scenario 1: We only need to update value
if (strcmp(current_item->key, key) == 0) {
strcpy(table->items[index]->value, value);
return;
}
else {
//Scenario 2: Collision
//We will handle case this a bit later
handle_collision(table, index, item);
return;
}
}
}
char* ht_search(HashTable* table, char* key) {
//Searches the key in the hashtable
//and returns NULL if it doesn't exist
int index = hash_function(key);
Ht_item* item = table->items[index];
//Ensure that we move to a non NULL item
if (item != NULL) {
if (strcmp(item->key, key) == 0)
return item->value;
}
return NULL;
}
void print_search(HashTable* table, char* key) {
char* val;
if ((val = ht_search(table, key)) == NULL) {
printf("Key:%s does not exist\n", key);
return;
}
else {
printf("Key:%s, Value:%s\n", key, val);
}
}
void print_table(HashTable* table) {
printf("\nHash Table\n-------------------\n");
for (int i=0; i<table->size; i++) {
if (table->items[i]) {
printf("Index:%d, Key:%s, Value:%s\n", i, table->items[i]->key, table->items[i]->value);
}
}
printf("-------------------\n\n");
}
int main() {
HashTable* ht = create_table(CAPACITY);
ht_insert(ht, "1", "First address");
ht_insert(ht, "2", "Second address");
print_search(ht, "1");
print_search(ht, "2");
print_search(ht, "3");
print_table(ht);
free_table(ht);
return 0;
}
输出
Key:1, Value:First address Key:2, Value:Second address Key:3 does not exist Hash Table ------------------ Index:49, Key:1, Value:First address Index:50, Key:2, Value:Second address ------------------
这似乎按预期工作,所以现在让我们继续处理冲突。
处理碰撞
有多种解决冲突的方法。
我们将研究一种称为"独立链接"的方法,该方法旨在为具有相同哈希索引的所有项目创建独立的链。
我们将使用链接列表创建这些链。
每当发生碰撞时,我们都会添加其他项目,这些项目会在"溢出存储桶列表"的同一索引上发生碰撞。
因此,我们将不必删除哈希表上的任何现有记录。
由于链表在插入,搜索和删除时具有O(n)时间复杂度,因此在发生冲突的情况下,我们也将具有O(n)的最坏情况访问时间。
此方法的优点是,如果您的哈希表的容量较低,则是一个不错的选择。
使用链表进行单独链接-维基百科
涵盖了这一点,让我们开始实施我们的旧链接列表!
typedef struct LinkedList LinkedList;
//Define the Linkedlist here
struct LinkedList {
Ht_item* item;
LinkedList* next;
};
LinkedList* allocate_list () {
//Allocates memory for a Linkedlist pointer
LinkedList* list = (LinkedList*) malloc (sizeof(LinkedList));
return list;
}
LinkedList* linkedlist_insert(LinkedList* list, Ht_item* item) {
//Inserts the item onto the Linked List
if (!list) {
LinkedList* head = allocate_list();
head->item = item;
head->next = NULL;
list = head;
return list;
}
else if (list->next == NULL) {
LinkedList* node = allocate_list();
node->item = item;
node->next = NULL;
list->next = node;
return list;
}
LinkedList* temp = list;
while (temp->next->next) {
temp = temp->next;
}
LinkedList* node = allocate_list();
node->item = item;
node->next = NULL;
temp->next = node;
return list;
}
Ht_item* linkedlist_remove(LinkedList* list) {
//Removes the head from the linked list
//and returns the item of the popped element
if (!list)
return NULL;
if (!list->next)
return NULL;
LinkedList* node = list->next;
LinkedList* temp = list;
temp->next = NULL;
list = node;
Ht_item* it = NULL;
memcpy(temp->item, it, sizeof(Ht_item));
free(temp->item->key);
free(temp->item->value);
free(temp->item);
free(temp);
return it;
}
void free_linkedlist(LinkedList* list) {
LinkedList* temp = list;
while (list) {
temp = list;
list = list->next;
free(temp->item->key);
free(temp->item->value);
free(temp->item);
free(temp);
}
}
现在,我们需要将这些"溢出存储桶"列表添加到哈希表中。
我们希望每个项目都有一个这样的链,因此对于整个表,它是LinkedList指针的数组。
typedef struct HashTable HashTable;
//Define the Hash Table here
struct HashTable {
//Contains an array of pointers
//to items
Ht_item** items;
LinkedList** overflow_buckets;
int size;
int count;
};
现在,我们定义了overflow_buckets,我们将添加函数来创建和删除它们。
我们还需要在旧的create_table()和free_table()函数中考虑它们。
LinkedList** create_overflow_buckets(HashTable* table) {
//Create the overflow buckets; an array of linkedlists
LinkedList** buckets = (LinkedList**) calloc (table->size, sizeof(LinkedList*));
for (int i=0; i<table->size; i++)
buckets[i] = NULL;
return buckets;
}
void free_overflow_buckets(HashTable* table) {
//Free all the overflow bucket lists
LinkedList** buckets = table->overflow_buckets;
for (int i=0; i<table->size; i++)
free_linkedlist(buckets[i]);
free(buckets);
}
HashTable* create_table(int size) {
//Creates a new HashTable
HashTable* table = (HashTable*) malloc (sizeof(HashTable));
table->size = size;
table->count = 0;
table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*));
for (int i=0; i<table->size; i++)
table->items[i] = NULL;
table->overflow_buckets = create_overflow_buckets(table);
return table;
}
void free_table(HashTable* table) {
//Frees the table
for (int i=0; i<table->size; i++) {
Ht_item* item = table->items[i];
if (item != NULL)
free_item(item);
}
//Free the overflow bucket linked linkedlist and it's items
free_overflow_buckets(table);
free(table->items);
free(table);
}
!我们还有工作要做。
现在,转到" handle_collision()"函数。
这里有两种情况。
如果该项目的溢出存储区列表不存在,我们需要创建一个这样的列表并将该项目添加到其中。
否则,我们可以简单地将项目插入列表
void handle_collision(HashTable* table, unsigned long index, Ht_item* item) {
LinkedList* head = table->overflow_buckets[index];
if (head == NULL) {
//We need to create the list
head = allocate_list();
head->item = item;
table->overflow_buckets[index] = head;
return;
}
else {
//Insert to the list
table->overflow_buckets[index] = linkedlist_insert(head, item);
return;
}
}
插入完成了,但是现在,我们还需要更新搜索功能,因为我们可能还需要查看溢出存储桶。
char* ht_search(HashTable* table, char* key) {
//Searches the key in the hashtable
//and returns NULL if it doesn't exist
int index = hash_function(key);
Ht_item* item = table->items[index];
LinkedList* head = table->overflow_buckets[index];
//Ensure that we move to items which are not NULL
while (item != NULL) {
if (strcmp(item->key, key) == 0)
return item->value;
if (head == NULL)
return NULL;
item = head->item;
head = head->next;
}
return NULL;
}
最后,我们考虑了" insert()"和" search()"中的冲突!
为了提醒您代码的当前状态,我将完整的程序发布到现在。
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define CAPACITY 50000 //Size of the Hash Table
unsigned long hash_function(char* str) {
unsigned long i = 0;
for (int j=0; str[j]; j++)
i += str[j];
return i % CAPACITY;
}
typedef struct Ht_item Ht_item;
//Define the Hash Table Item here
struct Ht_item {
char* key;
char* value;
};
typedef struct LinkedList LinkedList;
//Define the Linkedlist here
struct LinkedList {
Ht_item* item;
LinkedList* next;
};
typedef struct HashTable HashTable;
//Define the Hash Table here
struct HashTable {
//Contains an array of pointers
//to items
Ht_item** items;
LinkedList** overflow_buckets;
int size;
int count;
};
static LinkedList* allocate_list () {
//Allocates memory for a Linkedlist pointer
LinkedList* list = (LinkedList*) malloc (sizeof(LinkedList));
return list;
}
static LinkedList* linkedlist_insert(LinkedList* list, Ht_item* item) {
//Inserts the item onto the Linked List
if (!list) {
LinkedList* head = allocate_list();
head->item = item;
head->next = NULL;
list = head;
return list;
}
else if (list->next == NULL) {
LinkedList* node = allocate_list();
node->item = item;
node->next = NULL;
list->next = node;
return list;
}
LinkedList* temp = list;
while (temp->next->next) {
temp = temp->next;
}
LinkedList* node = allocate_list();
node->item = item;
node->next = NULL;
temp->next = node;
return list;
}
static Ht_item* linkedlist_remove(LinkedList* list) {
//Removes the head from the linked list
//and returns the item of the popped element
if (!list)
return NULL;
if (!list->next)
return NULL;
LinkedList* node = list->next;
LinkedList* temp = list;
temp->next = NULL;
list = node;
Ht_item* it = NULL;
memcpy(temp->item, it, sizeof(Ht_item));
free(temp->item->key);
free(temp->item->value);
free(temp->item);
free(temp);
return it;
}
static void free_linkedlist(LinkedList* list) {
LinkedList* temp = list;
while (list) {
temp = list;
list = list->next;
free(temp->item->key);
free(temp->item->value);
free(temp->item);
free(temp);
}
}
static LinkedList** create_overflow_buckets(HashTable* table) {
//Create the overflow buckets; an array of linkedlists
LinkedList** buckets = (LinkedList**) calloc (table->size, sizeof(LinkedList*));
for (int i=0; i<table->size; i++)
buckets[i] = NULL;
return buckets;
}
static void free_overflow_buckets(HashTable* table) {
//Free all the overflow bucket lists
LinkedList** buckets = table->overflow_buckets;
for (int i=0; i<table->size; i++)
free_linkedlist(buckets[i]);
free(buckets);
}
Ht_item* create_item(char* key, char* value) {
//Creates a pointer to a new hash table item
Ht_item* item = (Ht_item*) malloc (sizeof(Ht_item));
item->key = (char*) malloc (strlen(key) + 1);
item->value = (char*) malloc (strlen(value) + 1);
strcpy(item->key, key);
strcpy(item->value, value);
return item;
}
HashTable* create_table(int size) {
//Creates a new HashTable
HashTable* table = (HashTable*) malloc (sizeof(HashTable));
table->size = size;
table->count = 0;
table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*));
for (int i=0; i<table->size; i++)
table->items[i] = NULL;
table->overflow_buckets = create_overflow_buckets(table);
return table;
}
void free_item(Ht_item* item) {
//Frees an item
free(item->key);
free(item->value);
free(item);
}
void free_table(HashTable* table) {
//Frees the table
for (int i=0; i<table->size; i++) {
Ht_item* item = table->items[i];
if (item != NULL)
free_item(item);
}
free_overflow_buckets(table);
free(table->items);
free(table);
}
void handle_collision(HashTable* table, unsigned long index, Ht_item* item) {
LinkedList* head = table->overflow_buckets[index];
if (head == NULL) {
//We need to create the list
head = allocate_list();
head->item = item;
table->overflow_buckets[index] = head;
return;
}
else {
//Insert to the list
table->overflow_buckets[index] = linkedlist_insert(head, item);
return;
}
}
void ht_insert(HashTable* table, char* key, char* value) {
//Create the item
Ht_item* item = create_item(key, value);
//Compute the index
unsigned long index = hash_function(key);
Ht_item* current_item = table->items[index];
if (current_item == NULL) {
//Key does not exist.
if (table->count == table->size) {
//Hash Table Full
printf("Insert Error: Hash Table is full\n");
//Remove the create item
free_item(item);
return;
}
//Insert directly
table->items[index] = item;
table->count++;
}
else {
//Scenario 1: We only need to update value
if (strcmp(current_item->key, key) == 0) {
strcpy(table->items[index]->value, value);
return;
}
else {
//Scenario 2: Collision
handle_collision(table, index, item);
return;
}
}
}
char* ht_search(HashTable* table, char* key) {
//Searches the key in the hashtable
//and returns NULL if it doesn't exist
int index = hash_function(key);
Ht_item* item = table->items[index];
LinkedList* head = table->overflow_buckets[index];
//Ensure that we move to items which are not NULL
while (item != NULL) {
if (strcmp(item->key, key) == 0)
return item->value;
if (head == NULL)
return NULL;
item = head->item;
head = head->next;
}
return NULL;
}
void print_search(HashTable* table, char* key) {
char* val;
if ((val = ht_search(table, key)) == NULL) {
printf("%s does not exist\n", key);
return;
}
else {
printf("Key:%s, Value:%s\n", key, val);
}
}
void print_table(HashTable* table) {
printf("\n-------------------\n");
for (int i=0; i<table->size; i++) {
if (table->items[i]) {
printf("Index:%d, Key:%s, Value:%s", i, table->items[i]->key, table->items[i]->value);
if (table->overflow_buckets[i]) {
printf(" => Overflow Bucket => ");
LinkedList* head = table->overflow_buckets[i];
while (head) {
printf("Key:%s, Value:%s ", head->item->key, head->item->value);
head = head->next;
}
}
printf("\n");
}
}
printf("-------------------\n");
}
int main() {
HashTable* ht = create_table(CAPACITY);
ht_insert(ht, "1", "First address");
ht_insert(ht, "2", "Second address");
ht_insert(ht, "Hel", "Third address");
ht_insert(ht, "Cau", "Fourth address");
print_search(ht, "1");
print_search(ht, "2");
print_search(ht, "3");
print_search(ht, "Hel");
print_search(ht, "Cau");
print_table(ht);
free_table(ht);
return 0;
}
从哈希表中删除
现在,让我们最后看看删除功能:
void ht_delete(HashTable* table, char* key);
同样,该方法类似于插入。
- 计算哈希索引并获取项目
- 如果为NULL,则无需执行任何操作
- 否则,在比较键之后,该索引没有碰撞链,只需从表中删除该项目
- 如果存在碰撞链,我们必须删除该元素并相应地移动链接
我没有为您提供太多细节,因为这仅涉及更新标题项和释放内存。
我的建议是尝试自己实施。
我将为您提供一个工作版本以进行比较。
void ht_delete(HashTable* table, char* key) {
//Deletes an item from the table
int index = hash_function(key);
Ht_item* item = table->items[index];
LinkedList* head = table->overflow_buckets[index];
if (item == NULL) {
//Does not exist. Return
return;
}
else {
if (head == NULL && strcmp(item->key, key) == 0) {
//No collision chain. Remove the item
//and set table index to NULL
table->items[index] = NULL;
free_item(item);
table->count--;
return;
}
else if (head != NULL) {
//Collision Chain exists
if (strcmp(item->key, key) == 0) {
//Remove this item and set the head of the list
//as the new item
free_item(item);
LinkedList* node = head;
head = head->next;
node->next = NULL;
table->items[index] = create_item(node->item->key, node->item->value);
free_linkedlist(node);
table->overflow_buckets[index] = head;
return;
}
LinkedList* curr = head;
LinkedList* prev = NULL;
while (curr) {
if (strcmp(curr->item->key, key) == 0) {
if (prev == NULL) {
//First element of the chain. Remove the chain
free_linkedlist(head);
table->overflow_buckets[index] = NULL;
return;
}
else {
//This is somewhere in the chain
prev->next = curr->next;
curr->next = NULL;
free_linkedlist(curr);
table->overflow_buckets[index] = head;
return;
}
}
curr = curr->next;
prev = curr;
}
}
}
}
全部放在一起
现在,最后,我将显示哈希表的整个程序。
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define CAPACITY 50000 //Size of the Hash Table
unsigned long hash_function(char* str) {
unsigned long i = 0;
for (int j=0; str[j]; j++)
i += str[j];
return i % CAPACITY;
}
typedef struct Ht_item Ht_item;
//Define the Hash Table Item here
struct Ht_item {
char* key;
char* value;
};
typedef struct LinkedList LinkedList;
//Define the Linkedlist here
struct LinkedList {
Ht_item* item;
LinkedList* next;
};
typedef struct HashTable HashTable;
//Define the Hash Table here
struct HashTable {
//Contains an array of pointers
//to items
Ht_item** items;
LinkedList** overflow_buckets;
int size;
int count;
};
static LinkedList* allocate_list () {
//Allocates memory for a Linkedlist pointer
LinkedList* list = (LinkedList*) malloc (sizeof(LinkedList));
return list;
}
static LinkedList* linkedlist_insert(LinkedList* list, Ht_item* item) {
//Inserts the item onto the Linked List
if (!list) {
LinkedList* head = allocate_list();
head->item = item;
head->next = NULL;
list = head;
return list;
}
else if (list->next == NULL) {
LinkedList* node = allocate_list();
node->item = item;
node->next = NULL;
list->next = node;
return list;
}
LinkedList* temp = list;
while (temp->next->next) {
temp = temp->next;
}
LinkedList* node = allocate_list();
node->item = item;
node->next = NULL;
temp->next = node;
return list;
}
static Ht_item* linkedlist_remove(LinkedList* list) {
//Removes the head from the linked list
//and returns the item of the popped element
if (!list)
return NULL;
if (!list->next)
return NULL;
LinkedList* node = list->next;
LinkedList* temp = list;
temp->next = NULL;
list = node;
Ht_item* it = NULL;
memcpy(temp->item, it, sizeof(Ht_item));
free(temp->item->key);
free(temp->item->value);
free(temp->item);
free(temp);
return it;
}
static void free_linkedlist(LinkedList* list) {
LinkedList* temp = list;
while (list) {
temp = list;
list = list->next;
free(temp->item->key);
free(temp->item->value);
free(temp->item);
free(temp);
}
}
static LinkedList** create_overflow_buckets(HashTable* table) {
//Create the overflow buckets; an array of linkedlists
LinkedList** buckets = (LinkedList**) calloc (table->size, sizeof(LinkedList*));
for (int i=0; i<table->size; i++)
buckets[i] = NULL;
return buckets;
}
static void free_overflow_buckets(HashTable* table) {
//Free all the overflow bucket lists
LinkedList** buckets = table->overflow_buckets;
for (int i=0; i<table->size; i++)
free_linkedlist(buckets[i]);
free(buckets);
}
Ht_item* create_item(char* key, char* value) {
//Creates a pointer to a new hash table item
Ht_item* item = (Ht_item*) malloc (sizeof(Ht_item));
item->key = (char*) malloc (strlen(key) + 1);
item->value = (char*) malloc (strlen(value) + 1);
strcpy(item->key, key);
strcpy(item->value, value);
return item;
}
HashTable* create_table(int size) {
//Creates a new HashTable
HashTable* table = (HashTable*) malloc (sizeof(HashTable));
table->size = size;
table->count = 0;
table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*));
for (int i=0; i<table->size; i++)
table->items[i] = NULL;
table->overflow_buckets = create_overflow_buckets(table);
return table;
}
void free_item(Ht_item* item) {
//Frees an item
free(item->key);
free(item->value);
free(item);
}
void free_table(HashTable* table) {
//Frees the table
for (int i=0; i<table->size; i++) {
Ht_item* item = table->items[i];
if (item != NULL)
free_item(item);
}
free_overflow_buckets(table);
free(table->items);
free(table);
}
void handle_collision(HashTable* table, unsigned long index, Ht_item* item) {
LinkedList* head = table->overflow_buckets[index];
if (head == NULL) {
//We need to create the list
head = allocate_list();
head->item = item;
table->overflow_buckets[index] = head;
return;
}
else {
//Insert to the list
table->overflow_buckets[index] = linkedlist_insert(head, item);
return;
}
}
void ht_insert(HashTable* table, char* key, char* value) {
//Create the item
Ht_item* item = create_item(key, value);
//Compute the index
unsigned long index = hash_function(key);
Ht_item* current_item = table->items[index];
if (current_item == NULL) {
//Key does not exist.
if (table->count == table->size) {
//Hash Table Full
printf("Insert Error: Hash Table is full\n");
//Remove the create item
free_item(item);
return;
}
//Insert directly
table->items[index] = item;
table->count++;
}
else {
//Scenario 1: We only need to update value
if (strcmp(current_item->key, key) == 0) {
strcpy(table->items[index]->value, value);
return;
}
else {
//Scenario 2: Collision
handle_collision(table, index, item);
return;
}
}
}
char* ht_search(HashTable* table, char* key) {
//Searches the key in the hashtable
//and returns NULL if it doesn't exist
int index = hash_function(key);
Ht_item* item = table->items[index];
LinkedList* head = table->overflow_buckets[index];
//Ensure that we move to items which are not NULL
while (item != NULL) {
if (strcmp(item->key, key) == 0)
return item->value;
if (head == NULL)
return NULL;
item = head->item;
head = head->next;
}
return NULL;
}
void ht_delete(HashTable* table, char* key) {
//Deletes an item from the table
int index = hash_function(key);
Ht_item* item = table->items[index];
LinkedList* head = table->overflow_buckets[index];
if (item == NULL) {
//Does not exist. Return
return;
}
else {
if (head == NULL && strcmp(item->key, key) == 0) {
//No collision chain. Remove the item
//and set table index to NULL
table->items[index] = NULL;
free_item(item);
table->count--;
return;
}
else if (head != NULL) {
//Collision Chain exists
if (strcmp(item->key, key) == 0) {
//Remove this item and set the head of the list
//as the new item
free_item(item);
LinkedList* node = head;
head = head->next;
node->next = NULL;
table->items[index] = create_item(node->item->key, node->item->value);
free_linkedlist(node);
table->overflow_buckets[index] = head;
return;
}
LinkedList* curr = head;
LinkedList* prev = NULL;
while (curr) {
if (strcmp(curr->item->key, key) == 0) {
if (prev == NULL) {
//First element of the chain. Remove the chain
free_linkedlist(head);
table->overflow_buckets[index] = NULL;
return;
}
else {
//This is somewhere in the chain
prev->next = curr->next;
curr->next = NULL;
free_linkedlist(curr);
table->overflow_buckets[index] = head;
return;
}
}
curr = curr->next;
prev = curr;
}
}
}
}
void print_search(HashTable* table, char* key) {
char* val;
if ((val = ht_search(table, key)) == NULL) {
printf("%s does not exist\n", key);
return;
}
else {
printf("Key:%s, Value:%s\n", key, val);
}
}
void print_table(HashTable* table) {
printf("\n-------------------\n");
for (int i=0; i<table->size; i++) {
if (table->items[i]) {
printf("Index:%d, Key:%s, Value:%s", i, table->items[i]->key, table->items[i]->value);
if (table->overflow_buckets[i]) {
printf(" => Overflow Bucket => ");
LinkedList* head = table->overflow_buckets[i];
while (head) {
printf("Key:%s, Value:%s ", head->item->key, head->item->value);
head = head->next;
}
}
printf("\n");
}
}
printf("-------------------\n");
}
int main() {
HashTable* ht = create_table(CAPACITY);
ht_insert(ht, "1", "First address");
ht_insert(ht, "2", "Second address");
ht_insert(ht, "Hel", "Third address");
ht_insert(ht, "Cau", "Fourth address");
print_search(ht, "1");
print_search(ht, "2");
print_search(ht, "3");
print_search(ht, "Hel");
print_search(ht, "Cau"); //Collision!
print_table(ht);
ht_delete(ht, "1");
ht_delete(ht, "Cau");
print_table(ht);
free_table(ht);
return 0;
}
输出
Key:1, Value:First address Key:2, Value:Second address 3 does not exist Key:Hel, Value:Third address Key:Cau, Value:Fourth address ------------------ Index:49, Key:1, Value:First address Index:50, Key:2, Value:Second address Index:281, Key:Hel, Value:Third address => Overflow Bucket => Key:Cau, Value:Fourth address ------------------ ------------------ Index:50, Key:2, Value:Second address Index:281, Key:Hel, Value:Third address ------------------

