数据结构-散列表的含义与C++实现

这篇具有很好参考价值的文章主要介绍了数据结构-散列表的含义与C++实现。希望对大家有所帮助。如果存在错误或未考虑完全的地方，请大家不吝赐教，您也可以点击"举报违法"按钮提交疑问。

一、散列表的概念

二、散列函数的作用

三、散列表的查找技术

1. 直接寻址表

2. 线性探测法

3. 平方探测法

4. 双散列法

四、散列表的优缺点

五、总结

一、散列表的概念

散列表（Hash Table）是一种数据结构，它通过散列函数将关键字映射到散列表中的一个位置，从而实现快速的查找、插入和删除操作。散列表的基本思想是将关键字映射到散列表中的一个位置，这个位置就是关键字的散列地址。散列表的查找效率非常高，时间复杂度为O(1)。

二、散列函数的作用

散列函数是散列表的核心，它的作用是将关键字映射到散列表中的一个位置。散列函数的设计非常重要，它直接影响到散列表的性能。一个好的散列函数应该具有以下特点：

1. 散列函数的计算速度应该快，否则会影响散列表的性能。

2. 散列函数应该将关键字均匀地分布在散列表中，这样可以减少冲突的概率。

3. 散列函数应该尽量避免冲突，否则会影响散列表的性能。

常用的散列函数有以下几种：

1. 直接取模法：将关键字除以散列表的长度，取余数作为散列地址。

直接取模法是一种常见的哈希函数，可以用于哈希表等数据结构中。其实现方式如下：

unsigned int direct_mod_hash(const char* key, int table_size) {
    unsigned int hash_value = 0;
    for (int i = 0; key[i] != '\0'; i++) {
        hash_value = (hash_value * 31 + key[i]) % table_size;
    }
    return hash_value;
}

其中，`key`是要哈希的字符串，`table_size`是哈希表的大小。这个哈希函数的实现过程是，将字符串中的每个字符依次乘以31的幂次方，然后累加起来，最后对哈希表大小取模，得到哈希值。

需要注意的是，这个哈希函数可能会产生哈希冲突，因此在实际使用中需要采用一些解决冲突的方法，比如开放寻址法或者链表法。

2. 平方取中法：将关键字的平方取中，再取余数作为散列地址。

平方取中法是一种随机数生成算法，其基本思想是：先取一个种子数，将其平方后取中间的若干位作为新的种子数，再将新的种子数平方后取中间的若干位作为随机数。以下是C++实现：

#include <iostream>
#include <cmath>

using namespace std;

int squareMiddle(int seed, int n) {
    int square = seed * seed;
    int len = 2 * n;
    int middle = (square / (int)pow(10, n - 1)) % (int)pow(10, len);
    return middle / (int)pow(10, n - 1);
}

int main() {
    int seed = 1234;
    int n = 2;
    for (int i = 0; i < 10; i++) {
        seed = squareMiddle(seed, n);
        cout << seed << endl;
    }
    return 0;
}

其中，`seed`为种子数，`n`为取中间位数的位数，`squareMiddle`函数实现了平方取中法的核心逻辑，返回新的种子数的中间位数。在`main`函数中，我们可以通过多次调用`squareMiddle`函数来生成多个随机数。

3. 折叠法：将关键字分成若干段，每段相加，再取余数作为散列地址。

折叠法（Folding Method）是一种哈希函数的实现方法，它将关键字按照一定的规则进行折叠，然后再进行哈希计算。下面是使用C++实现折叠法的示例代码：

#include <iostream>
#include <string>

using namespace std;

// 折叠法哈希函数
int foldingHash(string key, int tableSize) {
    int hashVal = 0;
    int len = key.length();
    int foldLen = len / 4;  // 每次折叠的长度
    for (int i = 0; i < len; i += foldLen) {
        int foldSum = 0;
        for (int j = 0; j < foldLen; j++) {
            if (i + j < len) {
                foldSum += key[i + j];
            }
        }
        hashVal += foldSum;
    }
    return hashVal % tableSize;
}

int main() {
    string key = "hello world";
    int tableSize = 10;
    int hashVal = foldingHash(key, tableSize);
    cout << "The hash value of " << key << " is " << hashVal << endl;
    return 0;
}

在上面的代码中，我们定义了一个`foldingHash`函数来实现折叠法哈希。该函数接受两个参数：关键字`key`和哈希表大小`tableSize`，并返回计算出的哈希值。在函数内部，我们首先计算出关键字的长度`len`，然后将关键字按照每次折叠的长度`foldLen`进行折叠，并将每次折叠后的结果累加到哈希值`hashVal`中。最后，我们将哈希值对哈希表大小取模，得到最终的哈希值。

在主函数中，我们定义了一个字符串`key`和哈希表大小`tableSize`，然后调用`foldingHash`函数计算出关键字的哈希值，并输出结果。

需要注意的是，折叠法哈希函数的实现方法有很多种，上面的代码只是其中一种实现方式。在实际应用中，我们需要根据具体的需求选择合适的实现方法。

4. 随机数法：随机生成一个数作为散列地址。

散列表是一种常用的数据结构，它可以实现快速的查找、插入和删除操作。其中，散列表的随机数法是一种常用的散列函数，它可以将关键字映射到散列表中的一个位置上，从而实现快速的查找。

下面是使用C++实现散列表的随机数法的示例代码：

#include <iostream>
#include <cstdlib>
#include <ctime>

using namespace std;

const int TABLE_SIZE = 10;

class HashNode {
public:
    int key;
    int value;
    HashNode(int key, int value) {
        this->key = key;
        this->value = value;
    }
};

class HashMap {
private:
    HashNode **table;
public:
    HashMap() {
        table = new HashNode*[TABLE_SIZE];
        for (int i = 0; i < TABLE_SIZE; i++) {
            table[i] = NULL;
        }
    }

    int hashFunction(int key) {
        srand(time(NULL));
        int random = rand() % TABLE_SIZE;
        return (key * random) % TABLE_SIZE;
    }

    void insert(int key, int value) {
        int hash = hashFunction(key);
        while (table[hash] != NULL && table[hash]->key != key) {
            hash = hashFunction(hash + 1);
        }
        if (table[hash] != NULL) {
            delete table[hash];
        }
        table[hash] = new HashNode(key, value);
    }

    int search(int key) {
        int hash = hashFunction(key);
        while (table[hash] != NULL && table[hash]->key != key) {
            hash = hashFunction(hash + 1);
        }
        if (table[hash] == NULL) {
            return -1;
        } else {
            return table[hash]->value;
        }
    }

    void remove(int key) {
        int hash = hashFunction(key);
        while (table[hash] != NULL) {
            if (table[hash]->key == key) {
                break;
            }
            hash = hashFunction(hash + 1);
        }
        if (table[hash] == NULL) {
            cout << "Key not found" << endl;
        } else {
            delete table[hash];
            table[hash] = NULL;
            cout << "Key deleted" << endl;
        }
    }

    ~HashMap() {
        for (int i = 0; i < TABLE_SIZE; i++) {
            if (table[i] != NULL) {
                delete table[i];
            }
            delete[] table;
        }
    }
};

int main() {
    HashMap map;
    map.insert(1, 10);
    map.insert(2, 20);
    map.insert(3, 30);
    map.insert(4, 40);
    map.insert(5, 50);
    map.insert(6, 60);
    map.insert(7, 70);
    map.insert(8, 80);
    map.insert(9, 90);
    map.insert(10, 100);
    cout << "Value for key 6: " << map.search(6) << endl;
    map.remove(6);
    cout << "Value for key 6: " << map.search(6) << endl;
    return 0;
}

在上面的代码中，我们使用了随机数法来实现散列表的散列函数。具体来说，我们使用了C++中的rand()函数来生成一个随机数，然后将关键字乘以这个随机数，再对散列表的大小取模，从而得到关键字在散列表中的位置。

在散列表的插入、查找和删除操作中，我们同样使用了随机数法来解决冲突。具体来说，当我们发现散列表中的某个位置已经被占用时，我们就使用散列函数计算出下一个位置，直到找到一个空闲的位置为止。

需要注意的是，在使用随机数法时，我们需要使用srand()函数来初始化随机数生成器。具体来说，我们可以使用time(NULL)函数来获取当前时间的秒数，然后将其作为srand()函数的参数，从而生成一个随机的种子。这样可以保证每次运行程序时，生成的随机数序列都是不同的。

三、散列表的查找技术

散列表的查找技术主要有以下几种：

1. 直接寻址表

直接寻址表是一种简单的散列表，它的散列函数是将关键字直接作为散列地址。直接寻址表的查找效率非常高，时间复杂度为O(1)。但是，直接寻址表的空间复杂度非常高，因为它需要开辟一个大小为关键字范围的数组。

散列表的直接寻址表是一种简单的散列表实现方式，它使用一个数组来存储散列表中的元素，数组的下标就是元素的关键字。下面是用C++实现散列表的直接寻址表的代码：

#include <iostream>
#include <cstring>

using namespace std;

const int MAX_SIZE = 100;

class HashTable {
public:
    HashTable() {
        memset(table, 0, sizeof(table));
    }

    void insert(int key, int value) {
        table[key] = value;
    }

    int search(int key) {
        return table[key];
    }

    void remove(int key) {
        table[key] = 0;
    }

private:
    int table[MAX_SIZE];
};

int main() {
    HashTable ht;
    ht.insert(1, 10);
    ht.insert(2, 20);
    ht.insert(3, 30);

    cout << ht.search(1) << endl; // 输出10
    cout << ht.search(2) << endl; // 输出20
    cout << ht.search(3) << endl; // 输出30

    ht.remove(2);
    cout << ht.search(2) << endl; // 输出0

    return 0;
}

在上面的代码中，我们定义了一个`HashTable`类，它包含了`insert`、`search`和`remove`三个方法，分别用于插入、查找和删除元素。在构造函数中，我们使用`memset`函数将数组初始化为0。`insert`方法将元素插入到数组中，`search`方法根据关键字查找元素，`remove`方法将元素从数组中删除。在`main`函数中，我们创建了一个`HashTable`对象，插入了三个元素，并进行了一些查找和删除操作。

2. 线性探测法

线性探测法是一种解决冲突的方法，它的散列函数是将关键字映射到散列表中的一个位置，如果这个位置已经被占用了，就往后找一个空闲位置。线性探测法的查找效率比直接寻址表稍微低一些，时间复杂度为O(1)~O(n)。

下面是使用C++实现散列表的线性探测法的示例代码：

#include <iostream>
#include <string>

using namespace std;

const int TABLE_SIZE = 10;

class HashNode {
public:
    int key;
    int value;
    HashNode(int key, int value) {
        this->key = key;
        this->value = value;
    }
};

class HashMap {
private:
    HashNode **table;
public:
    HashMap() {
        table = new HashNode*[TABLE_SIZE];
        for (int i = 0; i < TABLE_SIZE; i++) {
            table[i] = NULL;
        }
    }

    int get(int key) {
        int hash = (key % TABLE_SIZE);
        while (table[hash] != NULL && table[hash]->key != key) {
            hash = (hash + 1) % TABLE_SIZE;
        }
        if (table[hash] == NULL) {
            return -1;
        } else {
            return table[hash]->value;
        }
    }

    void put(int key, int value) {
        int hash = (key % TABLE_SIZE);
        while (table[hash] != NULL && table[hash]->key != key) {
            hash = (hash + 1) % TABLE_SIZE;
        }
        if (table[hash] != NULL) {
            delete table[hash];
        }
        table[hash] = new HashNode(key, value);
    }

    ~HashMap() {
        for (int i = 0; i < TABLE_SIZE; i++) {
            if (table[i] != NULL) {
                delete table[i];
            }
        }
        delete[] table;
    }
};

int main() {
    HashMap map;
    map.put(1, 10);
    map.put(2, 20);
    map.put(3, 30);
    map.put(4, 40);
    map.put(5, 50);
    map.put(6, 60);
    map.put(7, 70);
    map.put(8, 80);
    map.put(9, 90);
    map.put(10, 100);

    cout << map.get(1) << endl;
    cout << map.get(2) << endl;
    cout << map.get(3) << endl;
    cout << map.get(4) << endl;
    cout << map.get(5) << endl;
    cout << map.get(6) << endl;
    cout << map.get(7) << endl;
    cout << map.get(8) << endl;
    cout << map.get(9) << endl;
    cout << map.get(10) << endl;

    return 0;
}

在这个示例代码中，我们定义了一个`HashNode`类来表示散列表中的每个节点，其中包含了键和值两个属性。然后我们定义了一个`HashMap`类来表示散列表，其中使用了线性探测法来解决哈希冲突。在`get`方法中，我们首先计算出给定键的哈希值，然后在散列表中查找该键对应的节点，如果找到了就返回该节点的值，否则返回-1。在`put`方法中，我们也首先计算出给定键的哈希值，然后在散列表中查找该键对应的节点，如果找到了就更新该节点的值，否则就创建一个新的节点并插入到散列表中。最后，在程序的主函数中，我们演示了如何使用`HashMap`类来操作散列表。

3. 平方探测法

平方探测法是一种解决冲突的方法，它的散列函数是将关键字映射到散列表中的一个位置，如果这个位置已经被占用了，就往后找一个空闲位置，每次往后找的步长是原来的平方。平方探测法的查找效率比线性探测法稍微高一些，时间复杂度为O(1)~O(n)。

以下是C++实现散列表的平方探测法的示例代码：

#include <iostream>
#include <vector>

using namespace std;

class HashTable {
private:
    vector<int> table;
    int capacity;
    int size;
    int hash(int key) {
        return key % capacity;
    }
public:
    HashTable(int capacity) {
        this->capacity = capacity;
        this->size = 0;
        this->table = vector<int>(capacity, -1);
    }
    bool insert(int key) {
        if (size == capacity) {
            return false;
        }
        int index = hash(key);
        int i = 1;
        while (table[index] != -1) {
            index = (hash(key) + i * i) % capacity;
            i++;
        }
        table[index] = key;
        size++;
        return true;
    }
    bool search(int key) {
        int index = hash(key);
        int i = 1;
        while (table[index] != -1) {
            if (table[index] == key) {
                return true;
            }
            index = (hash(key) + i * i) % capacity;
            i++;
        }
        return false;
    }
    bool remove(int key) {
        int index = hash(key);
        int i = 1;
        while (table[index] != -1) {
            if (table[index] == key) {
                table[index] = -1;
                size--;
                return true;
            }
            index = (hash(key) + i * i) % capacity;
            i++;
        }
        return false;
    }
};

int main() {
    HashTable ht(10);
    ht.insert(1);
    ht.insert(2);
    ht.insert(3);
    ht.insert(11);
    ht.insert(12);
    ht.insert(13);
    cout << ht.search(1) << endl; // 1
    cout << ht.search(4) << endl; // 0
    ht.remove(1);
    cout << ht.search(1) << endl; // 0
    return 0;
}

在这个实现中，我们使用了一个vector来存储散列表，每个元素的初始值为-1，表示该位置为空。在插入元素时，我们使用平方探测法来解决冲突，即如果当前位置已经被占用，则尝试下一个位置，直到找到一个空位置为止。在查找和删除元素时，我们也使用相同的方法来定位元素的位置。

4. 双散列法

双散列法是一种解决冲突的方法，它的散列函数是由两个散列函数组成，先用第一个散列函数计算出一个散列地址，如果这个位置已经被占用了，就用第二个散列函数计算出一个步长，然后往后找一个空闲位置。双散列法的查找效率比线性探测法和平方探测法都要高一些，时间复杂度为O(1)。

双散列法是一种解决散列表中哈希冲突的方法，它使用两个不同的哈希函数来计算键的哈希值，并在发生冲突时使用第二个哈希函数来计算新的哈希值。下面是使用C++实现双散列法的散列表的示例代码：

#include <iostream>
#include <vector>

using namespace std;

// 定义哈希表中的元素类型
struct HashNode {
    int key;
    int value;
    HashNode(int k, int v) : key(k), value(v) {}
};

// 定义哈希表类
class HashTable {
private:
    vector<HashNode*> table; // 哈希表数组
    int capacity; // 哈希表容量
    int size; // 哈希表中元素个数
    int prime1; // 第一个哈希函数使用的质数
    int prime2; // 第二个哈希函数使用的质数

    // 哈希函数1
    int hash1(int key) {
        return key % prime1;
    }

    // 哈希函数2
    int hash2(int key) {
        return prime2 - (key % prime2);
    }

public:
    // 构造函数
    HashTable(int cap, int p1, int p2) : capacity(cap), prime1(p1), prime2(p2), size(0) {
        table.resize(capacity, nullptr);
    }

    // 插入元素
    void insert(int key, int value) {
        if (size == capacity) {
            cout << "Hash table is full!" << endl;
            return;
        }

        int index = hash1(key);
        int step = hash2(key);

        while (table[index] != nullptr) {
            index = (index + step) % capacity;
        }

        table[index] = new HashNode(key, value);
        size++;
    }

    // 查找元素
    int find(int key) {
        int index = hash1(key);
        int step = hash2(key);

        while (table[index] != nullptr && table[index]->key != key) {
            index = (index + step) % capacity;
        }

        if (table[index] == nullptr) {
            return -1;
        } else {
            return table[index]->value;
        }
    }

    // 删除元素
    void remove(int key) {
        int index = hash1(key);
        int step = hash2(key);

        while (table[index] != nullptr && table[index]->key != key) {
            index = (index + step) % capacity;
        }

        if (table[index] != nullptr) {
            delete table[index];
            table[index] = nullptr;
            size--;
        }
    }

    // 打印哈希表
    void print() {
        for (int i = 0; i < capacity; i++) {
            if (table[i] != nullptr) {
                cout << "Key: " << table[i]->key << ", Value: " << table[i]->value << endl;
            }
        }
    }
};

int main() {
    HashTable ht(10, 7, 5);
    ht.insert(1, 10);
    ht.insert(2, 20);
    ht.insert(3, 30);
    ht.insert(4, 40);
    ht.insert(5, 50);
    ht.insert(6, 60);
    ht.insert(7, 70);
    ht.insert(8, 80);
    ht.insert(9, 90);
    ht.insert(10, 100);
    ht.print();

    cout << "Find key 5, value is " << ht.find(5) << endl;
    cout << "Find key 11, value is " << ht.find(11) << endl;

    ht.remove(5);
    ht.remove(11);
    ht.print();

    return 0;
}

在上面的代码中，我们定义了一个`HashTable`类来表示哈希表，其中包含了哈希表数组`table`、哈希表容量`capacity`、哈希表中元素个数`size`、第一个哈希函数使用的质数`prime1`、第二个哈希函数使用的质数`prime2`。我们使用`vector`来实现哈希表数组，使用`HashNode`结构体来表示哈希表中的元素。

在`HashTable`类中，我们定义了两个哈希函数`hash1`和`hash2`，分别用来计算键的哈希值。在插入元素时，我们先使用第一个哈希函数计算键的哈希值，如果该位置已经被占用，则使用第二个哈希函数计算新的哈希值，直到找到一个空闲位置插入元素。在查找和删除元素时，我们也是使用类似的方法来定位元素的位置。

最后，我们在`main`函数中创建一个`HashTable`对象，并插入一些元素，然后查找和删除一些元素，并打印哈希表中的所有元素。

四、散列表的优缺点

散列表的优点是查找效率非常高，时间复杂度为O(1)。散列表的缺点是需要消耗大量的空间，因为它需要开辟一个大小为关键字范围的数组。另外，散列表的性能受到散列函数的影响，如果散列函数设计不好，会导致冲突的概率增加，从而影响散列表的性能。

五、总结

散列表是一种非常重要的数据结构，它可以实现快速的查找、插入和删除操作。散列表的核心是散列函数，一个好的散列函数可以提高散列表的性能。散列表的查找技术主要有直接寻址表、线性探测法、平方探测法和双散列法。散列表的优点是查找效率非常高，缺点是需要消耗大量的空间。文章来源地址https://www.toymoban.com/news/detail-476134.html

到了这里，关于数据结构-散列表的含义与C++实现的文章就介绍完了。如果您还想了解更多内容，请在右上角搜索TOY模板网以前的文章或继续浏览下面的相关文章，希望大家以后多多支持TOY模板网！