hash table

[TOC]

direct address

适用于数量小且没有重复的key的情况
都是O(1)时间


hash table_第1张图片
image.png

hash table

with direct address, key k store in slot k; while with hashing, we have a hash function h(k)


hash table_第2张图片
image.png

collision

collision: two keys hash to the same slot.
solution: chaining; open addressing

chaining
hash table_第3张图片
image.png

hash table_第4张图片
image.png

load factor = n/m, n is total elements, m is the number of slots.

Search element
worst-case: all elements in the same slots
average case: O(1)
choose a simple uniform hashing (各个元素均匀地分布在各个slots)

Open addressing

每个slot只有一个element,但我们在search的时候,可能要search多个slots

hash function h 不仅需要input k, 还需要input一个i

hash table_第5张图片
image.png
hash table_第6张图片
image.png
hash table_第7张图片
image.png

当删除一个key的时候,不能找到这个key之后直接标记为null,而应当标记为deleted,不然会影响后面的search

Double hashing效果不错:h(i, k) = (h1(k) + i * h2(k)) mod m

linear probing

如果slot已经被占据了,就看下一个slot

会使得average search time 增加

quadratic probing

类似linear probing,但不是看下一个slot,而是通过加上一个次方的数再mode,寻找下一个探测是否能够插入的slot

double hashing

计算h的时候使用另外两个hash function

hash functions

a good hash function should be simple uniform hashing.

division method

h(k) = k mod m

get the remainder as h(k) ---- 取一个质数,并且这个质数和keys的分布没有关联

a prime not too close to an exact power of 2 is often a good choice of m

The multiplication method

image.png

你可能感兴趣的:(hash table)