Notes on a few hash algorithms

1. Java's hashCode

/** The value is used for character storage. */
    private final char value[]; // the String's characters are kept in a char array

    /**
     * @param  value
     *         The initial value of the string
     */
    public String(char value[]) {
        this.value = Arrays.copyOf(value, value.length);
    }

/** Cache the hash code for the string */
    private int hash; // Default to 0

/** s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] 
  *31*i==(i<<5)-i
  */
public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            char val[] = value;

            for (int i = 0; i < value.length; i++) {
                h = 31 * h + val[i];
            }
            hash = h;
        }
        return h;
    }

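This hash function uses Horner's rule, a fast way to evaluate a polynomial:
a0 + a1*x + a2*x^2 + a3*x^3 + ... + an*x^n // evaluated term by term: n*(n+1)/2 multiplications
= ((...((an*x + an-1)*x + an-2)*x + ...)*x + a1)*x + a0 // only n multiplications, O(n)
Its weaknesses: performance is not great, and the output does not change drastically enough when the input changes (weak avalanche).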
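
To make the connection between Horner's rule and String.hashCode() concrete, here is a minimal sketch. The class and method names (HornerDemo, horner, hornerHash) are made up for illustration. It assumes only the documented String.hashCode() formula quoted above, and uses the (h << 5) - h form of 31 * h mentioned in the source comment.

```java
class HornerDemo {
    // Evaluates a[0] + a[1]*x + ... + a[n]*x^n with n multiplications (Horner's rule).
    static long horner(long[] a, long x) {
        long result = 0;
        for (int i = a.length - 1; i >= 0; i--) {
            result = result * x + a[i];
        }
        return result;
    }

    // Same recurrence as String.hashCode(): the characters are the coefficients, x = 31.
    // (h << 5) - h equals 31 * h, including when int arithmetic overflows and wraps.
    static int hornerHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = (h << 5) - h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        // 1 + 2*2 + 3*2^2 = 17
        System.out.println(horner(new long[] {1, 2, 3}, 2));
        // Should agree with the JDK implementation
        System.out.println(hornerHash("hello") == "hello".hashCode());
    }
}
```

Because each step only multiplies by 31 and adds one character, strings that differ in their last character get hash values that differ by a small amount, which is the weak mixing mentioned above.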

2. MurmurHash

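The name comes from "multiply and rotate": the core of the algorithm is repeatedly applying "x *= m; x = rotate_left(x, r);" (a rotation, not a plain shift, so no bits are thrown away).
In Java you can use Guava's Hashing class: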

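// import com.google.common.hash.*;
// import java.nio.charset.StandardCharsets;
HashFunction f = Hashing.murmur3_32();
// Pass an explicit charset so the result does not depend on the platform default
int hash = f.hashBytes("hello".getBytes(StandardCharsets.UTF_8)).asInt();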
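
For readers without Guava at hand, the multiply-and-rotate idea can be sketched in plain Java. The following is my own hedged port of the 32-bit MurmurHash3 variant, with the constants written from memory of the reference MurmurHash3_x86_32. Treat it as an illustration and check it against the smhasher source before relying on it. The class name Murmur3Sketch and the helper names are assumptions, not part of any library.

```java
import java.nio.charset.StandardCharsets;

class Murmur3Sketch {
    // Multiply-and-rotate mixing of one 32-bit block
    static int mixK(int k) {
        k *= 0xcc9e2d51;
        k = Integer.rotateLeft(k, 15);
        k *= 0x1b873593;
        return k;
    }

    // Final avalanche step: spreads every input bit across the whole result
    static int fmix32(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    static int murmur3_32(byte[] data, int seed) {
        int h = seed;
        int nblocks = data.length / 4;

        // Body: consume 4 bytes at a time, little-endian
        for (int i = 0; i < nblocks; i++) {
            int p = 4 * i;
            int k = (data[p] & 0xff)
                    | ((data[p + 1] & 0xff) << 8)
                    | ((data[p + 2] & 0xff) << 16)
                    | ((data[p + 3] & 0xff) << 24);
            h ^= mixK(k);
            h = Integer.rotateLeft(h, 13);
            h = h * 5 + 0xe6546b64;
        }

        // Tail: the remaining 0 to 3 bytes (the cases fall through on purpose)
        int tail = nblocks * 4;
        int k = 0;
        switch (data.length & 3) {
            case 3: k ^= (data[tail + 2] & 0xff) << 16;
            case 2: k ^= (data[tail + 1] & 0xff) << 8;
            case 1: k ^= (data[tail] & 0xff);
                    h ^= mixK(k);
        }

        h ^= data.length;
        return fmix32(h);
    }

    public static void main(String[] args) {
        // A rotation keeps the top bit, a shift drops it
        System.out.println(Integer.toHexString(Integer.rotateLeft(0x80000001, 1))); // 3
        System.out.println(Integer.toHexString(0x80000001 << 1));                   // 2
        byte[] bytes = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(Integer.toHexString(murmur3_32(bytes, 0)));
    }
}
```

The first two printed lines show why the rotation matters: rotateLeft moves the high bit around to the low end, while << simply discards it.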

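As for the source code... I can't follow it yet:
https://github.com/aappleby/smhasher/tree/master/src
Going by other people's blog summaries, it looks roughly like this (a MurmurHash2 variant that produces a 64-bit result from two 32-bit halves):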

unsigned long long MurmurHash64B ( const void * key, int len, unsigned int seed )
{
	const unsigned int m = 0x5bd1e995;
	const int r = 24;
 
	unsigned int h1 = seed ^ len;
	unsigned int h2 = 0;
 
	const unsigned int * data = (const unsigned int *)key;
 
	while(len >= 8)
	{
		unsigned int k1 = *data++;
		k1 *= m; k1 ^= k1 >> r; k1 *= m;
		h1 *= m; h1 ^= k1;
		len -= 4;
 
		unsigned int k2 = *data++;
		k2 *= m; k2 ^= k2 >> r; k2 *= m;
		h2 *= m; h2 ^= k2;
		len -= 4;
	}
 
	if(len >= 4)
	{
		unsigned int k1 = *data++;
		k1 *= m; k1 ^= k1 >> r; k1 *= m;
		h1 *= m; h1 ^= k1;
		len -= 4;
	}
 
	switch(len)
	{
	case 3: h2 ^= ((unsigned char*)data)[2] << 16;
	case 2: h2 ^= ((unsigned char*)data)[1] << 8;
	case 1: h2 ^= ((unsigned char*)data)[0];
			h2 *= m;
	};
 
	h1 ^= h2 >> 18; h1 *= m;
	h2 ^= h1 >> 22; h2 *= m;
	h1 ^= h2 >> 17; h1 *= m;
	h2 ^= h1 >> 19; h2 *= m;
 
	unsigned long long h = h1;
 
	h = (h << 32) | h2;
 
	return h;
} 

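Parameters:
key: pointer to the data to hash (for example, a string)
len: length of the data in bytes
seed: the seed, preferably a prime, e.g. 0xEE6B27EB (3,999,999,979, a prime below 4 billion)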

3. CityHash

https://github.com/google/cityhash
Slightly faster than MurmurHash.
Other hash functions to look into:
SipHash
xxHash
wyhash

4. There are also some seemingly older ones, such as FNV, which is used in Go programs
FNV-0 (deprecated; same as FNV-1 but with an offset basis of 0)

hash = 0
for each octet_of_data to be hashed
    hash = hash * FNV_prime
    hash = hash xor octet_of_data
return hash

FNV-1

hash = offset_basis
for each octet_of_data to be hashed
    hash = hash * FNV_prime
    hash = hash xor octet_of_data
return hash

FNV-1a

hash = offset_basis
for each octet_of_data to be hashed
    hash = hash xor octet_of_data
    hash = hash * FNV_prime
return hash
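
The only difference between FNV-1 and FNV-1a is the order of the multiply and the xor. As a minimal sketch, here are the 32-bit versions in Java, using the standard 32-bit FNV parameters (offset basis 0x811c9dc5, prime 0x01000193). The class name FnvSketch is made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class FnvSketch {
    static final int OFFSET_BASIS_32 = 0x811c9dc5;
    static final int FNV_PRIME_32 = 0x01000193;

    // FNV-1: multiply first, then xor in the next byte
    static int fnv1(byte[] data) {
        int hash = OFFSET_BASIS_32;
        for (byte b : data) {
            hash *= FNV_PRIME_32;   // int overflow wraps, i.e. arithmetic mod 2^32
            hash ^= (b & 0xff);
        }
        return hash;
    }

    // FNV-1a: xor in the next byte first, then multiply
    static int fnv1a(byte[] data) {
        int hash = OFFSET_BASIS_32;
        for (byte b : data) {
            hash ^= (b & 0xff);
            hash *= FNV_PRIME_32;
        }
        return hash;
    }

    public static void main(String[] args) {
        byte[] a = "a".getBytes(StandardCharsets.UTF_8);
        System.out.println(String.format("%08x", fnv1(a)));  // 050c5d7e
        System.out.println(String.format("%08x", fnv1a(a))); // e40c292c
    }
}
```

In FNV-1a the last byte still goes through one multiplication, so it influences the high bits of the result; in FNV-1 the last byte only touches the low 8 bits.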

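All of the above are non-cryptographic hash algorithms. One property of a cryptographic hash is that even if you know the hash value, it is hard to forge a text with the same hash value. MD5 was designed as one, although it is no longer considered collision-resistant; SHA-256 is a current example.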
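
For comparison with the non-cryptographic functions above, here is a small sketch of computing a cryptographic digest with the JDK's java.security.MessageDigest. The class name DigestDemo and the helper hex are made up for illustration. "MD5" and "SHA-256" are standard algorithm names that every Java platform is required to support.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class DigestDemo {
    // Returns the digest of text (encoded as UTF-8) as a lowercase hex string
    static String hex(String algorithm, String text) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // Not expected for MD5 or SHA-256
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hex("MD5", ""));        // d41d8cd98f00b204e9800998ecf8427e
        System.out.println(hex("SHA-256", "hello"));
    }
}
```

The output is much longer than a 32-bit or 64-bit hash (128 bits for MD5, 256 bits for SHA-256), and computing it is slower, which is why hash tables use the non-cryptographic functions instead.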

References:

https://blog.csdn.net/gray_1566/article/details/24697937
https://blog.csdn.net/qq_33408113/article/details/82635009
https://blog.csdn.net/u013137970/article/details/79020095
https://blog.csdn.net/wisage/article/details/7104866
https://segmentfault.com/a/1190000010990136
