Hamming distance

In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols differ. Put another way, it is the minimum number of substitutions required to change one string into the other, or the minimum number of errors that could have transformed one string into the other.

Examples

The Hamming distance between:

  • "toned" and "roses" is 3.
  • 1011101 and 1001001 is 2.
  • 2173896 and 2233796 is 3.
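
The three examples above can be reproduced with a few lines of Python. This is a minimal sketch, not a reference implementation; the function name `hamming_distance` is chosen here for illustration, and the inputs are assumed to be equal-length strings.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        # The Hamming distance is only defined for sequences of equal length.
        raise ValueError("inputs must have the same length")
    return sum(x != y for x, y in zip(a, b))


print(hamming_distance("toned", "roses"))      # 3
print(hamming_distance("1011101", "1001001"))  # 2
print(hamming_distance("2173896", "2233796"))  # 3
```

Raising an error on unequal lengths, rather than silently truncating as a bare `zip` would, keeps the function consistent with the definition.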

For a fixed length n, the Hamming distance is a metric on the set of words of length n: it clearly satisfies non-negativity, identity of indiscernibles, and symmetry, and it can be shown by complete induction that it satisfies the triangle inequality as well. The Hamming distance between two words a and b can also be seen as the Hamming weight of a − b for an appropriate choice of the − operator.
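
The four metric conditions can be checked exhaustively for a small word length. The sketch below does this by brute force for binary words; it is a sanity check rather than a proof, and the helper name `is_metric_on_bit_words` is hypothetical.

```python
from itertools import product


def hamming_distance(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(x != y for x, y in zip(a, b))


def is_metric_on_bit_words(n):
    """Check the metric axioms on every pair and triple of n-bit words."""
    words = ["".join(bits) for bits in product("01", repeat=n)]
    for a in words:
        for b in words:
            d_ab = hamming_distance(a, b)
            if d_ab < 0:                          # non-negativity
                return False
            if (d_ab == 0) != (a == b):           # identity of indiscernibles
                return False
            if d_ab != hamming_distance(b, a):    # symmetry
                return False
            for c in words:                       # triangle inequality
                if hamming_distance(a, c) > d_ab + hamming_distance(b, c):
                    return False
    return True


print(is_metric_on_bit_words(3))  # True
```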

[Figure 1] 3-bit Hamming distance cube.

[Figure 2] 100 → 011 has Hamming distance 3; 010 → 111 has Hamming distance 2.

[Figure 3] Hamming distance modeled on a 4-bit hypercube (tesseract).

[Figure 4] 0110 → 1110 has Hamming distance 1; 0100 → 1001 has Hamming distance 3.
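
The distances in these captions correspond to walks along the edges of the cube, where each edge flips exactly one bit. As an illustration, the sketch below builds one such shortest walk by flipping the differing bits from left to right; the function name `bit_flip_path` and the left-to-right order are choices made here, and other orders give equally short walks.

```python
def bit_flip_path(start, end):
    """Return one shortest walk from start to end, flipping one bit per step."""
    if len(start) != len(end):
        raise ValueError("inputs must have the same length")
    path = [start]
    current = list(start)
    for i, (x, y) in enumerate(zip(start, end)):
        if x != y:
            current[i] = y  # move along the edge that changes bit i
            path.append("".join(current))
    return path


path = bit_flip_path("100", "011")
print(path)           # ['100', '000', '010', '011']
print(len(path) - 1)  # 3 edges, matching the Hamming distance
```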

For binary strings a and b, the Hamming distance is equal to the number of ones (the population count) in a XOR b. The metric space of length-n binary strings, with the Hamming distance, is known as the Hamming cube; it is equivalent as a metric space to the set of distances between vertices in a hypercube graph. One can also view a binary string of length n as a vector in ℝ^n by treating each symbol in the string as a real coordinate; with this embedding, the strings form the vertices of an n-dimensional hypercube, and the Hamming distance between the strings is equal to the Manhattan distance between the vertices.
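
Both views of the binary case can be sketched in Python: the XOR and population count on integers, and the Manhattan distance on 0/1 coordinate vectors. The function names are illustrative, and the integer version assumes non-negative inputs that represent words of the same fixed length.

```python
def hamming_distance_bits(a, b):
    """Hamming distance of two non-negative integers viewed as bit words."""
    # a ^ b has a 1 exactly where the two words differ; count those ones.
    return bin(a ^ b).count("1")


def manhattan_distance(u, v):
    """Sum of absolute coordinate differences between two vectors."""
    return sum(abs(x - y) for x, y in zip(u, v))


a, b = 0b1011101, 0b1001001
print(hamming_distance_bits(a, b))  # 2

# The same two words embedded as 0/1 coordinate vectors.
u = [int(ch) for ch in "1011101"]
v = [int(ch) for ch in "1001001"]
print(manhattan_distance(u, v))     # 2
```

On Python 3.10 or later, `(a ^ b).bit_count()` performs the same population count without building a string.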

The Hamming distance is named after Richard Hamming, who introduced it in his fundamental paper on Hamming codes, "Error detecting and error correcting codes", in 1950.[1] It is used in telecommunication to count the number of flipped bits in a fixed-length binary word as an estimate of error, and is therefore sometimes called the signal distance. Hamming weight analysis of bits is used in several disciplines, including information theory, coding theory, and cryptography. However, for comparing strings of different lengths, or strings where insertions or deletions are expected in addition to substitutions, a more sophisticated metric such as the Levenshtein distance is more appropriate. For q-ary strings over an alphabet of size q ≥ 2, the Hamming distance is applied in the case of orthogonal modulation, while the Lee distance is used for phase modulation. If q = 2 or q = 3, both distances coincide.

The Hamming distance is also used in systematics as a measure of genetic distance.[2]

On a grid (such as a chessboard), the points at a Lee distance of 1 from a given point constitute the von Neumann neighborhood of that point.
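
The claim that the two distances coincide for q = 2 and q = 3 can be checked by brute force. The text above does not define the Lee distance, so the sketch below assumes its standard definition: the sum over positions of min(|a_i − b_i|, q − |a_i − b_i|), with symbols taken from 0, ..., q − 1. The helper name `distances_coincide` and the word length of 2 are illustrative choices.

```python
from itertools import product


def hamming_distance(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(x != y for x, y in zip(a, b))


def lee_distance(a, b, q):
    """Lee distance over the alphabet {0, ..., q-1} (standard definition, assumed)."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(min(abs(x - y), q - abs(x - y)) for x, y in zip(a, b))


def distances_coincide(q, n=2):
    """True if Hamming and Lee distance agree on all pairs of q-ary words of length n."""
    words = list(product(range(q), repeat=n))
    return all(
        hamming_distance(a, b) == lee_distance(a, b, q)
        for a in words
        for b in words
    )


for q in (2, 3, 4):
    print(q, distances_coincide(q))
# 2 True
# 3 True
# 4 False
```

For q = 4 the two differ because symbols 0 and 2 are at Lee distance 2 but Hamming distance 1.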


 
