The KMP Algorithm

The KMP (Knuth-Morris-Pratt) algorithm solves the problem of finding a pattern string P inside a string S.

The naive approach uses two nested loops and takes O(m*n) time, where n is the length of S and m is the length of P. KMP takes O(m+n).
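For comparison, here is a minimal sketch of the naive two-loop search. The class name NaiveSearch and the method name indexOf are illustrative choices, not taken from the original post.

```java
class NaiveSearch {
    // Returns the index of the first occurrence of p in s, or -1 if there is none.
    // Worst case O(m*n): every start position in s may compare up to m characters.
    static int indexOf(String s, String p) {
        int n = s.length(), m = p.length();
        for (int i = 0; i + m <= n; i++) {
            int j = 0;
            // Compare p against s starting at position i.
            while (j < m && s.charAt(i + j) == p.charAt(j)) {
                j++;
            }
            if (j == m) {
                return i; // all m characters matched
            }
        }
        return -1;
    }
}
```

After a mismatch the naive search restarts at i + 1 and throws away everything it already matched. That wasted work is what KMP avoids.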

It is actually not complicated. There are two steps:

  1. Compute the Partial Match Table of P
  2. Search S with the help of the table

The key step is computing the Partial Match Table of P. Each entry in the table means:

The length of the longest proper prefix in the (sub)pattern that matches a proper suffix in the same (sub)pattern

where

Proper prefix: All the characters in a string, with one or more cut off the end. “S”, “Sn”, “Sna”, and “Snap” are all the proper prefixes of “Snape”
Proper suffix: All the characters in a string, with one or more cut off the beginning. “agrid”, “grid”, “rid”, “id”, and “d” are all proper suffixes of “Hagrid”
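To make the definition concrete, the sketch below computes a table entry directly from it by brute force: it tries every candidate length, longest first, and checks whether the suffix of that length is also a prefix. The names BorderDemo and longestBorder are made up for this illustration, and the method is deliberately inefficient. It is only meant to show what each entry means, not how KMP computes it.

```java
class BorderDemo {
    // Length of the longest proper prefix of s that is also a proper suffix of s.
    // Brute force straight from the definition; not the efficient KMP construction.
    static int longestBorder(String s) {
        for (int len = s.length() - 1; len > 0; len--) {
            String suffix = s.substring(s.length() - len);
            if (s.startsWith(suffix)) {
                return len;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        String p = "abababca";
        // Print the value for every non-empty prefix (sub-pattern) of p.
        for (int i = 1; i <= p.length(); i++) {
            String sub = p.substring(0, i);
            System.out.println(sub + " -> " + longestBorder(sub));
        }
    }
}
```

For the pattern "abababca" the values for its successive prefixes are 0, 0, 1, 2, 3, 4, 0, 1.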

The implementation is as follows:

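// Builds the Partial Match Table of p.
// table[i] is the length of the longest proper prefix of the first i characters
// of p that is also a proper suffix of those i characters.
public int[] kmpTable(String p) {
    // I first declared an array of length p.length(), one entry per position,
    // but then table[1] = 1 made the loop run forever. With length
    // p.length() + 1, table[0] and table[1] are both 0.
    int[] table = new int[p.length()+1];
    // i: length of the prefix being filled in; cnt: length of the current match
    int i = 2, cnt = 0;
    while (i <= p.length()) {
        if (p.charAt(i - 1) == p.charAt(cnt)) {
            // The match grows by one character.
            table[i++] = ++cnt;
        } else if (cnt > 0) {
            // Fall back to the next shorter candidate and retry the same i.
            cnt = table[cnt];
        } else {
            // No prefix matches a suffix here.
            table[i++] = 0;
        }
    }
    return table;
}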
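The post shows only step 1. As a hedged sketch of step 2, the search below reuses the same table convention (table[i] describes the first i characters of P). The class name KmpSearch, the method name indexOf, and the choice to return the first match index or -1 are assumptions made for this example, not part of the original.

```java
class KmpSearch {
    // Same construction as above, repeated so that this example is self-contained.
    static int[] kmpTable(String p) {
        int[] table = new int[p.length() + 1];
        int i = 2, cnt = 0;
        while (i <= p.length()) {
            if (p.charAt(i - 1) == p.charAt(cnt)) {
                table[i++] = ++cnt;
            } else if (cnt > 0) {
                cnt = table[cnt];
            } else {
                table[i++] = 0;
            }
        }
        return table;
    }

    // Returns the index of the first occurrence of p in s, or -1 if there is none.
    static int indexOf(String s, String p) {
        if (p.isEmpty()) {
            return 0; // assumption: an empty pattern matches at index 0
        }
        int[] table = kmpTable(p);
        int matched = 0; // number of characters of p currently matched
        for (int i = 0; i < s.length(); i++) {
            // On a mismatch, shrink the match using the table; i never moves backwards.
            while (matched > 0 && s.charAt(i) != p.charAt(matched)) {
                matched = table[matched];
            }
            if (s.charAt(i) == p.charAt(matched)) {
                matched++;
            }
            if (matched == p.length()) {
                return i - matched + 1; // start index of the match in s
            }
        }
        return -1;
    }
}
```

Because i only moves forward and each fallback strictly shrinks matched, the scan over S does O(n) work, and building the table adds O(m).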

References

  1. http://jakeboxer.com/blog/2009/12/13/the-knuth-morris-pratt-algorithm-in-my-own-words/
