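KMP (Knuth-Morris-Pratt) is an improved string-matching algorithm: it preprocesses the pattern into a prefix table so that the text can be scanned without ever moving backwards.
1. POJ 2406
Given several strings, each of length ≤ 1000000, report for each string the largest number of copies of one substring that can be concatenated to form it.
For example, ababab is made of at most 3 copies of ab.
At first glance this problem has nothing to do with KMP.
In fact, the difference between the string length and the last KMP value, len - prefix[len], is the length of the repeating block.
If the total length is divisible by that block length, the answer is len / (len - prefix[len]); otherwise the answer is 1.
Don't ask me why; practice is the best teacher. (The short version: prefix[len] is the longest prefix that is also a suffix, so shifting the string by len - prefix[len] makes it line up with itself, and that shift is the smallest period of the string.)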
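For readers who would rather check the rule than take it on faith, here is a minimal sketch that compares it with a brute-force search on small strings. The function names (`prefixFunction`, `maxRepeatsKmp`, `maxRepeatsBrute`) and the 0-indexed `std::string` interface are assumptions of this example, not part of the solution that follows.

```cpp
#include <string>
#include <vector>

// Prefix function, 0-indexed: pi[i] is the length of the longest proper
// prefix of s[0..i] that is also a suffix of s[0..i].
std::vector<int> prefixFunction(const std::string& s) {
    std::vector<int> pi(s.size(), 0);
    for (std::size_t i = 1; i < s.size(); ++i) {
        int k = pi[i - 1];
        while (k > 0 && s[i] != s[k]) k = pi[k - 1];
        if (s[i] == s[k]) ++k;
        pi[i] = k;
    }
    return pi;
}

// The KMP rule from the text: candidate block length = len - pi[len - 1].
int maxRepeatsKmp(const std::string& s) {
    if (s.empty()) return 0;
    const int len = static_cast<int>(s.size());
    const int period = len - prefixFunction(s).back();
    return len % period == 0 ? len / period : 1;
}

// Brute force: try every block length p that divides len, smallest first.
int maxRepeatsBrute(const std::string& s) {
    const int len = static_cast<int>(s.size());
    for (int p = 1; p <= len; ++p) {
        if (len % p != 0) continue;
        bool ok = true;
        for (int i = p; i < len && ok; ++i) ok = (s[i] == s[i - p]);
        if (ok) return len / p;
    }
    return 0;  // only reached for the empty string
}
```

Both functions should agree on any input; the brute-force version is only practical for short strings, which is exactly why the prefix table is used for lengths up to 1000000.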
#include <iostream>
#include <cstring>
using namespace std;
const int maxn = 1e6 + 1000;
char s[maxn];      // input string, stored 1-indexed
int prefix[maxn];  // prefix[i]: length of the longest proper border of s[1..i]
int main() {
    ios::sync_with_stdio(false);
    cin.tie(0);
    // Note: reading into a char* with >> was removed in C++20; build as C++17 or earlier.
    while (cin >> (s + 1)) {
        if (s[1] == '.')  // a line starting with '.' ends the input
            break;
        int k = 0;
        int len = strlen(s + 1);
        // Build the prefix table; prefix[1] stays 0 because the array is global.
        for (int i = 2; i <= len; i++) {
            while (s[i] != s[k + 1] && k) k = prefix[k];
            if (s[i] != s[k + 1]) {
                prefix[i] = 0;
            } else {
                k++;
                prefix[i] = k;
            }
        }
        int n = len - prefix[len];  // length of the repeating block
        if (len % n == 0)
            cout << len / n << endl;
        else
            cout << 1 << endl;
    }
    return 0;
}
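2. POJ 2752
Given several strings of lowercase letters (total length ≤ 400000), find for each string the lengths of all substrings that are both a prefix and a suffix of it.
For example, in ababcababababcabab the substrings that are both a prefix and a suffix are ab, abab, ababcabab, and ababcababababcabab.
Taking this first string as the example, we end up with prefix[18] = 9, prefix[9] = 4, prefix[4] = 2, prefix[2] = 0.
So the answer is 2 4 9 18.
Every other input works the same way: start from the full length and keep following prefix[] until it reaches 0.
Don't ask me why; practice is the best teacher. (The short version: any shorter prefix-suffix of the string is also a prefix-suffix of its longest one, so the chain visits all of them.)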
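The chain prefix[18] = 9, prefix[9] = 4, prefix[4] = 2, prefix[2] = 0 can be followed with a plain loop. The sketch below is one way to do that; the function name `borderLengths` and the `std::vector` return type are assumptions made for this example.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Returns, in increasing order, every length L such that the first L
// characters of s equal the last L characters of s (including L = |s|).
std::vector<int> borderLengths(const std::string& s) {
    const int n = static_cast<int>(s.size());
    // pi[i] = length of the longest proper border of the first i characters,
    // 1-indexed to match the prefix[] values quoted in the text.
    std::vector<int> pi(n + 1, 0);
    for (int i = 2, k = 0; i <= n; ++i) {
        while (k > 0 && s[i - 1] != s[k]) k = pi[k];
        if (s[i - 1] == s[k]) ++k;
        pi[i] = k;
    }
    // Walk the chain n -> pi[n] -> pi[pi[n]] -> ... until it hits 0.
    std::vector<int> result;
    for (int len = n; len > 0; len = pi[len]) result.push_back(len);
    std::reverse(result.begin(), result.end());
    return result;
}
```

Collecting the chain in a loop and reversing it gives the lengths in increasing order without recursion, which may matter for inputs such as a long run of one letter, where the chain is as long as the string itself.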
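#include <iostream>
#include <cstring>
using namespace std;
const int maxn = 4e5 + 50;
char s[maxn];      // input string, stored 1-indexed
int prefix[maxn];  // prefix[i]: length of the longest proper border of s[1..i]
// Prints the chain index, prefix[index], ... in increasing order.
// The recursion depth equals the number of lengths printed.
void print(int index) {
    if (index) {
        print(prefix[index]);
        cout << index << " ";
    }
}
int main() {
    ios::sync_with_stdio(false);
    cin.tie(0);
    // Note: reading into a char* with >> was removed in C++20; build as C++17 or earlier.
    while (cin >> (s + 1)) {
        int k = 0;
        int len = strlen(s + 1);
        // Build the prefix table; prefix[1] stays 0 because the array is global.
        for (int i = 2; i <= len; i++) {
            while (s[i] != s[k + 1] && k) k = prefix[k];
            if (s[i] == s[k + 1]) {
                k++;
                prefix[i] = k;
            } else
                prefix[i] = 0;
        }
        print(len);
        cout << endl;
    }
    return 0;
}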
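3. HDU 1711
Given two number sequences a[1], a[2], …, a[N] and b[1], b[2], …, b[M] (1 <= M <= 10000, 1 <= N <= 1000000), find a number K such that a[K] = b[1], a[K + 1] = b[2], …, a[K + M - 1] = b[M]. If several values of K qualify, output the smallest one.
Input
The first line contains a number T, the number of test cases. Each test case has three lines. The first line contains two numbers N and M (1 <= M <= 10000, 1 <= N <= 1000000). The second line contains N integers a[1], a[2], …, a[N]. The third line contains M integers b[1], b[2], …, b[M]. All numbers lie in the range [-1000000, 1000000].
This is a classic string-matching problem, with integers in place of characters.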
#include <iostream>
using namespace std;
const int maxn1 = 1e6 + 1000;
const int maxn2 = 1e4 + 10;
int a[maxn1];       // text, 1-indexed
int b[maxn2];       // pattern, 1-indexed
int prefix[maxn2];  // prefix[i]: length of the longest proper border of b[1..i]
int main() {
    ios::sync_with_stdio(false);
    cin.tie(0);
    int T;
    cin >> T;
    while (T--) {
        int n, m;
        cin >> n >> m;
        for (int i = 1; i <= n; i++) {
            cin >> a[i];
        }
        for (int i = 1; i <= m; i++) {
            cin >> b[i];
        }
        // KMP, phase 1: build the prefix table of the pattern b.
        int k = 0;
        for (int i = 2; i <= m; i++) {
            while (k && b[i] != b[k + 1]) k = prefix[k];
            if (b[i] != b[k + 1]) {
                prefix[i] = 0;
            } else {
                k++;
                prefix[i] = k;
            }
        }
        // KMP, phase 2: scan the text a; k is the number of pattern elements matched so far.
        k = 0;
        bool flag = false;
        int j;
        for (j = 1; j <= n; j++) {
            while (a[j] != b[k + 1] && k) k = prefix[k];
            if (a[j] != b[k + 1]) {
                k = 0;
            } else {
                k++;
            }
            if (k == m) {  // the first full match ends at position j
                flag = true;
                break;
            }
        }
        // Print the 1-based start of the first match, or -1 if there is none.
        cout << (flag ? j - m + 1 : -1) << endl;
    }
    return 0;
}
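The same two phases (build the table on b, then scan a) can also be packaged as a function, which makes them easier to test in isolation. This is only a sketch: the name `firstMatch`, the `std::vector<int>` parameters, and returning -1 for an empty pattern are assumptions of this example rather than anything the problem specifies.

```cpp
#include <vector>

// Returns the smallest 1-based K with a[K..K+M-1] == b[1..M], or -1 if none.
// The vectors are 0-indexed in C++; the comments use the problem's 1-based view.
int firstMatch(const std::vector<int>& a, const std::vector<int>& b) {
    const int n = static_cast<int>(a.size());
    const int m = static_cast<int>(b.size());
    if (m == 0 || m > n) return -1;  // assumption: an empty pattern never matches
    // pi[i] = length of the longest proper border of the first i elements of b.
    std::vector<int> pi(m + 1, 0);
    for (int i = 2, k = 0; i <= m; ++i) {
        while (k > 0 && b[i - 1] != b[k]) k = pi[k];
        if (b[i - 1] == b[k]) ++k;
        pi[i] = k;
    }
    // Scan the text; k counts how many pattern elements are currently matched.
    for (int j = 1, k = 0; j <= n; ++j) {
        while (k > 0 && a[j - 1] != b[k]) k = pi[k];
        if (a[j - 1] == b[k]) ++k;
        if (k == m) return j - m + 1;  // first full match ends at position j
    }
    return -1;
}
```

For instance, calling `firstMatch` with the text 1 2 1 2 3 1 2 3 1 3 2 1 2 and the pattern 1 2 3 1 3 gives 6, because the pattern first appears at positions 6 through 10.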
4. Codeforces Round #578 (Div. 2), Problem E: Compress Words
Amugae has a sentence consisting of n words. He wants to compress this sentence into one word. Amugae doesn't like repetitions, so when he merges two words into one word, he removes the longest prefix of the second word that coincides with a suffix of the first word. For example, he merges "sample" and "please" into "samplease".
Amugae will merge his sentence left to right (i.e. first merge the first two words, then merge the result with the third word and so on). Write a program that prints the compressed word after the merging process ends.
Input
The first line contains an integer n (1 ≤ n ≤ 10^5), the number of words in Amugae's sentence.
The second line contains n words separated by single spaces. Each word is non-empty and consists of uppercase and lowercase English letters and digits ('A', 'B', …, 'Z', 'a', 'b', …, 'z', '0', '1', …, '9'). The total length of the words does not exceed 10^6.
In short: match the longest possible suffix of the first string against a prefix of the second string, and drop the overlapping part.
The merged string is then matched against the third string in the same way, and so on, so that the final string is as short as possible. The total length of all the strings does not exceed 10^6.
It seems this problem can be accepted without KMP in a little over 900 ms, but KMP is presumably what it is meant to test.
It is KMP with a slight twist: the new word is the pattern, and only the last len characters of the merged string need to be scanned.
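#include <iostream>
#include <cstring>
#include <algorithm>
using namespace std;
const int maxn = 1e6 + 1000;
int N = 0;         // current length of the merged string S
int prefix[maxn];  // prefix[i]: length of the longest proper border of s[1..i]
char s[maxn];      // current word, 1-indexed
char S[maxn];      // merged result, 1-indexed
int main() {
    ios::sync_with_stdio(false);
    cin.tie(0);
    int n;
    cin >> n;
    while (n--) {
        int k = 0;
        // Note: reading into a char* with >> was removed in C++20; build as C++17 or earlier.
        cin >> (s + 1);
        int len = strlen(s + 1);
        // Build the prefix table of the current word.
        for (int i = 2; i <= len; i++) {
            while (s[i] != s[k + 1] && k) k = prefix[k];
            if (s[i] != s[k + 1]) {
                prefix[i] = 0;
            } else {
                k++;
                prefix[i] = k;
            }
        }
        // Only the last len characters of S can overlap with the word.
        int j = max(1, N - len + 1);
        k = 0;
        for (; j <= N; j++) {
            while (s[k + 1] != S[j] && k) k = prefix[k];
            if (s[k + 1] != S[j]) {
                k = 0;
            } else {
                k++;
            }
        }
        // k is now the overlap length; append the rest of the word.
        for (k++; k <= len; k++) {
            S[++N] = s[k];
        }
    }
    for (int i = 1; i <= N; i++) cout << S[i];
    return 0;
}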
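A different formulation of the same idea, which is not what the program above does, is to run the prefix function on the word, a separator, and the tail of the merged string glued together; the last prefix value is then the overlap. The sketch below assumes '#' never appears in the input (the statement allows only letters and digits), and the names `overlap` and `compressWords` are made up for this example.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Length of the longest prefix of `word` that is also a suffix of `result`.
int overlap(const std::string& result, const std::string& word) {
    // Only the last |word| characters of `result` can take part in the overlap.
    const std::size_t window = std::min(result.size(), word.size());
    // '#' is assumed not to occur in any word, so no border can cross it.
    const std::string t = word + '#' + result.substr(result.size() - window);
    std::vector<int> pi(t.size(), 0);
    for (std::size_t i = 1; i < t.size(); ++i) {
        int k = pi[i - 1];
        while (k > 0 && t[i] != t[k]) k = pi[k - 1];
        if (t[i] == t[k]) ++k;
        pi[i] = k;
    }
    return pi.back();
}

// Merges the words left to right, dropping each overlap once.
std::string compressWords(const std::vector<std::string>& words) {
    std::string result;
    for (const std::string& w : words) {
        result += w.substr(overlap(result, w));
    }
    return result;
}
```

Because the window is capped at the length of the current word, each merge costs time proportional to that word, so the whole run stays linear in the total input length. The trade-off against the program above is an extra temporary string per word.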