Oulipo
Time Limit: 3000/1000 MS (Java/Others) Memory Limit: 32768/32768 K (Java/Others)
Total Submission(s): 36811 Accepted Submission(s): 13875
The French author Georges Perec (1936–1982) once wrote a book, La disparition, without the letter ‘e’. He was a member of the Oulipo group. A quote from the book:
Tout avait l’air normal, mais tout s’affirmait faux. Tout avait l’air normal, d’abord, puis surgissait l’inhumain, l’affolant. Il aurait voulu savoir où s’articulait l’association qui l’unissait au roman : sur son tapis, assaillant à tout instant son imagination, l’intuition d’un tabou, la vision d’un mal obscur, d’un quoi vacant, d’un non-dit : la vision, l’avision d’un oubli commandant tout, où s’abolissait la raison : tout avait l’air normal mais…
Perec would probably have scored high (or rather, low) in the following contest. People are asked to write a perhaps even meaningful text on some subject with as few occurrences of a given “word” as possible. Our task is to provide the jury with a program that counts these occurrences, in order to obtain a ranking of the competitors. These competitors often write very long texts with nonsense meaning; a sequence of 500,000 consecutive ‘T’s is not unusual. And they never use spaces.
So we want to quickly find out how often a word, i.e., a given string, occurs in a text. More formally: given the alphabet {‘A’, ‘B’, ‘C’, …, ‘Z’} and two finite strings over that alphabet, a word W and a text T, count the number of occurrences of W in T. All the consecutive characters of W must exactly match consecutive characters of T. Occurrences may overlap.
The first line of the input file contains a single number: the number of test cases to follow. Each test case has the following format:
One line with the word W, a string over {‘A’, ‘B’, ‘C’, …, ‘Z’}, with 1 ≤ |W| ≤ 10,000 (here |W| denotes the length of the string W).
One line with the text T, a string over {‘A’, ‘B’, ‘C’, …, ‘Z’}, with |W| ≤ |T| ≤ 1,000,000.
For every test case in the input file, the output should contain a single number, on a single line: the number of occurrences of the word W in the text T.
Sample Input
3
BAPC
BAPC
AZA
AZAZAZA
VERDI
AVERDXIVYERDIAN
Sample Output
1
3
0
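The input gives the number of test cases. Each case consists of two strings p and s; count how many times p occurs in s.
This is a typical KMP template problem, well suited to people who, like me, are just starting to learn KMP.
The KMP template needs one small change. Let len be the length of P. When computing the next array, next[len] has to be computed as well, so that once the match reaches the last character of the pattern we can set j = next[j] and jump to the next position where a match can continue.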
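For example, take the pattern A Z A.
The next array is: -1 0 -1 1
When matching against AZAZAZA, the leading AZA is matched first, so j = 3 = strlen(AZA) and i now points at the Z that follows. Because a full match has just completed, the next attempt continues from next[len] = 1. Restarting from 0 would clearly be wrong, since it would skip occurrences that overlap the one just found. So j jumps to next[len].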
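One thing to watch out for: comparing an int with string.size(). string.size() returns an unsigned integer type, so in a mixed comparison the int is converted to unsigned first. As a result (int)-1 > (unsigned int)1, because with a 32-bit unsigned int, (unsigned int)-1 = $2^{32}-1 = 4294967295$. This matters here because j can be -1 inside the KMP loops.
Also, my original idea was to leave the KMP algorithm untouched: whenever p is found in s at position pos, overwrite the character s[pos] and run KMP all over again… and that got TLE, since every rerun scans the text from the beginning, which is far too slow when there are many occurrences. As a beginner I clearly had not understood it thoroughly enough QAQ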
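For an explanation of the KMP algorithm itself, see this expert's blog post: 《KMP算法详解-彻底清楚了》 ("KMP algorithm explained in detail, finally clear").
I'll write up my own understanding later when I have time~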
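#include <cstdio>
#include <cstring>
#define MAXN 1000010
using namespace std;

int nex[MAXN];
char a[MAXN];  // the word W (pattern)
char b[MAXN];  // the text T

// Build the optimized next table for p, including nex[len].
void getNext(char *p){
    int j=0;
    int k=-1;
    nex[0]=-1;
    int len=strlen(p);
    // Loop while j<len (not len-1) so that nex[len] is filled in too.
    while (j<len){
        if (k==-1 || p[j]==p[k]){
            // When j reaches len, p[j] is the terminating '\0', which never matches.
            if (p[++j]==p[++k]){
                nex[j]=nex[k];
            }else {
                nex[j]=k;
            }
        }else {
            k=nex[k];
        }
    }
}

// Count the occurrences of p in s; overlapping occurrences are included.
int kmp(char *s,char *p){
    int len=strlen(p);
    int len2=strlen(s);
    int i=0,j=0;
    int ans=0;
    while (i<len2 && j<len){
        if (j==-1 || s[i]==p[j]){
            i++;
            j++;
        }else {
            j=nex[j];
        }
        // Full match: count it, then continue from nex[len] instead of 0.
        if (j==len){
            ans++;
            j=nex[j];
        }
    }
    return ans;
}

int main (){
    int n;
    scanf("%d",&n);
    while (n--){
        scanf("%s%s",a,b);
        getNext(a);
        printf("%d\n",kmp(b,a));
        // Original approach (gets TLE); here kmp() returned the match position, or -1
        // int ans=0;
        // int pos=kmp(b,a);
        // while (pos!=-1){
        //     ans++;
        //     b[pos]='?';
        //     pos=kmp(b,a);
        // }
        // printf("%d\n",ans);
    }
    return 0;
}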
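For reference, here is the unmodified template, which returns the position of the first match. The commented-out lines are the C version.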
#include <iostream>
#include <string>
#define MAXN 1000010
using namespace std;

int nex[MAXN];
//char a[MAXN];
//char b[MAXN];

void getNext(const string &p){
    int len=p.size();
//void getNext(char *p){
//    int len=strlen(p);
    int j=0;
    int k=-1;
    nex[0]=-1;
    while (j<len-1){
        // The bound can also be len, depending on what you need; for this problem I changed it to len.
        if (k==-1 || p[j]==p[k]){
            if (p[++j]==p[++k]){
                nex[j]=nex[k];
            }else {
                nex[j]=k;
            }
        }else {
            k=nex[k];
        }
    }
}

//int kmp(char *s,char *p){
//    int len=strlen(p);
//    int len2=strlen(s);
// Returns the 1-based start position of the first match of p in s, or -1 if there is none.
int kmp(const string &s,const string &p){
    int len=p.size();
    int len2=s.size();
    int i=0,j=0;
    while (i<len2 && j<len){
        if (j==-1 || s[i]==p[j]){
            i++;
            j++;
        }else {
            j=nex[j];
        }
    }
    if (j>=len){
        return i-j+1;
    }else {
        return -1;
    }
}

int main (){
    // std::ios::sync_with_stdio(false);
    string a,b;
    // while (scanf("%s%s",a,b)!=EOF){
    while (cin>>a>>b){
        getNext(a);
        cout<<kmp(b,a)<<endl;
        // printf("%d\n",kmp(b,a));
    }
    return 0;
}