HDU1686 Oulipo(KMP入门题)

题目传送门

Oulipo

Time Limit: 3000/1000 MS (Java/Others) Memory Limit: 32768/32768 K (Java/Others)
Total Submission(s): 36811 Accepted Submission(s): 13875

Problem Description

The French author Georges Perec (1936–1982) once wrote a book, La disparition, without the letter ‘e’. He was a member of the Oulipo group. A quote from the book:

Tout avait Pair normal, mais tout s’affirmait faux. Tout avait Fair normal, d’abord, puis surgissait l’inhumain, l’affolant. Il aurait voulu savoir où s’articulait l’association qui l’unissait au roman : stir son tapis, assaillant à tout instant son imagination, l’intuition d’un tabou, la vision d’un mal obscur, d’un quoi vacant, d’un non-dit : la vision, l’avision d’un oubli commandant tout, où s’abolissait la raison : tout avait l’air normal mais…

Perec would probably have scored high (or rather, low) in the following contest. People are asked to write a perhaps even meaningful text on some subject with as few occurrences of a given “word” as possible. Our task is to provide the jury with a program that counts these occurrences, in order to obtain a ranking of the competitors. These competitors often write very long texts with nonsense meaning; a sequence of 500,000 consecutive 'T’s is not unusual. And they never use spaces.

So we want to quickly find out how often a word, i.e., a given string, occurs in a text. More formally: given the alphabet {‘A’, ‘B’, ‘C’, …, ‘Z’} and two finite strings over that alphabet, a word W and a text T, count the number of occurrences of W in T. All the consecutive characters of W must exactly match consecutive characters of T. Occurrences may overlap.

Input

The first line of the input file contains a single number: the number of test cases to follow. Each test case has the following format:

One line with the word W, a string over {‘A’, ‘B’, ‘C’, …, ‘Z’}, with 1 ≤ |W| ≤ 10,000 (here |W| denotes the length of the string W).
One line with the text T, a string over {‘A’, ‘B’, ‘C’, …, ‘Z’}, with |W| ≤ |T| ≤ 1,000,000.

Output

For every test case in the input file, the output should contain a single number, on a single line: the number of occurrences of the word W in the text T.

Sample Input
3
BAPC
BAPC
AZA
AZAZAZA
VERDI
AVERDXIVYERDIAN
Sample Output
1
3
0

题意

输入case数,每个case由两个字符串p,s组成,求p在s中出现的次数

思路

典型的KMP板子题,非常适合和我一样初学KMP的人。

要简单的修改一下KMP模板,P的长度为len,在求解next数组的时候,需要把next [len] 也求出来,这样在匹配到模板串的最后一位的时候,让 j = next [ j ] ,跳转到下一个匹配的位置。

举个例子: A Z A

next数组为:-1 0 -1 1

匹配的AZAZAZA的时候,首先匹配到AZAZAZA,然后 j = 3 = s t r l e n ( A Z A ) j=3=strlen(AZA) j=3=strlen(AZA),此是i指向Z,因为已经匹配完成,那么下一次匹配就从next[len]开始,因为如果从0开始的话,显然是错误的。因此 j 跳转到next[len]的位置。

要注意:

  • 使用c++ STL string的时候,代码中不要用int值和string.size()比较大小,string.size()返回的是一个unsigned int,为无符号整数, ( i n t ) − 1 > ( u n s i g n e d i n t ) 1 (int)-1>(unsigned int)1 (int)1>(unsignedint)1,因为编码方式不同, ( u n s i g n e d i n t ) - 1 = 2 32 − 1 = 4294967295 (unsigned int)- 1=2^{32}-1=4294967295 (unsignedint)1=2321=4294967295

另外我本来想的是不改KMP算法,假如在s中找到p的位置为pos,就把s[pos]的字符改掉,重新走一遍KMP…然后就T掉了…初学还是理解的不够透彻啊QAQ

关于KMP算法的讲解请看大牛博客:KMP算法详解-彻底清楚了

之后有时间会写我个人的理解~

代码

#include 
#define MAXN 1000010
using namespace std;
int nex[MAXN];
char a[MAXN];
char b[MAXN];
void getNext(char *p){
     
    int j=0;
    int k=-1;
    nex[0]=-1;
    int len =strlen(p);
    while (j<len){
     
        if (k==-1 || p[j]==p[k]){
     
            if (p[++j]==p[++k]){
     
                nex[j]=nex[k];
            }else {
     
                nex[j]=k;
            }
        }else {
     
            k=nex[k];
        }
    }
}
int kmp(char *s,char *p){
     
    int len=strlen(p);
    int len2=strlen(s);
    int i=0,j=0;
    int ans=0;
    while (i< len2 && j < len){
     
        if (j==-1 || s[i]==p[j]){
     
            i++;
            j++;
        }else {
     
            j=nex[j];
        }
        if (j==len){
     
            ans++;
            j=nex[j];
        }
    }
    return ans;
}

int main (){
     
    int n;
    scanf("%d",&n);
    while (n--){
     
        scanf("%s%s",&a,&b);
        int ans=0;
        getNext(a);
        printf("%d\n",kmp(b,a));
//      原始写法( 会T )
//      int pos=kmp(b,a);
//        while (pos!=-1){
     
//            ans++;
//            b[pos]='?';
//            pos=kmp(b,a);
//        }
//      printf("%d\n",pos);
    }
    return 0;
}
KMP模板

注释掉的是C版本

#include 
#include 
#define MAXN 1000010
using namespace std;
int nex[MAXN];
//char a[MAXN];
//char b[MAXN];
void getNext(string p){
     
    int len=p.size();
//void getNext(char *p){
     
//    int len =strlen(p);
    int j=0;
    int k=-1;
    nex[0]=-1;
    while (j<len-1){
     //这里也可以写 len,看自己的情况,本题就改成了len
        if (k==-1 || p[j]==p[k]){
     
            if (p[++j]==p[++k]){
     
                nex[j]=nex[k];
            }else {
     
                nex[j]=k;
            }
        }else {
     
            k=nex[k];
        }
    }

}
//int kmp(char *s,char *p){
     
//    int len=strlen(p);
//    int len2=strlen(s);
int kmp(string s,string p){
     
int len=p.size();
int len2=s.size();
    int i=0,j=0;
    int ans=0;
    while (i< len2 && j < len){
     
        if (j==-1 || s[i]==p[j]){
     
            i++;
            j++;
        }else {
     
            j=nex[j];
        }
    }
    if (j>=len){
     
        return i-j+1;
    }else {
     
        return -1;
    }
}

int main (){
     
//    std::ios::sync_with_stdio(false);
    int n;
    string a,b;
//    while (scanf("%s%s",&a,&b)!=EOF){
     
    while (cin>>a>>b){
     
        getNext(a);
        cout<<kmp(b,a)<<endl;
//      printf("%d\n",kmp(b,a));
    }
    return 0;
}

你可能感兴趣的:(ACM训练,acm竞赛)