Searching the String ZOJ - 3228 (Aho-Corasick automaton)

Little jay really hates to deal with string. But moondy likes it very much, and she's so mischievous that she often gives jay some dull problems related to string. And one day, moondy gave jay another problem, poor jay finally broke out and cried, " Who can help me? I'll bg him! "

So what is the problem this time?

First, moondy gave jay a very long string A. Then she gave him a sequence of very short substrings, and asked him to find how many times each substring appeared in string A. What's more, she would indicate whether or not the occurrences found for each substring are allowed to overlap.

At first, jay just read string A from beginning to end to search for all appearances of each given substring. But he soon felt exhausted and couldn't go on any more, so he gave up and broke out this time.

I know you're a good guy and will help jay even without bg, won't you?

 

Input

 

Input consists of multiple cases( <= 20 ) and terminates with end of file.

For each case, the first line contains string A ( length <= 10^5 ). The second line contains an integer N ( N <= 10^5 ), which denotes the number of queries. The next N lines, each with an integer type and a string a ( length <= 6 ), type = 0 denotes substring a is allowed to overlap and type = 1 denotes not. Note that all input characters are lowercase.

There is a blank line between two consecutive cases.

 

Output

 

For each case, output the case number first ( based on 1 , see Samples ).

Then for each query, output an integer in a single line denoting the maximum times you can find the substring under certain rules.

Output an empty line after each case.

 

Sample Input

 

ab
2
0 ab
1 ab

abababac
2
0 aba
1 aba

abcdefghijklmnopqrstuvwxyz
3
0 abc
1 def
1 jmn

 

Sample Output

 

Case 1
1
1

Case 2
3
2

Case 3
1
1
0

 

Hint

 

In Case 2, you can find the first substring starting at positions (indexed from 0) 0, 2 and 4, since occurrences are allowed to overlap. The second substring starts at positions 0 and 4, since occurrences are not allowed to overlap.

For C++ users, kindly use scanf to avoid TLE for huge inputs.

 http://acm.zju.edu.cn/onlinejudge/showProblem.do?problemCode=3228

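Problem summary: given a text string and a list of words, count how many times each word occurs in the text. Queries come in two kinds: type 0 asks for the count when occurrences may overlap, and type 1 asks for the count when they may not.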

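Approach: insert the given words into a trie and build the AC automaton. To handle the two query types, keep a two-dimensional array with just two columns, one for type 0 and one for type 1. Type 0 (overlap allowed) is easy: run the text through the automaton and increment the counter on every match. For type 1 (no overlap), record for each word the position in the text where it last matched. If the position in the text where the current word node ends, minus the position of that node's previous match, is greater than or equal to the length of the word ending at the current character, the two occurrences do not overlap.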
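The non-overlap rule can be checked in isolation on a single pattern. The following is a minimal sketch, not the author's solution: `countMatches` is a hypothetical helper that uses plain substring comparison instead of an automaton, and applies the same test (current end position minus last accepted end position is at least the pattern length).

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper for illustration only (not part of the solution):
// returns {overlapping count, non-overlapping count} of pat in text.
std::pair<int, int> countMatches(const std::string& text, const std::string& pat) {
    assert(!pat.empty());
    const int len = static_cast<int>(pat.size());
    int overlapping = 0, nonOverlapping = 0;
    int lastEnd = -1;  // end index of the last match accepted as non-overlapping
    for (int end = len - 1; end < static_cast<int>(text.size()); ++end) {
        // Does an occurrence of pat end at index `end`?
        if (text.compare(end - len + 1, len, pat) != 0) continue;
        ++overlapping;
        // Accept greedily: the match starts after lastEnd iff the gap is >= len.
        if (end - lastEnd >= len) {
            ++nonOverlapping;
            lastEnd = end;
        }
    }
    return {overlapping, nonOverlapping};
}
```

For the second sample, `countMatches("abababac", "aba")` gives 3 overlapping and 2 non-overlapping occurrences. Initializing `lastEnd` to -1 makes the first match always pass the test, which is presumably why the solution fills `last` with -1.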

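Note: read the problem carefully. The input is large, so use scanf and printf for formatted I/O. Also watch the initialization: I forgot to reset the node counter used when building the trie, which kept causing a stack overflow, and it took me a long time to track down.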
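One way to make this kind of initialization bug harder to hit is to reset the counter and clear each node's row at the moment the node is allocated, rather than clearing the whole table per test case. The sketch below is an assumption-laden alternative, not the code used in this post: `nxt`, `nodeCount`, `newNode`, `resetTrie`, `insertWord` and the small `MAXN` are all hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstring>

const int MAXN = 1000;   // small capacity, for illustration only
int nxt[MAXN][26];       // trie edges; 0 means "no child"
int nodeCount = 0;       // index of the last allocated node (0 is the root)

// Start a new test case: reset the counter and clear only the root's row.
void resetTrie() {
    nodeCount = 0;
    std::memset(nxt[0], 0, sizeof(nxt[0]));
}

// Allocate a node and clear only its own row, so stale edges left over
// from a previous test case can never leak into the new trie.
int newNode() {
    assert(nodeCount + 1 < MAXN);
    ++nodeCount;
    std::memset(nxt[nodeCount], 0, sizeof(nxt[nodeCount]));
    return nodeCount;
}

// Insert a lowercase word and return the node where it ends.
int insertWord(const char* s) {
    int cur = 0;
    for (int i = 0; s[i]; ++i) {
        int c = s[i] - 'a';
        if (!nxt[cur][c]) nxt[cur][c] = newNode();
        cur = nxt[cur][c];
    }
    return cur;
}
```

With this layout the per-case cost is proportional to the number of nodes actually created, whereas a `memset` over a table of about 6e5 rows of 26 ints touches roughly 60 MB for every case.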

/*
@Author: Top_Spirit
@Language: C++
*/
#include <cstdio>
#include <cstring>
#include <cmath>
#include <queue>
#include <utility>
using namespace std ;
typedef unsigned long long ull ;
typedef long long ll ;
typedef pair < int, int > P ;
const int Maxn = 6e5 + 10 ;
const int INF = 0x3f3f3f3f ;
const double PI = acos(-1.0) ;
const ull seed = 133 ;
const int _Max = 1e5 + 10 ;

char str[_Max], s[_Max] ;
int op[_Max], node[_Max], n ;
int Next[Maxn][26], fail[Maxn], pos[Maxn], cnt ;
int ans[Maxn][2], last[Maxn] ;

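// Insert pattern s into the trie and remember the node where query `index` ends.
void Insert(char *s , int index){
    int root = 0 ;
    for (int i = 0; s[i]; i++){
        int id = s[i] - 'a' ;
        if (!Next[root][id]) Next[root][id] = ++cnt ;
        root = Next[root][id] ;
        pos[root] = i + 1 ;   // depth of the node = length of the prefix it represents
    }
    node[index] = root ;
}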

void getFail (){
    queue < int > que ;
    for (int i = 0; i < 26; i++){
        if (Next[0][i]){
            fail[Next[0][i]] = 0 ;
            que.push(Next[0][i]) ;
        }
    }
    while (!que.empty()){
        int tmp = que.front() ;
        que.pop() ;
        for (int i = 0; i < 26; i++){
            if (Next[tmp][i]){
                fail[Next[tmp][i]] = Next[fail[tmp]][i] ;
                que.push(Next[tmp][i]) ;
            }
            else Next[tmp][i] = Next[fail[tmp]][i] ;
        }
    }
}

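// Run the text through the automaton. At every position, walk the fail chain
// and update both counters for each node whose prefix ends at that position.
void query (char *s){
    int root = 0,p = root ;
    for (int i = 0; s[i]; i++){
        int id = s[i] - 'a' ;
        // Redundant once getFail() has filled in the missing edges, but harmless.
        while (p != root && !Next[p][id]) p = fail[p] ;
        p = Next[p][id] ;
        if (!p) p = root ;
        int tmp = p ;
        while (tmp != root){
            ans[tmp][0]++ ;                 // type 0: overlapping count
            // last[tmp] is the end index of the previous accepted match (-1 if none).
            // The new match does not overlap it iff the gap is at least pos[tmp].
            if (i - last[tmp] >= pos[tmp]){
                ans[tmp][1]++ ;             // type 1: non-overlapping count
                last[tmp] = i ;
            }
            tmp = fail[tmp] ;
        }
    }
}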

int main (){
    int Cas = 0 ;
    while (~scanf("%s", str)){
//        cin >> n ;
        scanf("%d", &n) ;
        cnt = 0 ;
        memset(Next, 0, sizeof(Next)) ;
//        memset(fail, 0, sizeof(fail)) ;
//        memset(pos, 0, sizeof(pos)) ;
        for (int i = 0; i < n; i++){
//            cin >> op[i] >> s ;
            scanf("%d%s", &op[i], s) ;
            Insert(s, i) ;
        }
        getFail () ;
        memset(ans, 0, sizeof(ans)) ;
        memset(last, -1, sizeof(last)) ;
        query(str) ;
//        cout << "Case " << ++Cas << endl ;
        printf("Case %d\n", ++Cas) ;
        for (int i = 0; i < n; i++){
//            cout << ans[node[i]][op[i]] << endl ;
            printf("%d\n", ans[node[i]][op[i]]) ;
        }
        puts("") ;
    }
    return 0 ;
}

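For AC automaton problems like this one, I feel you have to think everything through carefully. There is a fair amount of code, and debugging it is a real headache!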
