HDU 2222 Keywords Search

Keywords Search

Time Limit: 2000/1000 MS (Java/Others) Memory Limit: 131072/131072 K (Java/Others)

Problem Description

In modern times, search engines such as Google and Baidu have become part of everybody's life.
Wiskey also wants to bring this feature to his image retrieval system.
Every image has a long description. When users type some keywords to find an image, the system matches the keywords against the image descriptions and shows the image that matches the most keywords.
To simplify the problem, you are given the description of one image and some keywords, and you should tell me how many keywords are matched.

Input

The first line contains one integer, the number of test cases that follow.
Each case contains one integer N, the number of keywords, followed by N keywords. (N <= 10000)
Each keyword contains only the characters ‘a’-‘z’ and is no longer than 50.
The last line is the description, which is no longer than 1000000.

Output

Print how many keywords are contained in the description.

Sample Input

1
5
she
he
say
shr
her
yasherhs

Sample Output

3

Problem summary:

Given several pattern strings, determine how many of them occur in the text string.
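
Before reaching for an automaton, it can help to pin the task down with a brute-force reference. The sketch below is only a cross-check for small inputs, not the intended solution: `countMatchedBruteForce` is a hypothetical name, and it assumes that duplicate keywords are counted separately, which is how the solution further down behaves with its `End[now]++`.

```cpp
#include <string>
#include <vector>

// Hypothetical brute-force reference: counts the keywords that occur
// somewhere in the text. Assumption: duplicate keywords count separately.
// Far too slow for the full limits, but handy for checking small cases.
int countMatchedBruteForce(const std::vector<std::string>& keywords,
                           const std::string& text) {
    int matched = 0;
    for (const std::string& k : keywords) {
        if (text.find(k) != std::string::npos) ++matched;
    }
    return matched;
}
```

On the sample, "she", "he" and "her" occur in "yasherhs" while "say" and "shr" do not, which gives the expected answer 3.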

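Approach:

This is a template problem for the Aho-Corasick (AC) automaton. First build a trie that stores the pattern strings, then build the fail tree on top of the trie, and use the fail tree to compute the final answer.
① Build the trie. The trie stores the pattern strings, so insert every pattern one character at a time: each character corresponds to one node, and the chained nodes form the trie. The node of a pattern's last character is marked to show that a keyword ends there.
② Building the fail tree is probably the hardest part of the AC automaton. A node's fail pointer links it to the node for the longest proper suffix of its string that is also a prefix in the trie. Take the two paths root-s-h-e and root-h-e-r: "he" is shared, so the e node of "she" is linked to the e node of "her". The count does not go up at that point, because the end mark of "her" sits on r; it would only if "he" itself were a keyword. These links effectively avoid re-traversing the trie from the root (see the code for the details).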
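
To make the fail pointers concrete, here is a small illustrative sketch that is separate from the solution below. It builds a trie with `std::map` children, computes the fail pointers by BFS using the textbook "walk up the fail chain" rule, and reports which string the fail pointer of a given prefix leads to. The helper name `failTargetOf` and the `label` array are assumptions made for this illustration; the solution itself uses fixed arrays and completes the missing transitions instead of walking the chain.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <string>
#include <vector>

// Hypothetical helper: returns the string spelled by the node that the
// fail pointer of `prefix` leads to ("" means the root).
std::string failTargetOf(const std::vector<std::string>& words,
                         const std::string& prefix) {
    std::vector<std::map<char, int>> child(1);  // node 0 is the root
    std::vector<std::string> label(1, "");      // string spelled by each node
    for (const std::string& w : words) {        // step 1: build the trie
        int now = 0;
        for (char c : w) {
            if (!child[now].count(c)) {
                int created = (int)child.size();
                child.emplace_back();
                label.push_back(label[now] + c);
                child[now][c] = created;
            }
            now = child[now][c];
        }
    }
    std::vector<int> fail(child.size(), 0);     // step 2: BFS over the trie
    std::queue<int> q;
    for (auto& kv : child[0]) q.push(kv.second);  // depth-1 nodes fail to root
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        for (auto& kv : child[u]) {
            char c = kv.first;
            int v = kv.second;
            int f = fail[u];
            // Walk up the fail chain until some node has a child on c.
            while (f != 0 && !child[f].count(c)) f = fail[f];
            if (child[f].count(c)) f = child[f][c];
            fail[v] = f;
            q.push(v);
        }
    }
    int now = 0;                                // step 3: locate the prefix
    for (char c : prefix) {
        assert(child[now].count(c));            // prefix must be a trie path
        now = child[now][c];
    }
    return label[fail[now]];
}
```

With the sample keywords she, he, say, shr and her, `failTargetOf(words, "she")` returns "he" and `failTargetOf(words, "sh")` returns "h", while "her" falls back to the root because none of its proper suffixes starts a keyword.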

#include <cstdio>
#include <cstring>
#include <queue>
using namespace std;
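const int maxn = 500010;  // at most 10000 keywords * 50 characters, plus the root
struct Trie {
    int root;
    int L;                // number of nodes allocated so far
    int Next[maxn][26];   // child table; -1 means "no child" until build() fills it
    int fail[maxn];       // fail pointer of each node
    int End[maxn];        // number of keywords that end at each node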
    int newnode() {
        for (int i = 0; i < 26; i++) Next[L][i] = -1;
        End[L++] = 0;
        return L - 1;
    }
    void init() {
        L = 0;
        root = newnode();
    }
    void Insert(char *buf) {
        int len = strlen(buf), now = root;
        for (int i = 0; i < len; i++) {
            if (Next[now][(int)(buf[i] - 'a')] == -1) Next[now][(int)(buf[i] - 'a')] = newnode();
            now = Next[now][(int)(buf[i] - 'a')];
        }
        End[now]++;
    }
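    // Build the fail pointers by BFS and fill in the missing transitions,
    // so that Next[now][c] is always a valid state during the query.
    void build() {
        queue<int> q;
        fail[root] = root;
        for (int i = 0; i < 26; i++) {
            if (Next[root][i] == -1) Next[root][i] = root;
            else {
                fail[Next[root][i]] = root;  // depth-1 nodes fail to the root
                q.push(Next[root][i]);
            }
        }
        while (!q.empty()) {
            int now = q.front();
            q.pop();
            for (int i = 0; i < 26; i++) {
                // Missing child: reuse the transition of the fail state.
                if (Next[now][i] == -1) Next[now][i] = Next[fail[now]][i];
                else {
                    // Existing child: its fail state is where fail[now] goes on i.
                    fail[Next[now][i]] = Next[fail[now]][i];
                    q.push(Next[now][i]);
                }
            }
        }
    }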
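    // Count the keywords that occur in buf; each inserted keyword is counted at most once.
    int query(char *buf) {
        int now = root, len = strlen(buf), res = 0;
        for (int i = 0; i < len; i++) {
            now = Next[now][(int)(buf[i] - 'a')];
            // Walk the fail chain to collect every keyword that ends at position i.
            int temp = now;
            while (temp != root) {
                res += End[temp];
                End[temp] = 0;  // clear so the same keyword is not counted again
                temp = fail[temp];
            }
        }
        return res;
    }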
};
Trie ac;
char buf[maxn * 2];
int main() {
    int n, t;
    scanf("%d", &t);
    while (t--) {
        scanf("%d", &n);
        ac.init();
        for (int i = 0; i < n; i++) {
            scanf("%s", buf);
            ac.Insert(buf);
        }
        ac.build();
        scanf("%s", buf);
        printf("%d\n", ac.query(buf));
    }
    return 0;
}
