2019 ICPC World Finals G. First of Her Name (building a SAM / AC automaton on a trie)

In the Royal Family, names are very important! As the Royal Historian you have been charged with analyzing the patterns in the names of the Royal Ladies in the realm.

There have been n Royal Ladies, for convenience numbered from 1 to n. The name of each Lady is an uppercase letter concatenated with the name of her mother. The exception is the Lady numbered 1, the founder of the Royal Family, whose name is just a single uppercase letter.

For example, ENERYS could be the mother of AENERYS (as the name AENERYS consists of the single uppercase letter ‘A’ concatenated with ENERYS, which is her mother’s name). Similarly, AENERYS could be the mother of DAENERYS and YAENERYS.

You are given the description of all the Royal Ladies. Your task is to determine, for certain interesting strings s, the number of Royal Ladies for whom s is a prefix of their name.

For example, consider Sample Input 1 below, with a Royal Line that goes straight from the founder S to AENERYS (through YS, RYS, ERYS, NERYS and ENERYS), with each Lady having exactly one daughter. Then AENERYS has two daughters—DAENERYS and YAENERYS, with the latter having one daughter, RYAENERYS.

In such a family, RY is a prefix of the names of two ladies: RYS and RYAENERYS. E is a prefix of the names of ERYS and ENERYS. N is a prefix only of NERYS’s name, while S is a prefix only of the name of the founder, S. AY is not a prefix of any Royal Lady’s name.

Input
The first line of input contains two integers n and k, where n (1 ≤ n ≤ 10^6) is the total number of Royal Ladies and k (1 ≤ k ≤ 10^6) is the number of query strings.

Then follow n lines describing the Royal Ladies. The i-th of these lines describes the Royal Lady numbered i, and contains an uppercase letter c_i (‘A’–‘Z’) and an integer p_i, where c_i is the first letter of the name of Lady i, and p_i (p_1 = 0 and 1 ≤ p_i < i for i > 1) is the number of her mother (or 0, in the case of the First Lady). All the names are unique.

The remaining k lines each contain one nonempty query string, consisting only of uppercase letters. The sum of the lengths of the query strings is at most 10^6.

Output
Output k lines, with the i-th line containing the number of Royal Ladies who have the i-th query string as a prefix of their name.

Sample Input 1
10 5
S 0
Y 1
R 2
E 3
N 4
E 5
A 6
D 7
Y 7
R 9
RY
E
N
S
AY

Sample Output 1
2
2
1
1
0

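In short: the input is a trie built over the reversed names, plus k queries. Each query gives a string and asks how many of the names stored in the trie have that string as a prefix.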
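For checking small cases by hand, a brute-force reference can be useful. The sketch below is not part of either solution in this post: the function name bruteForce and its vector-based, 1-indexed interface are made up for illustration, and it rebuilds every name explicitly, so it is only meant for tiny inputs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post).
// letter[i] and mother[i] are 1-indexed; index 0 is unused padding.
std::vector<int> bruteForce(const std::vector<char>& letter,
                            const std::vector<int>& mother,
                            const std::vector<std::string>& queries) {
    int n = (int)letter.size() - 1;
    std::vector<std::string> name(n + 1);  // name[0] stays empty
    for (int i = 1; i <= n; ++i) {
        assert(mother[i] < i);  // a mother is always numbered before her daughter
        name[i] = std::string(1, letter[i]) + name[mother[i]];
    }
    std::vector<int> answers;
    for (const std::string& s : queries) {
        int count = 0;
        for (int i = 1; i <= n; ++i) {
            // compare() on the first s.size() characters is a prefix test
            if (name[i].compare(0, s.size(), s) == 0) ++count;
        }
        answers.push_back(count);
    }
    return answers;
}
```

On Sample Input 1, with the queries RY, E, N, S and AY, this returns 2, 2, 1, 1, 0.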

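This is a template-style problem: build a generalized suffix automaton offline from a trie. See the 2015 Chinese National Training Team paper 《后缀自动机在字典树上的拓展》 (extending the suffix automaton to tries) by 刘研绎.

The template problem is BZOJ 3926 众神眷恋的幻想乡 (by 陈立杰).

After building the trie offline, traverse it by BFS. For each trie node, record the SAM state that acts as its last (an ordinary SAM has a single last; here we need a last array with one entry per trie node). Each trie edge extends the automaton from the parent's last, so every branch of the trie grows along its own branch of the SAM.

The online construction resets last to 1 (the root) before each new string, inserts the string from the beginning, and reuses states that already exist instead of creating duplicates.

That is effectively a DFS over the trie. The difference between DFS and BFS is that DFS runs all the way down one path first, pushing one very long string into the automaton in a single go. This makes the SAM's parent tree more complicated, and adding new states triggers many more link rebuilds, whereas with BFS the cost is spread more evenly.

Once the SAM is built from the trie, the rest is routine: sort the states by step with a radix (counting) sort, accumulate the right-set sizes, then run each query string through the SAM in reverse (because the trie in the input stores the names reversed). The right-set size of the state where the walk ends is the answer to the query; if a transition is missing along the way, the answer is 0.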
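The steps above can be sketched in a compact, vector-based form. This is an illustration rather than the submitted solution further down: TrieSam, extend, build and count are made-up names, std::sort stands in for the counting sort to keep the sketch short, and it assumes sibling edges in the trie carry distinct letters (which holds here because all names are unique).

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <numeric>
#include <queue>
#include <string>
#include <vector>

// Illustrative sketch; TrieSam and its members are made-up names.
struct TrieSam {
    struct State {
        std::array<int, 26> next{};  // 0 means "no transition"
        int link = 0, len = 0;
        long long occ = 0;           // right-set size
    };
    std::vector<State> st;           // st[0] is a null state, st[1] is the root

    TrieSam() : st(2) {}

    // Standard SAM extension, except that the caller supplies `last`.
    int extend(int last, int c) {
        int cur = (int)st.size();
        st.emplace_back();
        st[cur].len = st[last].len + 1;
        st[cur].occ = 1;
        int p = last;
        while (p && !st[p].next[c]) { st[p].next[c] = cur; p = st[p].link; }
        if (!p) { st[cur].link = 1; return cur; }
        int q = st[p].next[c];
        if (st[p].len + 1 == st[q].len) { st[cur].link = q; return cur; }
        State copy = st[q];          // the clone keeps q's transitions and link
        copy.len = st[p].len + 1;
        copy.occ = 0;
        int clone = (int)st.size();
        st.push_back(copy);
        st[q].link = st[cur].link = clone;
        while (p && st[p].next[c] == q) { st[p].next[c] = clone; p = st[p].link; }
        return cur;
    }

    // children[v] lists the trie children of v (0 is the virtual root);
    // letter[v] is the letter on the edge leading into v.
    void build(const std::vector<std::vector<int>>& children,
               const std::vector<char>& letter) {
        assert(children.size() == letter.size());
        std::vector<int> last(children.size(), 1);
        std::queue<int> bfs;
        bfs.push(0);
        while (!bfs.empty()) {
            int v = bfs.front();
            bfs.pop();
            for (int child : children[v]) {
                last[child] = extend(last[v], letter[child] - 'A');
                bfs.push(child);
            }
        }
        // Longest states first, so each state is complete before its
        // suffix-link parent receives its count.
        std::vector<int> order(st.size() - 2);
        std::iota(order.begin(), order.end(), 2);
        std::sort(order.begin(), order.end(),
                  [&](int a, int b) { return st[a].len > st[b].len; });
        for (int v : order) st[st[v].link].occ += st[v].occ;
    }

    // Walks the query backwards; a missing transition means no match.
    long long count(const std::string& query) const {
        int p = 1;
        for (auto it = query.rbegin(); it != query.rend(); ++it) {
            p = st[p].next[*it - 'A'];
            if (!p) return 0;
        }
        return st[p].occ;
    }
};
```

For Sample Input 1, filling children from the mother indices and calling count on RY, E, N, S and AY gives 2, 2, 1, 1 and 0.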

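Update, April 17:

Yesterday I finished the AC automaton solution as well.
I realized this was actually the first time I had written an AC automaton that relies on the fail tree... until now I had always just walked the trie graph.

I thought about it for about 10 minutes, jumped straight into coding, and then got stuck for an entire day.

The idea is to build an AC automaton offline from all the query strings, reversed, and to record the query's id at the node where each query string ends.

Then DFS the whole trie (which amounts to enumerating every string the trie represents) while walking the AC automaton in lockstep. Because the automaton was already turned into a trie graph while the fail edges were being built, it is enough to keep following next. Every automaton node we pass through gets its count incremented by 1, and so do all of its ancestors in the fail tree (if the current node is matched, then its fail-tree parent, which is the longest proper suffix of the current string that is also a prefix of some pattern, is matched as well).

While typing out the automaton template, it suddenly occurred to me that walking up from every node might (read: definitely would) blow up the complexity and earn me a TLE. By analogy with the suffix automaton, where each node is counted on its own first and the counts are propagated in topological order at the end, a fix came to mind and I wrote this: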

for (int i = sz; i > root; --i) {
    cnt[fail[i]] += cnt[i];
}

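To my surprise it passed the sample. I submitted and got WA, passing only 4 test cases.
After some thought I realized that if query strings repeat, the recorded ids overwrite each other and corrupt the answers, so I changed id from an array to a vector. Submitted again: test 5 passed, WA on test 8.

Up to this point things had gone fairly smoothly. After that I was stuck for a whole day.

Since my approach felt sound, I could not find the cause of the WA. Eventually I gave up and wrote the brute-force version mentioned earlier, where every node walks up and updates its ancestors directly. To my surprise it got through test 9 and hit the expected TLE on test 10, which told me the bug had to be in the propagation step.

So I started searching Baidu for similar AC automaton problems and came across an expert's blog post, bzoj3172 [Tjoi2013]单词(AC自动机+fail树).
And I realized: oh, this is, yet again, routine stuff.

I had confused the trie's topological order (iterating node ids from last to first is enough for that) with the fail tree's topological order. In that post, the author replaces the STL queue with a hand-written array queue, moving head and tail pointers to simulate push and pop. When the BFS finishes, the array already holds the nodes in a topological order of the fail tree, for free...

You can do that?

So I tweaked my second version of the code and got AC... it really was just that one step.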
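The ordering issue can be reproduced on a tiny case. The sketch below is an illustration with made-up names (FailTreeCounter, insert, build, count), and it matches patterns against a plain text rather than against the trie from the problem. With the patterns ABC, BC, C inserted in that order, the node for BC gets a smaller id than the node for C but a larger id than the node for ABC, so propagating counts by descending node id pushes BC's count into C before ABC has reached BC. Propagating in reverse BFS order avoids this.

```cpp
#include <array>
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Illustrative sketch; FailTreeCounter and its members are made-up names.
struct FailTreeCounter {
    std::vector<std::array<int, 26>> next;  // trie edges, later the trie graph
    std::vector<int> fail, bfsOrder, endNode;
    std::vector<long long> cnt;

    FailTreeCounter() { newNode(); }  // node 0 is the root

    int newNode() {
        std::array<int, 26> empty;
        empty.fill(-1);
        next.push_back(empty);
        fail.push_back(0);
        cnt.push_back(0);
        return (int)next.size() - 1;
    }

    void insert(const std::string& pattern) {
        int p = 0;
        for (char ch : pattern) {
            assert(ch >= 'A' && ch <= 'Z');
            int c = ch - 'A';
            if (next[p][c] == -1) {
                int created = newNode();  // may reallocate `next`, so assign after
                next[p][c] = created;
            }
            p = next[p][c];
        }
        endNode.push_back(p);
    }

    // BFS that fills fail links, completes the trie graph, and records
    // the visiting order of all non-root nodes.
    void build() {
        std::queue<int> q;
        for (int c = 0; c < 26; ++c) {
            int v = next[0][c];
            if (v == -1) next[0][c] = 0;
            else { fail[v] = 0; q.push(v); }
        }
        while (!q.empty()) {
            int p = q.front();
            q.pop();
            bfsOrder.push_back(p);
            for (int c = 0; c < 26; ++c) {
                int v = next[p][c];
                if (v == -1) next[p][c] = next[fail[p]][c];
                else { fail[v] = next[fail[p]][c]; q.push(v); }
            }
        }
    }

    // Occurrences of each inserted pattern in text. Call once, after build().
    std::vector<long long> count(const std::string& text) {
        int p = 0;
        for (char ch : text) { p = next[p][ch - 'A']; ++cnt[p]; }
        // Reverse BFS order: deeper nodes first, so a node's count is final
        // before it is added to its fail-tree parent.
        for (auto it = bfsOrder.rbegin(); it != bfsOrder.rend(); ++it)
            cnt[fail[*it]] += cnt[*it];
        std::vector<long long> result;
        for (int node : endNode) result.push_back(cnt[node]);
        return result;
    }
};
```

With ABC, BC and C inserted and the text ABC, count reports 1 for each pattern; looping over node ids from high to low instead would leave C at 0.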

On this problem the AC automaton runs in less than half the time of the SAM. The SAM's constant factor is scary.

sam:

#include <bits/stdc++.h>

using namespace std;
typedef long long ll;
const int maxn = 1e6 + 5;

int n, k, fa;
char name[maxn];
string q;
vector<int> trie[maxn];

struct Sam {
    int next[maxn << 1][26];
    int link[maxn << 1], step[maxn << 1];
    ll endpos[maxn << 1];
    int last[maxn];
    int a[maxn << 1], b[maxn << 1]; // a is indexed up to sz in build(), so it needs 2 * maxn entries too
    int sz,root;

    int add(int p, int c) {
//        int p = last;
        int np = ++sz;
//        last = np;

        endpos[np] = 1;
        step[np] = step[p] + 1;
        while (!next[p][c] && p) {
            next[p][c] = np;
            p = link[p];
        }

        if (p == 0) {
            link[np] = root;
        } else {
            int q = next[p][c];
            if (step[p] + 1 == step[q]) {
                link[np] = q;
            } else {
                int nq = ++sz;
                memcpy(next[nq], next[q], sizeof(next[q]));
                step[nq] = step[p] + 1;
                link[nq] = link[q];
                link[q] = link[np] = nq;
                while (next[p][c] == q && p) {
                    next[p][c] = nq;
                    p = link[p];
                }
            }
        }
        return np;
    }


    void init() {
        // If the automaton is built more than once, add memset calls here to reset the arrays.
        root = sz = 1;
    }

    void build() {
        init();

        queue<int> q;
        q.push(0), last[0] = root;
        while (!q.empty()) {
            int p = q.front();
            q.pop();

            for (int i = 0; i < trie[p].size(); ++i) {
                last[trie[p][i]] = add(last[p], name[trie[p][i]] - 'A');
                q.push(trie[p][i]);
            }
        }

        for (int i = 1; i <= sz; i++) {
            a[step[i]]++;
        }
        for (int i = 1; i <= sz; i++) {
            a[i] += a[i - 1];
        }
        for (int i = 1; i <= sz; i++) {
            b[a[step[i]]--] = i;
        }
        for (int i = sz; i > root; --i) {
            int e = b[i];
            endpos[link[e]] += endpos[e];
        }
    }

    void solve() {
        int p = root;
        for (int i = q.length() - 1, c; i >= 0; --i) {
            c = q[i] - 'A';
            if (!next[p][c]) {
                p = 0;
                break;
            }
            p = next[p][c];
        }
        cout << endpos[p] << '\n';
    }

} sam;


int main() {
    ios::sync_with_stdio(0);
    cin >> n >> k;

    //AENERYS
    //SYRENEAYR
    for (int i = 1; i <= n; ++i) {
        cin >> name[i] >> fa;
        trie[fa].push_back(i);
    }

    sam.build();
    while (k--) {
        cin >> q;
        sam.solve();
    }
    return 0;
}

AC automaton:

#include <bits/stdc++.h>

using namespace std;
typedef long long ll;
const int maxn = 1e6 + 5;

int n, k, fa;
char name[maxn];
string q;
vector<int> trie[maxn];

struct AC_Automaton {
    int next[maxn][26];
    int fail[maxn];
    vector<int> id[maxn];
    ll cnt[maxn];
    ll ans[maxn];
    int que[maxn], qt, qh;
    int sz, root;

    void init() {
        qt = 0, qh = 1;
        root = sz = 1;
        memset(next, -1, sizeof(next));
    }

    void add(int x) {
        int p = root, c;
        for (int i = q.length() - 1; i >= 0; --i) {
            c = q[i] - 'A';
            if (next[p][c] == -1) {
                next[p][c] = ++sz;
            }
            p = next[p][c];
        }
        id[p].push_back(x);
    }

    void getFail() {
        for (int i = 0; i < 26; i++) {
            if (~next[root][i]) {
                fail[next[root][i]] = root;
                que[++qt] = next[root][i];
            } else {
                next[root][i] = root;
            }
        }
        while (qh <= qt) {
            int p = que[qh++];
            for (int i = 0; i < 26; i++) {
                if (~next[p][i]) {
                    fail[next[p][i]] = next[fail[p]][i];
                    que[++qt] = next[p][i];
                } else {
                    next[p][i] = next[fail[p]][i];
                }
            }
        }
    }

    void dfs(int x, int p) {
        ++cnt[p];
        for (int i = 0; i < trie[x].size(); ++i) {
            dfs(trie[x][i], next[p][name[trie[x][i]] - 'A']);
        }
    }

    void solve() {
        getFail();
        dfs(0, root);

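        // que[1..qt] holds every non-root node in BFS order, so walking it
        // backwards visits each node before its fail-tree parent.
        for (int i = qt; i >= 1; --i) {
            int e = que[i];
            cnt[fail[e]] += cnt[e];
        }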

        for (int i = root; i <= sz; ++i) {
            for (int j = 0; j < id[i].size(); ++j) {
                ans[id[i][j]] = cnt[i];
            }
        }
        for (int i = 1; i <= k; ++i) {
            cout << ans[i] << '\n';
        }
    }

} ac;

int main() {
    ios::sync_with_stdio(0);
    cin >> n >> k;

    //AENERYS
    //SYRENEAYR
    for (int i = 1; i <= n; ++i) {
        cin >> name[i] >> fa;
        trie[fa].push_back(i);
    }

    ac.init();
    for (int i = 1; i <= k; ++i) {
        cin >> q;
        ac.add(i);
    }
    ac.solve();
    return 0;
}
