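【Suffix Arrays and Counting】

Suffix arrays are a powerful tool, and counting problems are one area where they shine.

The two problems below show how a suffix array can be used for counting.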

【Example 1】POJ 3415 http://poj.org/problem?id=3415

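The task is to count the common substrings of two strings A and B whose length is >= K, where occurrences at different positions are counted separately. In other words, count the triples (i, j, len) with len >= K such that the substring of A starting at i and the substring of B starting at j, both of length len, are equal.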
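To pin down exactly what is being counted, here is a brute-force reference, meant only as a sketch for checking small cases. The function name `countCommonSubstringsBrute` is made up for illustration, and the code assumes K >= 1. It relies on the observation that a pair of start positions (i, j) whose longest match has length h contributes h - K + 1 triples when h >= K, and nothing otherwise.

```cpp
#include <string>

// Brute-force reference (hypothetical helper, roughly cubic time):
// for every start i in a and j in b, find the longest match h starting there.
// Each length K, K+1, ..., h is one valid triple, so the pair adds h - k + 1.
long long countCommonSubstringsBrute(const std::string& a, const std::string& b, int k) {
    const int na = static_cast<int>(a.size());
    const int nb = static_cast<int>(b.size());
    long long total = 0;
    for (int i = 0; i < na; ++i) {
        for (int j = 0; j < nb; ++j) {
            int h = 0;
            while (i + h < na && j + h < nb && a[i + h] == b[j + h]) ++h;
            if (h >= k) total += h - k + 1;
        }
    }
    return total;
}
```

For A = B = "xx" and K = 1 this returns 5: the pair (0, 0) matches for length 2 and contributes 2, and each of the other three pairs matches for length 1.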

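Approach: the standard solution uses a suffix array. Concatenate the two strings with a separator character that appears in neither, then compute the height array of the combined string. Scanning the suffixes in sorted order, every time we meet a suffix of A we add up its LCP with each preceding suffix of B (an LCP of length h >= K contributes h-K+1), and likewise every suffix of B adds up its LCP with each preceding suffix of A. The question is how to do this summation, since an O(n^2) algorithm is clearly too slow.

The tool for this is the monotonic stack: as the name suggests, its elements are kept in monotonic order. Its best-known use is finding the maximum-area sub-rectangle, and the situation here is similar, because the LCP of any two suffixes is the minimum of height over the interval between them in sorted order. For more detail see this blog: http://www.cnblogs.com/Booble/archive/2010/12/14/1906147.html. One more tip: keep an auxiliary array that stores, for each stack entry, how many suffixes currently have that entry's value as their minimum LCP, and update it together with the stack operations.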
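The stack-plus-counter idea can be shown in isolation. The sketch below sums the minimum of every contiguous interval of an array in linear time; `sumOfRangeMinimums` is an illustrative name, not something from the original post. Each stack entry stores a value together with the number of left endpoints that currently share it as their interval minimum, which plays the role of the auxiliary counting array mentioned above.

```cpp
#include <utility>
#include <vector>

// Sum of min(h[l..r]) over all 0 <= l <= r < h.size(), in O(n).
// Stack entries are (value, count): `count` left endpoints currently have
// `value` as the minimum of their interval ending at the current position.
long long sumOfRangeMinimums(const std::vector<long long>& h) {
    std::vector<std::pair<long long, long long> > st;
    long long running = 0;  // sum of value * count over the whole stack
    long long total = 0;
    for (long long v : h) {
        long long cnt = 1;  // the interval that starts at the current position
        // Entries with value >= v now have minimum v: merge them into one.
        while (!st.empty() && st.back().first >= v) {
            running -= st.back().first * st.back().second;
            cnt += st.back().second;
            st.pop_back();
        }
        st.push_back(std::make_pair(v, cnt));
        running += v * cnt;
        total += running;   // all intervals ending at the current position
    }
    return total;
}
```

For h = {2, 1, 2} the six intervals have minimums 2, 1, 2, 1, 1, 1, so the function returns 8. Every element is pushed once and popped at most once, which is where the linear bound comes from.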

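#define maxn 200100
int wa[maxn],wb[maxn],wv[maxn],wss[maxn];
int r[maxn],sa[maxn];
int cmp(int *r,int a,int b,int l)
{return r[a]==r[b] && r[a+l]==r[b+l];}
// Doubling algorithm: sorts the suffixes of r[0..n-1] (all values < m) into sa.
void da(int *r,int *sa,int n,int m){
     int i,j,p,*x=wa,*y=wb,*t;
     for(i=0;i<m;i++) wss[i]=0;
     for(i=0;i<n;i++) wss[x[i]=r[i]]++;
     for(i=1;i<m;i++) wss[i]+=wss[i-1];
     for(i=n-1;i>=0;i--) sa[--wss[x[i]]]=i;
     for(j=1,p=1;p<n;j*=2,m=p){
         for(p=0,i=n-j;i<n;i++) y[p++]=i;
         for(i=0;i<n;i++) if(sa[i]>=j) y[p++]=sa[i]-j;
         for(i=0;i<n;i++) wv[i]=x[y[i]];
         for(i=0;i<m;i++) wss[i]=0;
         for(i=0;i<n;i++) wss[wv[i]]++;
         for(i=1;i<m;i++) wss[i]+=wss[i-1];
         for(i=n-1;i>=0;i--) sa[--wss[wv[i]]]=y[i];
         for(t=x,x=y,y=t,p=1,x[sa[0]]=0,i=1;i<n;i++)
             x[sa[i]]=cmp(y,sa[i-1],sa[i],j)?p-1:p++;
     }
}
/* The listing is cut off at this point in the source. The missing part
   presumably covers the height[] computation and the start of main():
   reading K and the two strings, building r[], calling da(), and filling
   fb[], which records which string each sorted suffix comes from. */
            fa[i] = (height[i]>=k) ? height[i]-k+1 : 0;  // contribution of an LCP of length height[i]
        }
        LL ans = 0;
        st[0] = -1,fa[len+1] = 0;  // st[0] is a sentinel that stops the merge loop
        // Pass j: suffixes with fb==j are accumulated on the stack,
        // and every suffix with fb!=j adds the current running sum.
        for(j=0;j<=1;j++){
            LL sum = 0;  // sum of st[t]*fc[t] over the stack
            for(int top = 0,i=2;i<=len;i++){
                if(fb[i]!=j)ans+=sum;
                st[++top] = fa[i+1];
                fc[top] = (fb[i]==j);
                sum += (LL)st[top]*(LL)fc[top];
                while(st[top-1]>=st[top]){
                    sum -= (LL)(st[top-1]-st[top])*(LL)fc[top-1];
                    st[top-1] = st[top];
                    fc[top-1] += fc[top];  // merge intervals
                    top--;
                }
            }
        }
        printf("%I64d\n",ans);  // %I64d is the MSVC/MinGW format; use %lld elsewhere
    }
    return 0;
}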
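Since part of the listing above is missing, here is a self-contained sketch of the same idea as a plain function. It is not the author's code: the names `buildSuffixArray`, `buildLcp` and `countCommonSubstrings` are invented for illustration, the suffix array is built with a simple O(n log^2 n) sort-based doubling instead of the radix-sort `da` routine, the LCP array uses Kasai's algorithm, and the separator is assumed to be the character `'\x01'`, which must not occur in either input.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Sort-based prefix doubling, O(n log^2 n). Simpler but slower than da().
std::vector<int> buildSuffixArray(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> sa(n), rnk(n), tmp(n);
    if (n == 0) return sa;
    for (int i = 0; i < n; ++i) { sa[i] = i; rnk[i] = static_cast<unsigned char>(s[i]); }
    for (int len = 1;; len *= 2) {
        auto less = [&](int x, int y) -> bool {
            if (rnk[x] != rnk[y]) return rnk[x] < rnk[y];
            int rx = x + len < n ? rnk[x + len] : -1;
            int ry = y + len < n ? rnk[y + len] : -1;
            return rx < ry;
        };
        std::sort(sa.begin(), sa.end(), less);
        tmp[sa[0]] = 0;
        for (int i = 1; i < n; ++i)
            tmp[sa[i]] = tmp[sa[i - 1]] + (less(sa[i - 1], sa[i]) ? 1 : 0);
        rnk = tmp;
        if (rnk[sa[n - 1]] == n - 1) break;  // all ranks distinct: done
    }
    return sa;
}

// Kasai's algorithm: lcp[i] = LCP of suffixes sa[i-1] and sa[i], lcp[0] = 0.
std::vector<int> buildLcp(const std::string& s, const std::vector<int>& sa) {
    const int n = static_cast<int>(s.size());
    std::vector<int> rnk(n), lcp(n, 0);
    for (int i = 0; i < n; ++i) rnk[sa[i]] = i;
    int h = 0;
    for (int i = 0; i < n; ++i) {
        if (rnk[i] == 0) { h = 0; continue; }
        int j = sa[rnk[i] - 1];
        while (i + h < n && j + h < n && s[i + h] == s[j + h]) ++h;
        lcp[rnk[i]] = h;
        if (h > 0) --h;
    }
    return lcp;
}

// Counts triples (i, j, len), len >= k, with a[i..i+len) == b[j..j+len).
long long countCommonSubstrings(const std::string& a, const std::string& b, int k) {
    const std::string s = a + '\x01' + b;  // assumed separator, absent from a and b
    const int n = static_cast<int>(s.size());
    const int na = static_cast<int>(a.size());
    const std::vector<int> sa = buildSuffixArray(s);
    const std::vector<int> lcp = buildLcp(s, sa);
    auto typeOf = [&](int pos) { return pos < na ? 0 : (pos > na ? 1 : -1); };
    long long total = 0;
    for (int side = 0; side < 2; ++side) {  // side = string whose suffixes go on the stack
        std::vector<std::pair<long long, long long> > st;  // (contribution, count)
        long long running = 0;
        for (int i = 1; i < n; ++i) {
            long long w = std::max(0, lcp[i] - k + 1);
            long long cnt = (typeOf(sa[i - 1]) == side) ? 1 : 0;
            while (!st.empty() && st.back().first >= w) {
                running -= st.back().first * st.back().second;
                cnt += st.back().second;
                st.pop_back();
            }
            st.push_back(std::make_pair(w, cnt));
            running += w * cnt;
            if (typeOf(sa[i]) == 1 - side) total += running;
        }
    }
    return total;
}
```

Calling `countCommonSubstrings("xx", "xx", 1)` gives 5. The two passes mirror the `j` loop above: one pass pairs each suffix of B with the suffixes of A that precede it in sorted order, the other does the reverse, so every (A, B) pair is counted exactly once.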

【Example 2】

E. Prefix Sum

Time Limit : 6000/3000ms (Java/Other)   Memory Limit : 65535/32768K (Java/Other)

Problem Description

A string v is a suffix of a string w if v can be read from some position of w through to the end of w.
For example, bc is a suffix of abc, but ab is not.
A string v is a prefix of a string w if v can be read starting from the beginning of w.
For example, ab is a prefix of abc, but bc and abcd are not.

For two strings s1 and s2, if a string s3 is a prefix of both s1 and s2, we call s3 a common prefix of s1 and s2.
The longest common prefix of two strings is the longest of all their common prefixes.

Your task is:
Given a string, compute the sum of the lengths of the longest common prefixes of every pair of its suffixes.

Input

There are multiple strings, one per line. Each string is no longer than 10^5 characters and contains only A-Z and a-z.

Output

For each string, output the sum.

Sample Input

ABC
ABABA
AABB

Sample Output

0
7
2

Source

SCAUCPC 2012


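This was one of the "hard" problems in the SCAU campus contest. It is not actually hard, and a suffix array handles it completely.

The statement is simple: given a string, compute the length of the longest common prefix of every pair of suffixes and sum these lengths.

Method: suffix array + monotonic stack optimization. Because the LCP of two suffixes is the minimum of height over the interval between them in sorted order, the answer is the sum of the interval minimum of height over all intervals, which a monotonic stack computes in linear time.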

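#define maxn 100100
int wa[maxn],wb[maxn],wv[maxn],wss[maxn];
int r[maxn],sa[maxn];
int cmp(int *r,int a,int b,int l)
{return r[a]==r[b] && r[a+l]==r[b+l];}
// Doubling algorithm: sorts the suffixes of r[0..n-1] (all values < m) into sa.
void da(int *r,int *sa,int n,int m){
     int i,j,p,*x=wa,*y=wb,*t;
     for(i=0;i<m;i++) wss[i]=0;
     for(i=0;i<n;i++) wss[x[i]=r[i]]++;
     for(i=1;i<m;i++) wss[i]+=wss[i-1];
     for(i=n-1;i>=0;i--) sa[--wss[x[i]]]=i;
     for(j=1,p=1;p<n;j*=2,m=p){
         for(p=0,i=n-j;i<n;i++) y[p++]=i;
         for(i=0;i<n;i++) if(sa[i]>=j) y[p++]=sa[i]-j;
         for(i=0;i<n;i++) wv[i]=x[y[i]];
         for(i=0;i<m;i++) wss[i]=0;
         for(i=0;i<n;i++) wss[wv[i]]++;
         for(i=1;i<m;i++) wss[i]+=wss[i-1];
         for(i=n-1;i>=0;i--) sa[--wss[wv[i]]]=y[i];
         for(t=x,x=y,y=t,p=1,x[sa[0]]=0,i=1;i<n;i++)
             x[sa[i]]=cmp(y,sa[i-1],sa[i],j)?p-1:p++;
     }
}
/* The listing is cut off at this point in the source. The missing part
   presumably covers the height[] computation and the start of main():
   reading each string, building r[], calling da(), and pushing each
   height value onto the stack st[] with its count in c[]. */
            while(st[top-1]>=st[top]){  // merge entries that are no smaller than the new one
                st[top-1] = st[top];
                c[top-1] += c[top];
                top--;
            }
            // dp[t] is the sum of c*st over stack entries 1..t
            ans += c[top]*st[top]+dp[top-1];
            dp[top] = dp[top-1]+c[top]*st[top];
        }
        printf("%I64d\n",ans);  // %I64d is the MSVC/MinGW format; use %lld elsewhere
    }
    return 0;
}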
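Because the middle of the listing above is missing, the following self-contained sketch shows one way to put the whole method together. It is an illustration rather than the author's program: `buildSuffixArray`, `buildLcp` and `sumPairwiseLcp` are invented names, and the suffix array is built with a simple O(n log^2 n) sort-based doubling in place of `da`, which should still be fast enough for n up to 10^5 but is an assumption about the judge's limits.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Sort-based prefix doubling, O(n log^2 n).
std::vector<int> buildSuffixArray(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> sa(n), rnk(n), tmp(n);
    if (n == 0) return sa;
    for (int i = 0; i < n; ++i) { sa[i] = i; rnk[i] = static_cast<unsigned char>(s[i]); }
    for (int len = 1;; len *= 2) {
        auto less = [&](int x, int y) -> bool {
            if (rnk[x] != rnk[y]) return rnk[x] < rnk[y];
            int rx = x + len < n ? rnk[x + len] : -1;
            int ry = y + len < n ? rnk[y + len] : -1;
            return rx < ry;
        };
        std::sort(sa.begin(), sa.end(), less);
        tmp[sa[0]] = 0;
        for (int i = 1; i < n; ++i)
            tmp[sa[i]] = tmp[sa[i - 1]] + (less(sa[i - 1], sa[i]) ? 1 : 0);
        rnk = tmp;
        if (rnk[sa[n - 1]] == n - 1) break;  // all ranks distinct: done
    }
    return sa;
}

// Kasai's algorithm: lcp[i] = LCP of suffixes sa[i-1] and sa[i], lcp[0] = 0.
std::vector<int> buildLcp(const std::string& s, const std::vector<int>& sa) {
    const int n = static_cast<int>(s.size());
    std::vector<int> rnk(n), lcp(n, 0);
    for (int i = 0; i < n; ++i) rnk[sa[i]] = i;
    int h = 0;
    for (int i = 0; i < n; ++i) {
        if (rnk[i] == 0) { h = 0; continue; }
        int j = sa[rnk[i] - 1];
        while (i + h < n && j + h < n && s[i + h] == s[j + h]) ++h;
        lcp[rnk[i]] = h;
        if (h > 0) --h;
    }
    return lcp;
}

// Sum of LCP lengths over all unordered pairs of distinct suffixes of s.
// Equals the sum of interval minimums of lcp[1..n-1].
long long sumPairwiseLcp(const std::string& s) {
    const int n = static_cast<int>(s.size());
    const std::vector<int> sa = buildSuffixArray(s);
    const std::vector<int> lcp = buildLcp(s, sa);
    std::vector<std::pair<long long, long long> > st;  // (value, merged count)
    long long running = 0;  // sum of value * count over the stack
    long long total = 0;
    for (int i = 1; i < n; ++i) {
        long long v = lcp[i], cnt = 1;
        while (!st.empty() && st.back().first >= v) {
            running -= st.back().first * st.back().second;
            cnt += st.back().second;
            st.pop_back();
        }
        st.push_back(std::make_pair(v, cnt));
        running += v * cnt;
        total += running;  // pairs whose later suffix is sa[i]
    }
    return total;
}
```

On the three sample inputs, `sumPairwiseLcp` returns 0 for "ABC", 7 for "ABABA" and 2 for "AABB", matching the sample output. For "ABABA" the sorted suffixes are A, ABA, ABABA, BA, BABA with adjacent LCPs 1, 3, 0, 2, and the interval minimums of that sequence sum to 7.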