HDU 3336 Count the string: KMP (matching a string's prefixes against itself) + DP

Count the string

Time Limit: 2000/1000 MS (Java/Others)    Memory Limit: 32768/32768 K (Java/Others)
Total Submission(s): 14842    Accepted Submission(s): 6771

 

Problem Description

It is well known that AekdyCoin is good at string problems as well as number theory problems. When given a string s, we can write down all the non-empty prefixes of this string. For example:
s: "abab"
The prefixes are: "a", "ab", "aba", "abab"
For each prefix, we can count the times it matches in s. So we can see that prefix "a" matches twice, "ab" matches twice too, "aba" matches once, and "abab" matches once. Now you are asked to calculate the sum of the match times for all the prefixes. For "abab", it is 2 + 2 + 1 + 1 = 6.
The answer may be very large, so output the answer mod 10007.

 

 

Input

The first line is a single integer T, indicating the number of test cases.
For each case, the first line is an integer n (1 <= n <= 200000), which is the length of string s. A line follows giving the string s. The characters in the strings are all lower-case letters.

 

 

Output

For each case, output only one number: the sum of the match times for all the prefixes of s mod 10007.

 

 

Sample Input

1
4
abab

 

 

Sample Output

6

 

 

Author

foreverlin@HNU

 

 

Source

HDOJ Monthly Contest – 2010.03.06

 

 

Recommend

lcy

 

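Algorithm Analysis

Problem:

Given a string, take every non-empty prefix, count how many times it occurs in the string itself, and output the sum of those counts.

For example, "abab" has the prefixes "a", "ab", "aba", and "abab". Matching these 4 prefixes against "abab" gives 2, 2, 1, and 1 occurrences respectively, so the total is 2 + 2 + 1 + 1 = 6.

Analysis:

The first idea is to match each prefix against the string separately, but with n up to 200000 that will certainly time out. We need a different approach: use a property of the KMP next array and combine it with a recurrence.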
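The brute-force idea mentioned above can still be useful as a reference for checking a faster solution on small inputs. The following is a minimal sketch; the function name bruteForceCount is an assumption made for illustration and does not come from the original post.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original post): for every non-empty
// prefix of s, count its occurrences in s and return the sum mod 10007.
// Cubic time in the worst case, so it is only meant for small strings.
int bruteForceCount(const std::string& s) {
    const int MOD = 10007;
    const std::size_t n = s.size();
    int total = 0;
    for (std::size_t len = 1; len <= n; ++len) {            // prefix s[0..len)
        for (std::size_t start = 0; start + len <= n; ++start) {
            // compare(...) == 0 means s[start..start+len) equals the prefix
            if (s.compare(start, len, s, 0, len) == 0) {
                total = (total + 1) % MOD;
            }
        }
    }
    return total;
}
```

For the sample string, bruteForceCount("abab") gives 6, matching the expected output.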

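By the property of the next array, next[i] stores the length k of the longest proper prefix of the first i characters s[1…i] (T[0…i-1] in the 0-indexed code below) that is also a suffix of them. So think of it this way: suppose we have already computed how many prefixes of the whole string match at the end of that longest prefix s[1…k]. The suffix s[i-k+1…i] is the same string, so isn't that exactly the number of prefixes that match at the end of this suffix? (Do not simply add everything up; read carefully which sum the problem asks for.)

Just as when deriving the next array, start from 1 and let dp[i] be the number of prefixes of the string that occur ending at position i, that is, as suffixes of s[1…i]:

dp[i] = dp[next[i]] + 1, with dp[0] = 0: the prefixes shorter than i that end at position i, plus the prefix of length i itself.

dp[next[i]] counts the prefixes that match at the end of the longest prefix that is also a suffix; see the example below.

You might ask whether the strings in between have any effect, and why we add 1 rather than something else. The reason is that if another prefix longer than next[i] and shorter than i also ended at position i, then next[i] would have to be larger, which is a contradiction.

The next array is by definition the longest match between the head and the tail of s[1…i], so no other repeated prefix ends at position i beyond it, except the string of length i itself. The answer is the sum of dp[i] for i = 1…n, taken mod 10007.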
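To make the "+1" argument concrete, the sketch below walks the chain i, next[i], next[next[i]], … and collects the lengths of all prefixes that end at position i. The length of that chain is what dp[i] counts. The helper names buildNext and prefixesEndingAt are assumptions for illustration, and the table is the plain (unoptimized) next array with next[0] = -1.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Plain KMP table: nxt[i] = length of the longest proper prefix of the
// first i characters that is also their suffix; nxt[0] = -1 is a sentinel.
std::vector<int> buildNext(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> nxt(n + 1, 0);
    nxt[0] = -1;
    int j = 0, k = -1;
    while (j < n) {
        if (k == -1 || s[j] == s[k]) nxt[++j] = ++k;
        else k = nxt[k];
    }
    return nxt;
}

// Lengths of all prefixes of s that end at position i (1-based),
// longest first. Hypothetical helper, not from the original post.
std::vector<int> prefixesEndingAt(const std::string& s, int i) {
    assert(i >= 1 && i <= static_cast<int>(s.size()));
    const std::vector<int> nxt = buildNext(s);
    std::vector<int> lens;
    for (int len = i; len > 0; len = nxt[len]) lens.push_back(len);
    return lens;
}
```

For "ababab" and i = 5 the chain is 5, 3, 1, that is, the prefixes "ababa", "aba", and "a" all end at position 5.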

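An example, for the string "ababab":

Prefix    Prefixes ending at its last character    Count    nxt
a         a                                        1        0
ab        ab                                       1        0
aba       a (the a at position 3), aba             2        1
abab      ab, abab                                 2        2
ababa     a, aba, ababa                            3        3
ababab    ab, abab, ababab                         3        4

ans = 1 + 1 + 2 + 2 + 3 + 3 = 12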
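The counts in this example can be reproduced with a short sketch of the recurrence dp[i] = dp[next[i]] + 1. The function names endCounts and countPrefixMatches are assumptions made for illustration; the table built here is the plain next array with next[0] = -1 and dp[0] = 0.

```cpp
#include <cassert>
#include <string>
#include <vector>

// dp[i] = number of prefixes of s that end at position i (1-based);
// dp[0] = 0. Hypothetical helper, not from the original post.
std::vector<int> endCounts(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> nxt(n + 1, 0), dp(n + 1, 0);
    nxt[0] = -1;
    for (int j = 0, k = -1; j < n;) {
        if (k == -1 || s[j] == s[k]) nxt[++j] = ++k;
        else k = nxt[k];
    }
    for (int i = 1; i <= n; ++i) dp[i] = dp[nxt[i]] + 1;
    return dp;
}

// Sum of dp[1..n] modulo 10007, as the problem asks.
int countPrefixMatches(const std::string& s) {
    const int MOD = 10007;
    const std::vector<int> dp = endCounts(s);
    int ans = 0;
    for (std::size_t i = 1; i < dp.size(); ++i) ans = (ans + dp[i]) % MOD;
    return ans;
}
```

For "ababab" the dp values for i = 1…6 are 1, 1, 2, 2, 3, 3, and countPrefixMatches("ababab") is 12.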

Implementation

#include <cstdio>
#include <cstring>
#include <iostream>
using namespace std;
typedef long long ll; 
const int N = 200010;
int nxt[N];
char T[N];
int dp[N];
int tlen;
const int mod=10007;
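// Build the KMP table: nxt[i] is the length of the longest proper prefix
// of T[0..i-1] that is also its suffix; nxt[0] = -1 is a sentinel.
void getNext()
{
    int j = 0, k = -1;
    nxt[0] = -1;
    while (j < tlen)
    {
        if (k == -1 || T[j] == T[k])
            nxt[++j] = ++k; // plain table, no nextval optimization: the dp needs the true longest match
        else
            k = nxt[k];
    }
}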
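int main()
{
    int TT;
    scanf("%d", &TT);
    while (TT--)
    {
        int n;
        scanf("%d %s", &n, T); // T already decays to char*, so no '&'
        tlen = strlen(T);
        getNext();

        ll ans = 0;
        dp[0] = 0;
        // Loop body reconstructed from the recurrence dp[i] = dp[nxt[i]] + 1
        // (the source listing is cut off at this point).
        for (int i = 1; i <= n; i++)
        {
            dp[i] = dp[nxt[i]] + 1;    // prefixes ending at position i
            ans = (ans + dp[i]) % mod;
        }
        printf("%lld\n", ans);
    }
    return 0;
}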

 
