POJ 3461 Oulipo: KMP, counting occurrences of a substring (overlaps allowed)

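Problem summary:

Given a text str and a word w, count how many times w occurs in str. Occurrences of w may overlap each other.

For example:

str: ABABABA

w: ABA

ans = 3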
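
To make the overlap rule concrete, here is a brute-force sketch that moves the start position one character at a time. The name count_naive is a hypothetical helper, not part of the submitted solution, and its O(n*m) worst case would be too slow for the real limits; it only illustrates what is being counted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts occurrences of w in str, overlaps allowed,
   by trying every start position. Worst case O(n*m). */
int count_naive(const char *str, const char *w)
{
    size_t n = strlen(str);
    size_t m = strlen(w);
    int total = 0;

    assert(m > 0); /* the problem guarantees |W| >= 1 */

    for (size_t pos = 0; pos + m <= n; pos++) {
        /* advance by one position, not by m, so overlapping matches count */
        if (strncmp(str + pos, w, m) == 0)
            total++;
    }
    return total;
}
```

With str = ABABABA and w = ABA, the matches start at positions 0, 2 and 4, so the function returns 3.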


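Solution:

Plain KMP. The only difference is that instead of returning the position of the first match, we keep running KMP through the whole text and count every match.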
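
The idea can be sketched as follows. This is a minimal illustration using the common prefix-function form of KMP, which is indexed differently from the next[] array in the code further down; count_kmp and fail are names made up for this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts (possibly overlapping) occurrences of w in str.
   fail[k] is the length of the longest proper border of w[0..k-1].
   Returns -1 if memory allocation fails. */
int count_kmp(const char *str, const char *w)
{
    size_t n = strlen(str);
    size_t m = strlen(w);
    size_t *fail;
    int total = 0;

    assert(m > 0); /* the problem guarantees |W| >= 1 */

    fail = (size_t *)malloc((m + 1) * sizeof *fail);
    if (fail == NULL)
        return -1;

    /* build the failure table for w */
    fail[0] = 0;
    fail[1] = 0;
    for (size_t k = 1, j = 0; k < m; k++) {
        while (j > 0 && w[k] != w[j])
            j = fail[j];
        if (w[k] == w[j])
            j++;
        fail[k + 1] = j;
    }

    /* scan the text once; on a full match, count it and fall back to the
       longest border so that overlapping matches are still found */
    for (size_t i = 0, j = 0; i < n; i++) {
        while (j > 0 && str[i] != w[j])
            j = fail[j];
        if (str[i] == w[j])
            j++;
        if (j == m) {
            total++;
            j = fail[m];
        }
    }

    free(fail);
    return total;
}
```

The key step is the fallback after a full match: continuing from the longest border instead of restarting at 0 keeps the scan linear in the length of the text.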


Original problem:

Oulipo
Time Limit: 1000MS
Memory Limit: 65536K
Total Submissions: 16626
Accepted: 6656

Description

The French author Georges Perec (1936–1982) once wrote a book, La disparition, without the letter 'e'. He was a member of the Oulipo group. A quote from the book:

Tout avait l’air normal, mais tout s’affirmait faux. Tout avait l’air normal, d’abord, puis surgissait l’inhumain, l’affolant. Il aurait voulu savoir où s’articulait l’association qui l’unissait au roman : sur son tapis, assaillant à tout instant son imagination, l’intuition d’un tabou, la vision d’un mal obscur, d’un quoi vacant, d’un non-dit : la vision, l’avision d’un oubli commandant tout, où s’abolissait la raison : tout avait l’air normal mais…

Perec would probably have scored high (or rather, low) in the following contest. People are asked to write a perhaps even meaningful text on some subject with as few occurrences of a given “word” as possible. Our task is to provide the jury with a program that counts these occurrences, in order to obtain a ranking of the competitors. These competitors often write very long texts with nonsense meaning; a sequence of 500,000 consecutive 'T's is not unusual. And they never use spaces.

So we want to quickly find out how often a word, i.e., a given string, occurs in a text. More formally: given the alphabet {'A', 'B', 'C', …, 'Z'} and two finite strings over that alphabet, a word W and a text T, count the number of occurrences of W in T. All the consecutive characters of W must exactly match consecutive characters of T. Occurrences may overlap.

Input

The first line of the input file contains a single number: the number of test cases to follow. Each test case has the following format:

  • One line with the word W, a string over {'A', 'B', 'C', …, 'Z'}, with 1 ≤ |W| ≤ 10,000 (here |W| denotes the length of the string W).
  • One line with the text T, a string over {'A', 'B', 'C', …, 'Z'}, with |W| ≤ |T| ≤ 1,000,000.

Output

For every test case in the input file, the output should contain a single number, on a single line: the number of occurrences of the word W in the text T.

Sample Input

3
BAPC
BAPC
AZA
AZAZAZA
VERDI
AVERDXIVYERDIAN

Sample Output

1
3
0

Submission record:

Run ID User Problem Result Memory Time Language Code Length Submit Time
11776963 chengtbf 3461 Accepted 1192K 141MS C++ 1043B 2013-07-13 17:08:24

Code:

#include <stdio.h>
#include <string.h>
#define N 1000005

char str[N],w[10005];
int next[10005];
int n;//length of the text str
int m;//length of the word w
int count;

void get_next()
{
	int pos=2;int cnd=0;
	next[0]=-1;next[1]=0;
	while (pos<=m)
	{
		if (w[pos-1]==w[cnd])
		{
			cnd++;
			next[pos]=cnd;
			pos++;
		}
		else if(cnd>0)
		{
			cnd=next[cnd];
		}
		else
		{
			next[pos]=0;pos++;
		}
	}
}

void search()
{
	int pos,i;
	pos=0;
	i=0;
	while (pos+i<=n-1)
	{
		if (w[i]==str[pos+i])
		{
			if (i==m-1)
			{
				count++;
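				//Note: pos++ here gets TLE. By the meaning of next[], no match can start
				//strictly between pos and pos+i-next[i], so jump there directly.
				pos=pos+i-next[i];
				//Careful here too: my first submission got TLE at this point because I handled it badly and looped forever.
				//i changes exactly as in the mismatch branch below. Resetting i=0 is also too slow:
				//the first next[i] characters are already known to match, so resume matching from next[i].
				if(next[i]>-1)i=next[i];
				else i=0;//next[0]=-1 could cause an infinite loop, so in that case use i=0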
			}
			else
			{
				i++;
			}
		}
		else
		{
			pos=pos+i-next[i];
			if (next[i]>-1)
			{
				i=next[i];
			}
			else
			{
				i=0;
			}
		}
	}
}

int main()
{
	int t;
	char temp;
	scanf("%d",&t);
	while (t--)
	{
		count=0;
		scanf("%c",&temp);
		scanf("%s",w);
		scanf("%s",str);
		n=strlen(str);
		m=strlen(w);
		get_next();
		search();
		printf("%d\n",count);

	}
	return 0;
}
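
To see what next[] holds, the following sketch rebuilds the same table without globals. It mirrors get_next() above; build_next and its out parameter are hypothetical names introduced only for illustration. Under this convention next[0] is -1 and, for i >= 1, next[i] is the length of the longest proper prefix of w[0..i-1] that is also its suffix, which is why shifting pos by i-next[i] skips no possible match.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper mirroring get_next() without globals.
   out must have room for strlen(w) + 1 ints. */
void build_next(const char *w, int *out)
{
    int m = (int)strlen(w);
    int pos = 2; /* index of the entry being filled */
    int cnd = 0; /* length of the current candidate border */

    assert(m > 0); /* the problem guarantees |W| >= 1 */

    out[0] = -1;
    out[1] = 0;
    while (pos <= m) {
        if (w[pos - 1] == w[cnd]) {
            /* the border grows by one character */
            cnd++;
            out[pos] = cnd;
            pos++;
        } else if (cnd > 0) {
            /* fall back to the next shorter border */
            cnd = out[cnd];
        } else {
            out[pos] = 0;
            pos++;
        }
    }
}
```

For w = ABA the table is -1, 0, 0, 1. After a match at i = 2 the code shifts pos by 2 - next[2] = 2, which lands on the next overlapping occurrence in ABABABA.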


