POJ 2778 (AC automaton + matrix)

DNA Sequence
Time Limit: 1000MS   Memory Limit: 65536K
Total Submissions: 14255   Accepted: 5498

Description

It's well known that a DNA sequence is a sequence that contains only A, C, T and G, and it's very useful to analyze a segment of a DNA sequence. For example, if an animal's DNA sequence contains the segment ATC, it may mean that the animal has a genetic disease. So far scientists have found several such segments; the problem is how many kinds of DNA sequences of a species don't contain those segments.

Suppose that the DNA sequences of a species consist of A, C, T and G, and that the length of the sequences is a given integer n.

Input

The first line contains two integers m (0 <= m <= 10) and n (1 <= n <= 2000000000). Here, m is the number of genetic disease segments, and n is the length of the sequences.

Each of the next m lines contains one genetic disease segment; the length of each segment is at most 10.

Output

An integer, the number of DNA sequences, mod 100000.

Sample Input

4 3
AT
AC
AG
AA

Sample Output

36


Reposted from: http://blog.csdn.net/morgan_xww/article/details/7834801

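Problem: m DNA segments are associated with disease. Count the DNA sequences of length n that contain none of those segments as a substring. (Sequences use only the four characters A, T, C, G.)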
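Before looking at the automaton, a brute-force counter is a handy way to check small cases. The sketch below is not part of the original solution; the function name `bruteForceCount` and the limit on n are assumptions made for illustration. It enumerates all 4^n strings, so it is only usable for very small n.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): counts the strings of
// length n over {A, C, T, G} that contain none of the given patterns, by
// enumerating all 4^n candidates. Patterns are assumed to be non-empty.
long long bruteForceCount(const std::vector<std::string>& patterns, int n)
{
    assert(n >= 1 && n <= 12);   // 4^12 is already about 16.7 million strings
    const std::string alphabet = "ACTG";
    long long total = 1;
    for (int i = 0; i < n; i++) total *= 4;

    long long safe = 0;
    std::string s(n, 'A');
    for (long long code = 0; code < total; code++)
    {
        // Decode `code` as a base-4 number into a string of length n.
        long long c = code;
        for (int i = 0; i < n; i++)
        {
            s[i] = alphabet[c % 4];
            c /= 4;
        }
        // The string is safe if no pattern occurs in it as a substring.
        bool ok = true;
        for (std::size_t k = 0; k < patterns.size() && ok; k++)
            if (s.find(patterns[k]) != std::string::npos) ok = false;
        if (ok) safe++;
    }
    return safe;
}
```

Called with the sample patterns AT, AC, AG, AA and n = 3, it returns 36, which matches the sample output: A may only appear in the last position, giving 3 * 3 * 4 strings.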

So what does this have to do with matrices?
[Figure: trie graph built from the patterns {"ACG", "C"}]
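•The figure above shows the example {"ACG", "C"}. Once the trie graph is built, every node has exactly four outgoing edges (A, T, C, G).
•From state 0 there are four possible single steps:
  –A leads to state 1 (safe);
  –C leads to state 4 (dangerous);
  –T leads to state 0 (safe);
  –G leads to state 0 (safe);
•So for n=1 the answer is 3.
•For n=2, walk two steps from state 0, which spells a string of length 2; the answer is the number of such walks that never pass through a dangerous node. In the same way, walking n steps spells a string of length n.
•Build the adjacency matrix M of the trie graph: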

2 1 0 0 1
2 1 1 0 0
1 1 0 1 1
2 1 0 0 1
2 1 0 0 1

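M[i,j] is the number of ways to go from node i to node j in exactly one step.

The n-th power of M then gives the number of ways to go from node i to node j in exactly n steps.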
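One way to see why the power counts walks is to split every n-step walk at its last edge: it is an (n-1)-step walk from i to some intermediate node k, followed by a single step from k to j. Summing over k gives exactly the rule for matrix multiplication (the index k is notation introduced here for illustration):

```latex
(M^{n})_{i,j} \;=\; \sum_{k} (M^{n-1})_{i,k}\, M_{k,j},
\qquad (M^{1})_{i,j} = M_{i,j}
```

For example, with the matrix above, $(M^2)_{0,0} = 2\cdot2 + 1\cdot2 + 0\cdot1 + 0\cdot2 + 1\cdot2 = 8$ two-step walks from node 0 back to node 0, counted before any dangerous nodes are removed.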

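Note: dangerous nodes must be removed, i.e. their rows and columns are deleted. Nodes 3 and 4 are word ends, so they are dangerous. The fail pointer of node 2 points to node 4: whenever "AC" is matched, "C" is matched as well, so node 2 is dangerous too.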
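The rule in this note can be sketched as a single pass over the nodes in BFS order. The helper name `markDangerous` and its parameters are assumptions for illustration. In the usage below, only the fail pointer of node 2 is stated in the text; the other fail values are assumed to be 0, which is what the standard construction gives for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: a node is dangerous if it ends a word, or if the node
// its fail pointer leads to is dangerous. Nodes must be listed in BFS order
// (non-decreasing depth), so that fail[u] is always finalized before u.
std::vector<bool> markDangerous(const std::vector<int>& fail,
                                const std::vector<bool>& wordEnd,
                                const std::vector<int>& bfsOrder)
{
    assert(fail.size() == wordEnd.size());
    std::vector<bool> danger(wordEnd);   // start from the word-end flags
    for (std::size_t t = 0; t < bfsOrder.size(); t++)
    {
        int u = bfsOrder[t];
        // fail[u] is shallower than u, so danger[fail[u]] is already final.
        if (danger[fail[u]]) danger[u] = true;
    }
    return danger;
}
```

For the example, with fail = {0, 0, 4, 0, 0}, wordEnd true only for nodes 3 and 4, and BFS order 0, 1, 4, 2, 3, the result marks nodes 2, 3 and 4 as dangerous and leaves nodes 0 and 1 safe.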

The matrix M becomes:

2 1
2 1
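Deleting the dangerous rows and columns can be sketched as follows. `Matrix` and `dropDangerous` are hypothetical names introduced here; the full program in this post gets the same effect differently, by leaving the entries that touch dangerous nodes at zero instead of shrinking the matrix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<long long> > Matrix;

// Hypothetical helper: keep only the rows and columns of safe states.
Matrix dropDangerous(const Matrix& full, const std::vector<bool>& danger)
{
    assert(full.size() == danger.size());
    std::vector<int> keep;                       // indices of safe states
    for (std::size_t i = 0; i < danger.size(); i++)
        if (!danger[i]) keep.push_back((int)i);

    Matrix reduced(keep.size(), std::vector<long long>(keep.size(), 0));
    for (std::size_t r = 0; r < keep.size(); r++)
        for (std::size_t c = 0; c < keep.size(); c++)
            reduced[r][c] = full[keep[r]][keep[c]];
    return reduced;
}
```

Applied to the 5x5 matrix of the example with nodes 2, 3 and 4 flagged as dangerous, it returns the 2x2 matrix shown above.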

Compute the n-th power of M; the answer is the sum of row 0 of the result, Σ_i (M^n)[0,i] mod 100000.

Since n is very large, compute the matrix power by repeated squaring (binary exponentiation).
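As a minimal sketch of this step, here is an iterative version of repeated squaring followed by the row-0 sum. It is an alternative to the recursive `matrixPower` in the full program, not a copy of it; the names `Matrix`, `multiply`, `matPow` and `countSafe` are assumptions, and row 0 is assumed to be the root state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<long long> > Matrix;
const long long MOD = 100000;

// Product of two square matrices of equal size, entries reduced mod MOD.
Matrix multiply(const Matrix& a, const Matrix& b)
{
    std::size_t n = a.size();
    Matrix res(n, std::vector<long long>(n, 0));
    for (std::size_t i = 0; i < n; i++)
        for (std::size_t k = 0; k < n; k++)
            for (std::size_t j = 0; j < n; j++)
                res[i][j] = (res[i][j] + a[i][k] * b[k][j]) % MOD;
    return res;
}

// base^p mod MOD using O(log p) multiplications.
Matrix matPow(Matrix base, long long p)
{
    assert(p >= 0);
    std::size_t n = base.size();
    Matrix result(n, std::vector<long long>(n, 0));
    for (std::size_t i = 0; i < n; i++) result[i][i] = 1;   // identity matrix
    while (p > 0)
    {
        if (p & 1) result = multiply(result, base);   // this bit of p is set
        base = multiply(base, base);                  // square for the next bit
        p >>= 1;
    }
    return result;
}

// Sum of row 0 of m^n, i.e. the number of safe walks of n steps from the root.
long long countSafe(const Matrix& m, long long n)
{
    assert(!m.empty());
    Matrix mn = matPow(m, n);
    long long ans = 0;
    for (std::size_t j = 0; j < mn.size(); j++) ans = (ans + mn[0][j]) % MOD;
    return ans;
}
```

With the reduced matrix of the example, `countSafe` gives 3 for n = 1 and 27 for n = 3, consistent with the fact that the safe strings for {"ACG", "C"} are exactly the strings over {A, T, G}.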


#include<iostream>
#include<cstdio>
#include<cstring>
#include<queue>
#include<algorithm>	// std::swap
using namespace std;

const int maxn = 105;
const int col = 4;
typedef long long MATRIX[maxn][maxn];
MATRIX mat,m1,m2;


struct ACAutomation
{
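	int n;					// total number of nodes created so far
	int id['Z'];			// maps a letter to its edge index (0..3)
	int fail[maxn];			// fail pointer of each node
	int trie[maxn][col];	// trie transitions (the trie graph after build_ac)
	bool tail[maxn];        // true if the node ends a word (i.e. is dangerous)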

	void init()
	{
		id['A'] = 0;
		id['C'] = 1;
		id['T'] = 2;
		id['G'] = 3;
		memset(trie,-1,sizeof(trie));
		memset(tail,false,sizeof(tail));
		n = 1;
	}

	void add(char *s)
	{
		int p = 0;
		while(*s)
		{
			int idx = id[*s];
			if(trie[p][idx] == -1)
			{
				trie[p][idx] = n++;
			}
			p = trie[p][idx];
			s++;
		}
		tail[p] = true;
	}

	void build_ac()
	{
		queue<int> q;
		fail[0] = 0;
		for(int i = 0; i < col; i++)
		{
			if(trie[0][i] != -1)
			{
				fail[trie[0][i]] = 0;
				q.push(trie[0][i]);
			}
			else trie[0][i] = 0;
		}
		while(!q.empty())
		{
			int u = q.front();
			q.pop();
			if(tail[fail[u]])	// if the node reached through this node's fail pointer is "dangerous",
				tail[u] = true;		// the string from the root to this node must contain a "virus" segment
			for(int i = 0; i < col; i++)
			{
				int v = trie[u][i];
				if(v == -1)
				{
					trie[u][i] = trie[fail[u]][i];
				}
				else
				{
					fail[v] = trie[fail[u]][i];
					q.push(v);
				}
			}
		}
	}

	void build_matrix()
	{
		memset(mat,0,sizeof(mat));
		for(int i = 0; i < n; i++)
			for(int j = 0; j < col; j++)
			{
				if(!tail[i] && !tail[trie[i][j]])
					mat[i][trie[i][j]]++;
			}
	}
}AC;

void matrixMult(MATRIX t1, MATRIX t2, MATRIX res)  
{  
    for (int i = 0; i < AC.n; i++)  
        for (int j = 0; j < AC.n; j++)  
        {  
            res[i][j] = 0;  
            for (int k = 0; k < AC.n; k++)  
            {  
                res[i][j] += t1[i][k] * t2[k][j];  
            }  
            res[i][j] %= 100000;  
        }  
}  

void matrixPower(int p)  
{  
    if (p == 1)  
    {  
        for (int i = 0; i < AC.n; i++)  
            for (int j = 0; j < AC.n; j++)  
                m2[i][j] = mat[i][j];  
        return;  
    }  
  
    matrixPower(p/2);          // compute mat^(p/2); the result is stored in m2[][]
    matrixMult(m2, m2, m1);    // square m2; the result is stored in m1[][]

    if (p % 2)                 // if p is odd, also multiply m1 by the original mat[][]; the result is stored in m2[][]
        matrixMult(m1, mat, m2);  
    else  
        swap(m1, m2);  
}  

int main()
{
	int n,m;
	char s[12];
	AC.init();
	cin >> m >> n;
	while(m--)
	{
		cin >> s;
		AC.add(s);
	}
	AC.build_ac();
	AC.build_matrix();
	memset(m1,0,sizeof(m1));
	memset(m2,0,sizeof(m2));
	matrixPower(n);
	int ans = 0;
	for(int i = 0; i < AC.n; i++)
		ans = (ans + m2[0][i]) % 100000;
	printf("%d\n",ans);
	return 0;
}

