Lost's revengeTime Limit: 15000/5000 MS (Java/Others) Memory Limit: 65535/65535 K (Java/Others)Total Submission(s): 3452 Accepted Submission(s): 932
Problem Description
Lost and AekdyCoin are friends. They always play "number game"(A boring game based on number theory) together. We all know that AekdyCoin is the man called "nuclear weapon of FZU,descendant of Jingrun", because of his talent in the field of number theory. So Lost had never won the game. He was so ashamed and angry, but he didn't know how to improve his level of number theory.
One noon, when Lost was lying on the bed, the Spring Brother poster on the wall(Lost is a believer of Spring Brother) said hello to him! Spring Brother said, "I'm Spring Brother, and I saw AekdyCoin shames you again and again. I can't bear my believers were being bullied. Now, I give you a chance to rearrange your gene sequences to defeat AekdyCoin!". It's soooo crazy and unbelievable to rearrange the gene sequences, but Lost has no choice. He knows some genes called "number theory gene" will affect one "level of number theory". And two of the same kind of gene in different position in the gene sequences will affect two "level of number theory", even though they overlap each other. There is nothing but revenge in his mind. So he needs you help to calculate the most "level of number theory" after rearrangement.
Input
There are less than 30 testcases.
For each testcase, first line is number of "number theory gene" N(1<=N<=50). N=0 denotes the end of the input file. Next N lines means the "number theory gene", and the length of every "number theory gene" is no more than 10. The last line is Lost's gene sequences, its length is also less or equal 40. All genes and gene sequences are only contains capital letter ACGT.
Output
For each testcase, output the case number(start with 1) and the most "level of number theory" with format like the sample output.
Sample Input
3 AC CG GT CGAT 1 AA AAA 0
Sample Output
Case 1: 3 Case 2: 2
|
题意:n个模式串和一个文本串,可以重排文本串。问你重排后的文本串中最多含有多少个模式串(可以重叠)。
思路:明显的AC自动机 + 状态压缩dp,题目很坑,开5维dp数组MLE。
参考了大牛的状态压缩思想
题解:利用模式串建立trie图,trie图上最多有500个结点( N*10 ),然后朴素的思想就是用S(i, iA, iC, iG, iT)表示在i状态下,拥有iA个A、iC个C、iG个G、iT个T的串拥有的最多的模式串的个数,但是iA, iC, iG, iT的取值均是[0, 40],所以我们需要把状态压缩一下,我们知道当四种字符都取10的时候可以让状态数达到最大,即114 = 14641, 所以可以令MaxA、MaxC、MaxG、MaxT分别表示四种字符出现的个数,那么T字符的权值为1,G字符的权值为(MaxT + 1),C字符的权值为(MaxG + 1) *(MaxT + 1),A字符的权值为(MaxC + 1) *(MaxG + 1) *(MaxT + 1),进行进制压缩之后总的状态数不会超过114,可以用DP[i][j]表示在trie的i号结点时ACGT四个字符个数的压缩状态为j时的字符串包含模式串的最多数目,然后就是进行O(4*500*114)的状态转移了。
会状态压缩这道题基本就可以AC了。
用dp[i][j]表示i状态下在Trie的j节点上拥有的最多模式串个数。用word[]记录当前节点的模式串数目。
设当前状态为State,当前节点为j。下一个状态为nextState,下一个节点nextnode。
状态转移方程dp[nextState][nextnode] = max(dp[nextState][nextnode], dp[State][j] + word[nextnode]).
AC代码:注意临界条件的判断
#include
#include
#include
#include
#define MAXN 510
#define INF 0x3f3f3f3f
#define debug printf("1\n");
using namespace std;
int k = 1;
struct Trie
{
int next[MAXN][4], fail[MAXN], word[MAXN];
int L, root;
int newnode()
{
for(int i = 0; i < 4; i++)
next[L][i] = -1;
word[L++] = 0;
return L-1;
}
void init()
{
L = 0;
root = newnode();
}
int getval(char c)
{
if(c == 'A') return 0;
if(c == 'C') return 1;
if(c == 'G') return 2;
if(c == 'T') return 3;
}
void Insert(char *s)
{
int now = root;
for(int i = 0; s[i]; i++)
{
if(next[now][getval(s[i])] == -1)
next[now][getval(s[i])] = newnode();
now = next[now][getval(s[i])];
}
word[now]++;
}
void Build()
{
queue Q;
fail[root] = root;
for(int i = 0; i < 4; i++)
{
if(next[root][i] == -1)
next[root][i] = root;
else
{
fail[next[root][i]] = root;
Q.push(next[root][i]);
}
}
while(!Q.empty())
{
int now = Q.front();
Q.pop();
word[now] += word[fail[now]];
for(int i = 0; i < 4; i++)
{
if(next[now][i] == -1)
next[now][i] = next[fail[now]][i];
else
{
fail[next[now][i]] = next[fail[now]][i];
Q.push(next[now][i]);
}
}
}
}
int num[4], Hash[4];
int dp[15000][MAXN];//i状态时在Trie的j节点最大的数论水平
void solve(char *s)
{
memset(num, 0, sizeof(num));
for(int i = 0; s[i]; i++)
num[getval(s[i])]++;
Hash[3] = 1;//T字符权值
Hash[2] = num[3] + 1;//G字符权值
Hash[1] = (num[2] + 1) * (num[3] + 1);//C字符权值
Hash[0] = (num[1] + 1) * (num[2] + 1) * (num[3] + 1);//A字符权值
memset(dp, -INF, sizeof(dp));
dp[0][0] = 0;
for(int A = 0; A <= num[0]; A++)
for(int C = 0; C <= num[1]; C++)
for(int G = 0; G <= num[2]; G++)
for(int T = 0; T <= num[3]; T++)
{
int State = A * Hash[0] + C * Hash[1] + G * Hash[2] + T * Hash[3];
for(int j = 0; j < L; j++)
{
if(dp[State][j] == -INF) continue;
for(int k = 0; k < 4; k++)
{
if(k == 0 && A == num[0]) continue;//注意临界条件的判断
if(k == 1 && C == num[1]) continue;
if(k == 2 && G == num[2]) continue;
if(k == 3 && T == num[3]) continue;
int nextnode = next[j][k];
int nextState = State + Hash[k];
dp[nextState][nextnode] = max(dp[nextState][nextnode], dp[State][j] + word[nextnode]);
}
}
}
int SumState = num[0] * Hash[0] + num[1] * Hash[1] + num[2] * Hash[2] + num[3] * Hash[3];
int ans = 0;
for(int i = 0; i < L; i++)
ans = max(ans, dp[SumState][i]);
printf("Case %d: %d\n", k++, ans);
}
};
Trie ac;
char str[50];
int main()
{
int n;
while(scanf("%d", &n), n)
{
ac.init();
for(int i = 0; i < n; i++)
scanf("%s", str), ac.Insert(str);
ac.Build();
scanf("%s", str);
ac.solve(str);
}
return 0;
}