String compression, reduced to finding a shortest Hamiltonian path.
Problem
You've invented a slight modification of the run-length encoding (RLE) compression algorithm, called PermRLE.
To compress a string, this algorithm chooses some permutation of integers between 1 and k, applies this permutation to the first k letters of the given string, then to the next block of k letters, and so on. The length of the string must be divisible by k. After permuting all blocks, the new string is compressed using RLE, which is described later.
To apply the given permutation p to a block of k letters means to place the p[1]-th of these letters in the first position, then p[2]-th of these letters in the second position, and so on. For example, applying the permutation {3,1,4,2} to the block "abcd" yields "cadb". Applying it to the longer string "abcdefghijkl" in blocks yields "cadbgehfkilj".
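The block-wise permutation described above can be sketched as follows. The function name `applyPermutation` and the choice to pass the permutation as a 1-based `std::vector<int>` are assumptions made for illustration, not part of the problem statement.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Applies a 1-based permutation p (of size k) to every k-letter block of s.
// Position j of each output block receives letter p[j] of the input block.
std::string applyPermutation(const std::string& s, const std::vector<int>& p) {
    const std::size_t k = p.size();
    assert(k > 0 && s.size() % k == 0);  // the length must be divisible by k
    std::string out(s.size(), ' ');
    for (std::size_t base = 0; base < s.size(); base += k) {
        for (std::size_t j = 0; j < k; ++j) {
            out[base + j] = s[base + p[j] - 1];  // p holds 1-based indices
        }
    }
    return out;
}
```

With the permutation {3,1,4,2}, calling this on "abcd" reproduces the "cadb" example from the statement.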
The permuted string is then compressed using run-length encoding. To simplify, we will consider the compressed size of the string to be the number of groups of consecutive equal letters. For example, the compressed size of "aabcaaaa" is 4; the first of the four groups is a group of two letters "a", then two groups "b" and "c" each containing only one letter, and finally a longer group of letters "a".
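Counting the groups is a single pass over the string. A minimal sketch, with `compressedSize` as a hypothetical helper name:

```cpp
#include <string>

// Returns the number of maximal runs of equal consecutive letters in s.
int compressedSize(const std::string& s) {
    if (s.empty()) return 0;
    int groups = 1;  // the first letter always opens a group
    for (std::size_t i = 1; i < s.size(); ++i) {
        if (s[i] != s[i - 1]) ++groups;  // a change of letter opens a new group
    }
    return groups;
}
```

For "aabcaaaa" this yields 4, matching the example in the statement.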
Obviously, the compressed size may depend on the chosen permutation. Since the goal of compression algorithms is to minimize the size of the compressed text, it is your job to choose the permutation that yields the smallest possible compressed size, and output that size.
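For very small k, the task as stated can be checked by brute force over all k! permutations. The sketch below is only a reference for testing a faster solution, not the intended approach; the name `bruteForceMinSize` is hypothetical and the permutation is kept 0-based internally.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <string>
#include <vector>

// Tries every permutation of block positions and returns the smallest
// number of groups of consecutive equal letters. Exponential in k.
int bruteForceMinSize(const std::string& s, int k) {
    assert(k > 0 && s.size() % static_cast<std::size_t>(k) == 0);
    const std::size_t n = s.size();
    const std::size_t blk = static_cast<std::size_t>(k);
    std::vector<std::size_t> p(blk);
    std::iota(p.begin(), p.end(), std::size_t{0});  // identity permutation
    int best = static_cast<int>(n);                 // there are at most n groups
    do {
        int groups = 0;
        char prev = '\0';
        for (std::size_t t = 0; t < n; ++t) {
            // Letter that lands at position t after permuting its block.
            const char c = s[(t / blk) * blk + p[t % blk]];
            if (t == 0 || c != prev) ++groups;
            prev = c;
        }
        best = std::min(best, groups);
    } while (std::next_permutation(p.begin(), p.end()));
    return best;
}
```

Because it enumerates k! permutations, this is only practical for single-digit k.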
Analysis
Section A. The Hamiltonian cycle in a small world
A Hamiltonian cycle in a graph is a cycle that visits each node exactly once. Given a weighted, directed, complete graph on n nodes, there are (n-1)! distinct Hamiltonian cycles. It is well known that the problem of finding the shortest (or longest) Hamiltonian cycle is NP-hard. It is also known to many contestants that, for n as small as 20, dynamic programming makes a difference of n*2^n vs. n!, which is the difference between a second and an eternity.
Let's have a look at the n*2^n DP trick, in case you have not seen it before.
Without loss of generality, we may view node 0 as the start point of the cycle, as well as its end point. For any subset A of the node set V and any node x in A, we define
dp[A][x] := the length of the shortest path that starts from x, visits each point in A exactly once, and ends at node 0. (*)
To clarify, 0 does not necessarily belong to A, but we do count the length of the edge from the last point to node 0. Thus the length of the shortest Hamiltonian cycle is just dp[V][0]. (Convince yourself, maybe looking at (*).)
We need to compute dp[A][x]. For the easy cases where A = {x}, the answer is just the length of edge x→0. Otherwise, we focus on the first step of the path. If the first step is x→y, with edge length q, then we pay dp[A - {x}][y] + q. In general, dp[A][x] is
dp[A][x] = min over all y in A - {x} of ( w(x, y) + dp[A - {x}][y] ),
where w(x, y) denotes the length of the edge x→y.
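The DP over subsets described in this section (take the best first step x→y, then solve the smaller set without x) can be sketched as below. The weight matrix `w`, the function name `shortestHamiltonianCycle`, and the bitmask encoding of the set A are assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <vector>

// w[x][y] is the length of the directed edge x -> y; nodes are 0..n-1.
// dp[A][x]: shortest path that starts at x, visits every node of the
// bitmask A exactly once, and finally steps to node 0.
long long shortestHamiltonianCycle(const std::vector<std::vector<int>>& w) {
    const int n = static_cast<int>(w.size());
    assert(n >= 1 && n <= 20);  // the table has n * 2^n entries
    const long long INF = LLONG_MAX / 4;
    std::vector<std::vector<long long>> dp(1 << n, std::vector<long long>(n, INF));
    for (int A = 1; A < (1 << n); ++A) {        // subsets in increasing order
        for (int x = 0; x < n; ++x) {
            if (!(A & (1 << x))) continue;      // x must belong to A
            const int rest = A & ~(1 << x);     // A - {x}
            if (rest == 0) {                    // base case: A = {x}
                dp[A][x] = w[x][0];
                continue;
            }
            for (int y = 0; y < n; ++y) {       // first step x -> y
                if (rest & (1 << y)) {
                    dp[A][x] = std::min(dp[A][x], w[x][y] + dp[rest][y]);
                }
            }
        }
    }
    return dp[(1 << n) - 1][0];                 // dp[V][0]
}
```

Iterating the subsets in increasing numeric order works because A - {x} is always numerically smaller than A, so every value read has already been filled in.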
Section B. Wrap everything into a small world
For any string, define the number of switches to be the number of times adjacent characters are different in the string. The compressed size is then the number of switches plus one. We want to find a permutation that transforms S into a string S' whose number of switches is minimal. Assume the length of S is mk. Then S can be viewed as a string with m blocks of length k.
Now we introduce a visual aid to simplify our writing. Let us draw the string S as m rows, each block on a single row. The key image is to count the number of switches one column at a time.
Let us take a semi-concrete example. Suppose that at one point we have decided that 5 is permuted to the 7th position, and that 2 goes to the 8th position. Then without knowing the rest of the permutation, we can inspect the 5th and 2nd characters in each block. If the 5th and the 2nd characters differ in Z of the blocks, then in any such permutation we will have to pay the price of Z.
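The price Z for placing two source positions next to each other depends only on those two columns, so it can be computed independently of the rest of the permutation. A minimal sketch, where `pairCost` is a hypothetical helper and positions a and b are 1-based as in the text:

```cpp
#include <cassert>
#include <string>

// Number of k-letter blocks of s in which the a-th and b-th letters differ.
int pairCost(const std::string& s, int k, int a, int b) {
    assert(k > 0 && s.size() % static_cast<std::size_t>(k) == 0);
    assert(1 <= a && a <= k && 1 <= b && b <= k);
    int z = 0;
    for (std::size_t base = 0; base < s.size(); base += k) {
        if (s[base + a - 1] != s[base + b - 1]) ++z;
    }
    return z;
}
```

For "abcabcabcabc" with k = 4, positions 1 and 4 always hold the same letter, so placing them side by side costs nothing, while positions 1 and 2 cost 3.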
The one exception is the last element of the permutation. In all cases but one, we simply wrap around to the beginning because the end of each k-block touches the beginning of the next k-block in the string, except for the last character in the string. We can handle both cases if we fix the last element of the permutation by trying all possibilities.
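Next, we reduce our problem to the one in Section A. Suppose we fix T as the last element in the permutation. Define a weighted, directed, complete graph G on k vertices {1, 2, ..., k}. The weight on the edge x→y is the number of blocks in which the x-th character differs from the y-th character if x ≠ T, and, if x = T, the number of i's such that the T-th character of the i-th block differs from the y-th character of the (i+1)-th block.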
It is easy to check that for any permutation, the number of switches is the same as the length of the corresponding Hamiltonian cycle in G.
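One way to build the graph for a fixed T is sketched below. It assumes the reading suggested by the wrap-around discussion: edges leaving T compare the T-th letter of each block with a letter of the following block, and all other edges compare two letters within the same block. The name `buildGraph` and the 0-based indices are assumptions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns w where w[x][y] is the cost of placing source position y right
// after source position x, given that T is the last element of the
// permutation (all positions 0-based).
std::vector<std::vector<int>> buildGraph(const std::string& s, int k, int T) {
    assert(k > 0 && s.size() % static_cast<std::size_t>(k) == 0);
    assert(0 <= T && T < k);
    const int m = static_cast<int>(s.size()) / k;  // number of blocks
    std::vector<std::vector<int>> w(k, std::vector<int>(k, 0));
    for (int x = 0; x < k; ++x) {
        for (int y = 0; y < k; ++y) {
            if (x == T) {
                // Wrap-around edge: end of block b touches start of block b+1.
                for (int b = 0; b + 1 < m; ++b)
                    if (s[b * k + x] != s[(b + 1) * k + y]) ++w[x][y];
            } else {
                // Ordinary edge: both letters come from the same block.
                for (int b = 0; b < m; ++b)
                    if (s[b * k + x] != s[b * k + y]) ++w[x][y];
            }
        }
    }
    return w;
}
```

For "abcabcabcabc" with k = 4 and T = 2, the cycle 1→0→3→2→1 has length 3 + 0 + 3 + 0 = 6, which is the number of switches for the 0-based order (1, 0, 3, 2).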
We have k different choices for T. For each T, the DP of Section A has O(2^k k) states; since each state scans up to k successors, finding the shortest Hamiltonian cycle takes O(2^k k^2) time. The construction of the graph takes O(k^2 m) = O(k |S|) time for each T; it is also easy to construct the graphs for all the T's in O(k^2 m) total time. Over all k choices of T, the running time of the solution is O(2^k k^3 + k |S|).
#include <cstdio>
#include <cstring>

#define INF 1000000
#define _clr(a,b) memset(a,b,sizeof(a))

template<class T> void get_min(T& a, T b) { if (a > b) a = b; }

using namespace std;

int map[16][16];           // map[j][k]: blocks whose j-th and k-th letters differ
int cross[16][16];         // cross[j][k]: adjacent block pairs where the j-th letter
                           // of the earlier block differs from the k-th of the later one
int state[16][16][1<<16];  // memo table for travle(); -1 means "not computed yet"
int K, len;
char str[50005];

// Shortest Hamiltonian path from start to end. mask is the set of nodes
// already visited (it always contains start and end). Memoized DP.
int travle(int start, int end, int mask)
{
    if (state[start][end][mask] != -1)
        return state[start][end][mask];  // state already computed: return cached result
    if (mask == ((1 << K) - 1))
        return map[start][end];          // no intermediate nodes left: direct edge
    state[start][end][mask] = INF;
    int temp;
    for (int i = 0; i < K; i++)          // enumerate the next node
    {
        if ((1 << i) & mask) continue;
        temp = mask | (1 << i);
        get_min(state[start][end][mask], map[start][i] + travle(i, end, temp));
    }
    return state[start][end][mask];
}

int main()
{
    // Local file redirection from the original post; uncomment if needed.
    // freopen("e://1.in", "r", stdin);
    // freopen("e://1.out", "w", stdout);
    int T;
    scanf("%d", &T);
    for (int t = 1; t <= T; t++)
    {
        scanf("%d%s", &K, str);
        len = strlen(str);
        _clr(map, 0);
        _clr(cross, 0);
        for (int i = 0; i < len; i += K)
        {
            for (int j = 0; j < K; j++)
                for (int k = 0; k < K; k++)
                {
                    map[j][k] += (str[i + j] == str[i + k] ? 0 : 1);
                    if (i >= K)
                    {
                        cross[j][k] += (str[i - K + j] == str[i + k] ? 0 : 1);
                    }
                }
        }
        int ans = INF;
        _clr(state, -1);
        // The end of one block touches the start of the next, which needs
        // special handling; here we enumerate the two endpoints of the path.
        // map is symmetric, so either endpoint may play the role of the last
        // element. This loop assumes K >= 2.
        for (int i = 0; i < K; i++)
        {
            for (int j = 0; j < K; j++)
            {
                if (i == j) continue;
                int mask = 0;
                mask |= 1 << i;
                mask |= 1 << j;
                get_min(ans, travle(i, j, mask) + cross[i][j]);
            }
        }
        printf("Case #%d: %d\n", t, ans + 1);  // groups = switches + 1
    }
    return 0;
}