In 1952, David A. Huffman published his paper “A Method for the Construction of Minimum-Redundancy Codes”, and hence printed his name in the history of computer science. As a professor who gives the final exam problem on Huffman codes, I am encountering a big problem: the Huffman codes are NOT unique. For example, given a string “aaaxuaxz”, we can observe that the frequencies of the characters ‘a’, ‘x’, ‘u’ and ‘z’ are 4, 2, 1 and 1, respectively. We may either encode the symbols as {‘a’=0, ‘x’=10, ‘u’=110, ‘z’=111}, or in another way as {‘a’=1, ‘x’=01, ‘u’=001, ‘z’=000}; both compress the string into 14 bits. Another set of codes can be given as {‘a’=0, ‘x’=11, ‘u’=100, ‘z’=101}, but {‘a’=0, ‘x’=01, ‘u’=011, ‘z’=001} is NOT correct since “aaaxuaxz” and “aazuaxax” can both be decoded from the code 00001011001001. The students are submitting all kinds of codes, and I need a computer program to help me determine which ones are correct and which ones are not.
Each input file contains one test case. For each case, the first line gives an integer N (2≤N≤63), then followed by a line that contains all the N distinct characters and their frequencies in the following format:
c[1] f[1] c[2] f[2] … c[N] f[N]
where c[i] is a character chosen from {‘0’ - ‘9’, ‘a’ - ‘z’, ‘A’ - ‘Z’, ‘_’}, and f[i] is the frequency of c[i] and is an integer no more than 1000. The next line gives a positive integer M (≤1000), then followed by M student submissions. Each student submission consists of N lines, each in the format:
c[i] code[i]
where c[i] is the i-th character and code[i] is a non-empty string of no more than 63 ‘0’s and ‘1’s.
For each test case, print in each line either “Yes” if the student’s submission is correct, or “No” if not.
Note: The optimal solution is not necessarily generated by the Huffman algorithm. Any prefix code whose total code length is optimal is considered correct.
Sample Input:
7
A 1 B 1 C 1 D 3 E 3 F 6 G 6
4
A 00000
B 00001
C 0001
D 001
E 01
F 10
G 11
A 01010
B 01011
C 0100
D 011
E 10
F 11
G 00
A 000
B 001
C 010
D 011
E 100
F 101
G 110
A 00000
B 00001
C 0001
D 001
E 00
F 10
G 11
Sample Output:
Yes
Yes
No
No
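Overall approach: use a heap to remove the minimum element at each step of building the Huffman tree, compute the optimal WPL, and compare it with the WPL of the student's submission. If the two are equal and the submission is a prefix code, the answer is Yes.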
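The heap step can be sketched as follows. This is a minimal sketch, not the original solution: it uses std::priority_queue in place of a hand-written heap, and the function name optimalWpl is a hypothetical one chosen for illustration. Each merged weight is the weight of one internal node, so adding up the merged weights yields the WPL without building the tree.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical helper: minimum total encoded length (WPL) for the given
// frequencies, found by simulating Huffman merging on a min-heap.
long long optimalWpl(const std::vector<int>& freq) {
    assert(freq.size() >= 2);  // the problem guarantees N >= 2

    std::priority_queue<long long, std::vector<long long>,
                        std::greater<long long>> heap;
    for (int f : freq) heap.push(f);

    long long wpl = 0;
    while (heap.size() > 1) {
        // Remove the two smallest weights and merge them.
        long long a = heap.top(); heap.pop();
        long long b = heap.top(); heap.pop();
        wpl += a + b;      // weight of the new internal node
        heap.push(a + b);
    }
    return wpl;
}
```

For the sample frequencies 1 1 1 3 3 6 6 this returns 53, which is the total length a correct submission has to match.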
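Computing the WPL: the weighted path length of the tree equals the sum of the weighted path lengths of its leaves -> which equals the sum of the weights of the non-leaf nodes -> which equals the sum of the weights of all nodes except the root. (The second step is the easier one to see: the weights of all the leaves add up to the weight of the root.)
Example for the first step: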
WPL = 12 * 1 + 1 * 4 + 2 * 4 + 4 * 3 + 5 * 3 + 6 * 3 = 69
WPL = 30+18+7+11+3 =(12+1+2+4+5+6)+(1+2+4+5+6)+(1+2+4)+(5+6)+(1+2)= 69
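For a student's submission only the codes are known, so the leaf-side form of the sum (frequency times code length) is the one that applies. A minimal sketch, assuming the frequencies and codes are stored in the same character order; the name wplFromCodes is hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: total encoded length of a submission, i.e. the sum
// over all characters of frequency * code length.
long long wplFromCodes(const std::vector<int>& freq,
                       const std::vector<std::string>& codes) {
    assert(freq.size() == codes.size());  // one code per character
    long long total = 0;
    for (std::size_t i = 0; i < freq.size(); ++i) {
        total += static_cast<long long>(freq[i]) *
                 static_cast<long long>(codes[i].size());
    }
    return total;
}
```

With the weights 12, 1, 2, 4, 5, 6 from the example and illustrative codes of lengths 1, 4, 4, 3, 3, 3 (for instance "0", "1000", "1001", "101", "110", "111"), the function returns 69, matching both lines above.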
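To check whether a submission is a prefix code, advance two string pointers in step until the characters they point to differ; if one of the pointers has reached the end of its string at that point, the submission is not a prefix code. (I borrowed this from someone else, and I still feel it could be optimized.)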
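The two-pointer check can be sketched like this, applied to every pair of codes. It is a sketch under the assumption that the codes are held as std::string values; the names isPrefixPair and isPrefixFree are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// True if one code is a prefix of the other (identical codes count too).
bool isPrefixPair(const std::string& a, const std::string& b) {
    assert(!a.empty() && !b.empty());  // codes are non-empty by the statement
    std::size_t i = 0;
    // Walk both strings in step while the characters agree.
    while (i < a.size() && i < b.size() && a[i] == b[i]) ++i;
    // If either string ran out first, it is a prefix of the other.
    return i == a.size() || i == b.size();
}

// True if no code in the set is a prefix of another one.
bool isPrefixFree(const std::vector<std::string>& codes) {
    for (std::size_t i = 0; i < codes.size(); ++i)
        for (std::size_t j = i + 1; j < codes.size(); ++j)
            if (isPrefixPair(codes[i], codes[j])) return false;
    return true;
}
```

With N at most 63 and codes at most 63 characters long, the pairwise loop is cheap. One possible optimization is to sort the codes and compare only neighbours, since in sorted order a code that is a prefix of any other code is also a prefix of the code right after it.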
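Mistakes I made: (1) I forgot an else in the Delete function, which took a long time to track down; (2) I did not guarantee i+1 <= size, which again took a long time to track down.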
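The original Delete function is not shown here, so the following is only a guess at where those two mistakes would sit in a typical 1-based array min-heap. The struct MinHeap and its member names are assumptions made for illustration.

```cpp
#include <cassert>
#include <vector>

// Minimal 1-based min-heap: elements live in h[1..size], h[0] is unused.
struct MinHeap {
    std::vector<int> h{0};
    int size = 0;

    void insert(int x) {
        h.push_back(x);
        int i = ++size;
        // Shift larger parents down until x's position is found.
        for (; i > 1 && h[i / 2] > x; i /= 2) h[i] = h[i / 2];
        h[i] = x;
    }

    int deleteMin() {
        assert(size > 0);
        int minItem = h[1];
        int last = h[size--];
        h.pop_back();
        int parent = 1;
        for (int child; parent * 2 <= size; parent = child) {
            child = parent * 2;
            // Look at the right child only if it exists (child + 1 <= size).
            if (child + 1 <= size && h[child + 1] < h[child]) ++child;
            if (last <= h[child]) break;  // found the slot for `last`
            else h[parent] = h[child];    // otherwise move the child up
        }
        if (size > 0) h[parent] = last;
        return minItem;
    }
};
```

Dropping the bound check would read one slot past the live elements, and the else branch is what keeps the smaller child moving up whenever the loop does not stop.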
#include
#include
#include