Implementing the LZW String Compression Algorithm

An example illustrating the idea behind LZW:

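The original input data is: A B A B A B A B B B A B A B A A C D A C D A D C A B A A A B A B .....
Compressing it with LZW can be described step by step in a table.
Note that the original data contains only 4 distinct characters: A, B, C, D.
Two bits are enough to represent them. Following the LZW algorithm, the code width is first extended by one bit to 3 bits; Clear = 2^2 = 4 and End = 4 + 1 = 5.
The initial code set should therefore be: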


0  1  2  3  4      5
A  B  C  D  Clear  End

The compression process is then:

Step  Prefix  Suffix  Entry   Known (Y/N)  Output  Code
1             A       (,A)
2     A       B       (A,B)   N            A       6
3     B       A       (B,A)   N            B       7
4     A       B       (A,B)   Y
5     6       A       (6,A)   N            6       8
6     A       B       (A,B)   Y
7     6       A       (6,A)   Y
8     8       B       (8,B)   N            8       9
9     B       B       (B,B)   N            B       10
10    B       B       (B,B)   Y
11    10      A       (10,A)  N            10      11
12    A       B       (A,B)   Y

.....

After step 12, the code set should be:

0  1  2  3  4      5    6   7   8   9   10  11
A  B  C  D  Clear  End  AB  BA  6A  8B  BB  10A


Implementation:

#include <algorithm>
#include <iostream>
#include <iterator>
#include <map>
#include <string>
#include <vector>

// Compress a string to a list of output codes.
// The codes are appended to "vec".
void compress(const std::string &uncompressed, std::vector<int>& vec) {
	// Build the dictionary: one entry for every single-byte string.
	int dictSize = 256;
	std::map<std::string, int> dictionary;
	for (int i = 0; i < 256; i++)
	{
		dictionary[std::string(1, static_cast<char>(i))] = i;
	}
		
	std::string w;
	for (std::string::const_iterator it = uncompressed.begin();
		 it != uncompressed.end(); ++it) {
		char c = *it;
		std::string wc = w + c;
		if (dictionary.count(wc))
			w = wc;
		else {
			vec.push_back(dictionary[w]);
			// Add wc to the dictionary.
			dictionary[wc] = dictSize++;
			w = std::string(1, c);
		}
	}
 
	// Output the code for w.
	if (!w.empty())
		vec.push_back( dictionary[w]);
}
 
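// Decompress a list of output codes to a string.
// "vec" must hold codes produced by compress(); an empty list yields an empty string.
std::string decompress(const std::vector<int>& vec) {
	if (vec.empty())
		return std::string();

	// Build the dictionary: one entry for every single-byte string.
	int dictSize = 256;
	std::map<int, std::string> dictionary;
	for (int i = 0; i < 256; i++)
		dictionary[i] = std::string(1, static_cast<char>(i));

	std::vector<int>::const_iterator it = vec.begin();
	std::string w(1, static_cast<char>(*it));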
	std::string result = w;
	std::string entry;
	for ( it++; it != vec.end(); it++) {
		int k = *it;
		if (dictionary.count(k))
			entry = dictionary[k];
		else if (k == dictSize)
			entry = w + w[0];
		else
			throw "Bad compressed k";
 
		result += entry;
 
		// Add w+entry[0] to the dictionary.
		dictionary[dictSize++] = w + entry[0];
 
		w = entry;
	}
	return result;
}
 
int main() {
	std::vector<int> compressed;
	compress("TOBEORNOTTOBEORTOBEORNOT", compressed);
	std::copy(compressed.begin(), compressed.end(), std::ostream_iterator<int>(std::cout, ", "));
	std::cout << std::endl;
	std::string decompressed = decompress(compressed);
	std::cout << decompressed << std::endl;

	return 0;
}



Implementations in various languages:

LZW code implementations

http://rosettacode.org/wiki/LZW_compression
