Huffman Coding (Data Structures Lab)

Preface

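A Huffman tree is also known as an optimal binary tree, and building one is a classic use of the greedy approach. The main advantage of Huffman coding is that it represents the same information with the fewest bits on average: the more often a character occurs, the shorter its code.
Huffman coding is a prefix code (also called a prefix-free code): no codeword is a prefix of any other codeword.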
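
To make the prefix-free property concrete, here is a minimal sketch of a checker. The function name `isPrefixFree` is hypothetical and not part of the lab code below; it simply compares every pair of codewords.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true when no codeword is a prefix of a different codeword.
// Hypothetical helper for illustration; uses a simple O(n^2) pairwise check.
bool isPrefixFree(const std::vector<std::string>& codes) {
    for (std::size_t i = 0; i < codes.size(); ++i) {
        for (std::size_t j = 0; j < codes.size(); ++j) {
            if (i == j) continue;
            // Compare the first codes[i].size() characters of codes[j] with codes[i].
            if (codes[j].compare(0, codes[i].size(), codes[i]) == 0) {
                return false; // codes[i] is a prefix of codes[j]
            }
        }
    }
    return true;
}
```

For instance, the set {"0", "10", "11"} passes the check, while {"0", "01", "11"} fails because "0" is a prefix of "01".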

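Steps

1. Create a priority queue
A priority queue is not strictly required; a plain array works too. With a plain array, however, the program has to compare node weights on every round to find the smallest ones.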
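
As a rough illustration of the priority-queue option, the sketch below uses `std::priority_queue` with `std::greater` so that the smallest weight is always on top. The alias `MinHeap` and the helper `popTwoSmallest` are names made up for this example; the lab code further down uses the plain-array approach instead.

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Min-heap of weights: std::greater puts the smallest element on top.
using MinHeap = std::priority_queue<int, std::vector<int>, std::greater<int>>;

// Removes and returns the two smallest weights (smallest first).
// Assumes the heap holds at least two elements.
std::pair<int, int> popTwoSmallest(MinHeap& heap) {
    int first = heap.top();
    heap.pop();
    int second = heap.top();
    heap.pop();
    return {first, second};
}
```

Each `pop` costs O(log n), whereas scanning a plain array for the minimum costs O(n) per round; for the few dozen distinct characters in this lab either choice is fine.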

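2. Build the Huffman tree
Given n nodes, each with its own weight:

  1. Pick the two smallest of the n weights; the two corresponding nodes become the children of a new binary tree whose root weight is the sum of the weights of its left and right children;
  2. Remove those two smallest weights from the original n weights and add the new weight to the remaining n - 2 weights, and so on;
  3. Repeat 1 and 2 until all nodes have been merged into a single binary tree. That tree is the Huffman tree.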
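
The merge loop described above can be sketched with a min-heap of (weight, index) pairs. This is only one possible layout, under stated assumptions: the names `HuffNode` and `buildHuffmanTree` are hypothetical, indices are 0-based, and -1 means "no parent/child" (the lab code below uses 1-based indices with 0 as the marker).

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// One tree node; -1 marks a missing parent or child (assumption of this sketch).
struct HuffNode {
    int weight;
    int parent;
    int lchild;
    int rchild;
};

// Builds the tree for the given leaf weights. Leaves occupy indices
// 0..n-1, merged nodes follow, and the root is the last element.
std::vector<HuffNode> buildHuffmanTree(const std::vector<int>& weights) {
    using Entry = std::pair<int, int>; // (weight, node index)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
    std::vector<HuffNode> tree;
    for (int w : weights) {
        heap.push({w, static_cast<int>(tree.size())});
        tree.push_back({w, -1, -1, -1});
    }
    while (heap.size() > 1) {
        Entry a = heap.top(); heap.pop(); // smallest weight
        Entry b = heap.top(); heap.pop(); // second smallest
        int idx = static_cast<int>(tree.size());
        tree.push_back({a.first + b.first, -1, a.second, b.second});
        tree[a.second].parent = idx;
        tree[b.second].parent = idx;
        heap.push({a.first + b.first, idx}); // the merged node rejoins the pool
    }
    return tree;
}
```

With n leaves the loop runs n - 1 times, so the result has 2n - 1 nodes, which matches the table size `2 * n - 1` used in the lab code.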

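3. Huffman encoding
There are two ways to compute Huffman codes in a program:

  1. Walk from the leaf up to the root, recording the labels passed along the way. For example, for character c in Figure 1, walking from node c up to the root gives 0 1 1, so the Huffman code of c is 1 1 0 (output in reverse order).
  2. Start from the root and walk down to the leaf, recording the labels along the way. For example, for character c in Figure 1, starting from the root gives 1 1 0 directly.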
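
The lab code below follows the first method (leaf to root). For comparison, here is a minimal sketch of the second method, a recursive walk from the root that appends '0' for a left branch and '1' for a right branch. The names `Node` and `collectCodes` are hypothetical, and the sketch assumes 0-based indices with -1 for "no child".

```cpp
#include <string>
#include <vector>

// Minimal node for this sketch; -1 means "no child" (assumption).
struct Node {
    int weight;
    int lchild;
    int rchild;
};

// Depth-first walk from `node`; `path` holds the labels seen so far.
// When a leaf is reached, its code is stored in codes[leaf index].
// Assumes every internal node has exactly two children, as in a Huffman tree.
void collectCodes(const std::vector<Node>& tree, int node,
                  std::string& path, std::vector<std::string>& codes) {
    const Node& n = tree[node];
    if (n.lchild == -1 && n.rchild == -1) {
        codes[node] = path; // leaf: the path so far is its code
        return;
    }
    path.push_back('0'); // left branch
    collectCodes(tree, n.lchild, path, codes);
    path.pop_back();
    path.push_back('1'); // right branch
    collectCodes(tree, n.rchild, path, codes);
    path.pop_back();
}
```

Because the labels are appended in root-to-leaf order, no reversal step is needed, unlike the leaf-to-root method.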

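4. Code (runnable)
Problem: implement the algorithms for building a Huffman tree and Huffman codes
Count the number of distinct characters in the English passage below and how often each one occurs, then use these statistics to build a Huffman tree and the Huffman codes.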
The Chinese official said he viewed the Trump Presidency not as an aberration but as the product of a failing political system. This jibes with other accounts. The Chinese leadership believes that the United States, and Western democracies in general, haven’t risen to the challenge of a globalized economy, which necessitates big changes in production patterns, as well as major upgrades in education and public infrastructure. In Trump and Trumpism, the Chinese see an inevitable backlash to this failure.

#include <iostream>
#include <iomanip>
#include <map>
#include <string>
#include <vector>
using namespace std;
// One row of the Huffman table; 0 means "none" for parent and children
struct Huffmantree {
	int weight;
	int parent;
	int lchild;
	int rchild;
};
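// Find the two smallest-weight entries of ht[1..i] that have no parent yet:
// s1 receives the smallest, s2 the second smallest
void select(Huffmantree*& ht, int i, int& s1, int& s2)
{
	int j = 1;
	int k = 1;
	while (ht[k].parent != 0) k++; // first index whose parent is 0
	s1 = k;
	for (j = 1; j <= i; j++)
	{
		// find the smallest weight
		if (ht[j].parent == 0 && ht[j].weight <= ht[s1].weight)
		{
			s1 = j;
		}
	}
	k = 1;
	while (ht[k].parent != 0 || k == s1) k++; // first parentless index other than s1
	s2 = k;
	for (j = 1; j <= i; ++j)
	{
		// find the second-smallest weight
		if (ht[j].parent == 0 && ht[j].weight <= ht[s2].weight && j != s1)
		{
			s2 = j;
		}
	}
}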
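void CreatHuffmantree(Huffmantree*& ht, int n, const map<char, int>& tmp)
{
	if (n <= 1) {
		// fewer than two distinct characters: there is nothing to merge
		ht = nullptr;
		cout << "Need at least two distinct characters to build the tree!" << endl; return;
	}
	int m = 2 * n - 1; // total number of nodes in the tree
	int s1;
	int s2;
	ht = new Huffmantree[m + 1]; // allocate the Huffman table (index 0 is unused)
	for (int i = 1; i <= m; i++)
	{
		// set parent and both children of every entry to 0
		ht[i].parent = 0;
		ht[i].lchild = 0;
		ht[i].rchild = 0;
	}
	int i = 1;
	for (auto e : tmp)
	{
		// use each character's occurrence count as the weight of its leaf
		ht[i].weight = e.second;
		i++;
	}
	for (i = n + 1; i <= m; i++)
	{
		select(ht, i - 1, s1, s2);
		ht[s1].parent = i; // set the parent of both selected nodes
		ht[s2].parent = i;
		ht[i].lchild = s1; // record the left and right children of node i
		ht[i].rchild = s2;
		ht[i].weight = ht[s1].weight + ht[s2].weight; // weight of node i
	}
}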

// Build the Huffman code of leaf `position` and store it in hc[index]
void CreateHuffmanCode(Huffmantree* ht, vector<string>& hc, int index, int position)
{
	while (ht[position].parent != 0)
	{
		// the node still has a parent
		int temp = ht[position].parent; // index of the parent
		// prepend '0' if the node is the left child of its parent, '1' if it is the right child
		if (ht[temp].lchild == position)
			hc[index].insert(hc[index].begin(), '0');
		else if (ht[temp].rchild == position)
			hc[index].insert(hc[index].begin(), '1');
		position = ht[position].parent; // move up to the parent
	}
}
int main()
{
	map<char, int>tmp; // maps each character to its occurrence count
	string data = "The Chinese official said he viewed the Trump Presidency not as an aberration but as the product of a failing political system. This jibes with other accounts. The Chinese leadership believes that the United States, and Western democracies in general, haven't risen to the challenge of a globalized economy, which necessitates big changes in production patterns, as well as major upgrades in education and public infrastructure. In Trump and Trumpism, the Chinese see an inevitable backlash to this failure.";
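	for (auto e : data) // count the characters in the passage
	{
		tmp[e]++;
	}
	cout << "Character counts:" << endl;
	cout << "Char" << setw(6) << "Count" << endl;
	vector<char> letter;
	letter.push_back(' '); // placeholder so that letter[i] lines up with ht[i] (1-based)
	for (auto e : tmp)
	{
		cout << e.first << '\t' << e.second << endl;
		letter.push_back(e.first);
	} // print the counts and record each character in the vector
	cout << "Number of distinct characters: " << tmp.size() << endl;
	int n = static_cast<int>(tmp.size());
	Huffmantree* tree = nullptr;
	CreatHuffmantree(tree, n, tmp); // build the Huffman tree
	if (tree == nullptr) return 0; // no tree was built
	vector<string> hc(n + 1, ""); // holds the Huffman codes; every string starts empty
	cout << endl << "Huffman code of each character:" << endl;
	for (int i = 1; i <= n; i++)
	{
		CreateHuffmanCode(tree, hc, i, i);
		cout << letter[i] << ":" << hc[i] << endl;
	} // print the Huffman codes
	delete[] tree; // release the Huffman table
}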

Corrections and suggestions are welcome; feel free to leave a comment.