Huffman Coding Test

Huffman Tree

Definition

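A Huffman tree, also called an optimal tree, is a tree with the minimum weighted path length among all trees that have the same set of weighted leaves.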
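To make "weighted path length" concrete, the usual definition can be written as below. The symbols are not from the original post and are assumptions of this sketch: $n$ is the number of leaves, $w_i$ is the weight of the $i$-th leaf, and $l_i$ is the number of edges on the path from the root to that leaf.

```latex
\mathrm{WPL}(T) = \sum_{i=1}^{n} w_i \, l_i
```

A Huffman tree is a binary tree $T$ for which this sum is as small as possible.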

Building a Huffman Tree

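Given n weights, let F be a set of n binary trees, each consisting of a single root node that carries one of the weights.
(1) Select from F the two trees whose roots have the smallest weights and make them the left and right subtrees of a new binary tree; the weight of the new root is the sum of the weights of the two subtree roots.
(2) Remove the two selected trees from F and add the newly built tree to F.
(3) Repeat (1) and (2) until F contains only one tree.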
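The steps above can be sketched with a priority queue that holds only the root weights. This is a minimal illustration, not the post's own program: the class name `HuffmanCostSketch`, the method name `weightedPathLength`, and the sample weights 1 to 5 are assumptions chosen for the example. It relies on the fact that the sum of all merged weights equals the weighted path length of the finished tree.

```java
import java.util.PriorityQueue;

final class HuffmanCostSketch {

    // Repeatedly merges the two smallest weights, as in steps (1) to (3),
    // and returns the sum of all merged weights. That sum equals the
    // weighted path length of the resulting Huffman tree.
    static int weightedPathLength(int[] weights) {
        PriorityQueue<Integer> forest = new PriorityQueue<>();
        for (int w : weights) {
            forest.add(w);
        }
        int total = 0;
        while (forest.size() > 1) {
            int merged = forest.poll() + forest.poll(); // step (1)
            total += merged;
            forest.add(merged);                         // step (2)
        }
        return total;
    }

    public static void main(String[] args) {
        // Merges are 1+2=3, 3+3=6, 4+5=9, 6+9=15, so the total is 33.
        System.out.println(weightedPathLength(new int[] {1, 2, 3, 4, 5})); // prints 33
    }
}
```

Only the weights are tracked here; a full implementation also has to keep the left and right children so that codes can be read off the tree.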

Huffman Coding

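By convention, a left branch stands for the character '0' and a right branch for the character '1'. The Huffman code of a leaf is the string of branch characters on the path from the root to that leaf. The example Huffman tree, with leaves 1 to 5, is shown below:
[Figure: example Huffman tree with leaves 1 to 5]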

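In this tree, 3 is encoded as 00, 1 as 010, 2 as 011, 4 as 10, and 5 as 11. No character's code is a prefix of another character's code, so an encoded bit string can be decoded without ambiguity.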
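The prefix property can be checked mechanically. The sketch below is an illustration only; the class name `PrefixCheckSketch` and the method name `isPrefixFree` are made up for this example. It compares every pair of codes and reports whether any code is a prefix of a different one.

```java
import java.util.Arrays;
import java.util.List;

final class PrefixCheckSketch {

    // Returns true if no code in the list is a prefix of a different code.
    // Two identical codes also count as a violation.
    static boolean isPrefixFree(List<String> codes) {
        for (int i = 0; i < codes.size(); i++) {
            for (int j = 0; j < codes.size(); j++) {
                if (i != j && codes.get(j).startsWith(codes.get(i))) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The five codes listed above for 3, 1, 2, 4 and 5.
        System.out.println(isPrefixFree(Arrays.asList("00", "010", "011", "10", "11"))); // prints true
        // "0" is a prefix of "01", so this set is ambiguous.
        System.out.println(isPrefixFree(Arrays.asList("0", "01", "11")));               // prints false
    }
}
```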
Following this idea, I wrote a simple test that contains only six English characters.

import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

public class Huffman {

    private static class TreeNode implements Comparable<TreeNode> {
        TreeNode left;
        TreeNode right;
        int weight;
        char ch;
        String code;

        public TreeNode(int weight, TreeNode left, TreeNode right) {
            this.weight = weight;
            this.left = left;
            this.right = right;
            this.code = "";
        }

        // Orders nodes by weight only.
        @Override
        public int compareTo(TreeNode o) {
            return Integer.compare(this.weight, o.weight);
        }
    }

    // Builds the Huffman tree. The map is keyed by weight, so this test
    // only supports characters whose weights are all distinct.
    public static TreeNode huffman(TreeMap<Integer, Character> data) {
        // A PriorityQueue keeps nodes of equal weight. A TreeSet would treat
        // them as duplicates (compareTo returns 0) and silently drop one.
        PriorityQueue<TreeNode> tNodes = new PriorityQueue<>();
        for (Map.Entry<Integer, Character> entry : data.entrySet()) {
            TreeNode tmp = new TreeNode(entry.getKey(), null, null);
            tmp.ch = entry.getValue();
            tNodes.add(tmp);
        }
        while (tNodes.size() > 1) {
            // The two lightest trees become the left ("0") and right ("1") subtrees.
            TreeNode leftNode = tNodes.poll();
            leftNode.code = "0";
            TreeNode rightNode = tNodes.poll();
            rightNode.code = "1";
            TreeNode newNode = new TreeNode(leftNode.weight + rightNode.weight,
                    leftNode, rightNode);
            tNodes.add(newNode);
        }
        return tNodes.peek();
    }

    // Prepends each parent's code to its children, top down, so that every
    // leaf ends up with the full path from the root.
    private static void code(TreeNode t) {
        if (t.left != null) {
            t.left.code = t.code + t.left.code;
            code(t.left);
        }
        if (t.right != null) {
            t.right.code = t.code + t.right.code;
            code(t.right);
        }
    }

    // Prints the code of every leaf, left to right.
    public static void print(TreeNode root) {
        if (root != null) {
            if (root.left == null && root.right == null) {
                System.out.println(root.ch + " code: " + root.code);
            } else {
                print(root.left);
                print(root.right);
            }
        }
    }

    public static void main(String[] args) {
        // Key: weight, value: character.
        TreeMap<Integer, Character> test = new TreeMap<>();
        test.put(5, 'F');
        test.put(9, 'E');
        test.put(12, 'C');
        test.put(13, 'B');
        test.put(16, 'D');
        test.put(45, 'A');

        TreeNode root = huffman(test);
        code(root);
        print(root);
    }

}


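Building on this, I added file input: the program reads the December 2018 CET-4 essay, converts it to its Huffman encoding, and finally decodes the bit string again by looking up the matching key and value pairs.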
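The counting step can be sketched as follows. This is an illustration under stated assumptions, not the post's elided code: the class name `FrequencySketch` and the method name `countCharacters` are hypothetical, and a `StringReader` stands in for the file so that the example runs anywhere.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;

final class FrequencySketch {

    // Counts how often each character occurs, reading line by line.
    // readLine() drops line terminators, so they are not counted.
    static Map<Character, Integer> countCharacters(BufferedReader in) throws IOException {
        Map<Character, Integer> counts = new TreeMap<>();
        String line;
        while ((line = in.readLine()) != null) {
            for (char c : line.toCharArray()) {
                counts.merge(c, 1, Integer::sum); // insert 1 or add 1 to the old count
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a short in-memory text replaces the FileReader used for a real file.
        BufferedReader in = new BufferedReader(new StringReader("second-hand\ngoods"));
        System.out.println(countCharacters(in));
    }
}
```

Keying the map by character rather than by count means that two characters with the same count do not overwrite each other, which matters for real text where equal counts are common.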

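        // Read the characters that occur in the file and record how often each one occurs
        File file = new File("C:\\Users\\96553\\IdeaProjects\\zc\\src\\chap15\\text");
        BufferedReader fin = new BufferedReader(new FileReader(file));
        String line;

        // A map stores the data as one key paired with one value
        // Read one line of text at a time
        // Build the Huffman tree after a level-order traversal
        // ...

        // Decode and write to a file.
        // result1, list3, list4, list5, temp2 and temp3 come from the elided part.
        // Judging from their use here, result1 is the encoded bit string, list4 holds
        // the codes, and list3 holds the character for the code at the same index.
        for (int i = 0; i < result1.length(); i++) {
            list5.add(result1.charAt(i) + "");
        }
        while (list5.size() > 0) {
            // Move one bit from the front of list5 into the buffer temp2
            temp2 = temp2 + "" + list5.get(0);
            list5.remove(0);
            for (int i = 0; i < list4.size(); i++) {
                if (temp2.equals(list4.get(i))) {
                    temp3 = temp3 + "" + list3.get(i);
                    temp2 = "";
                    break; // the codes are prefix-free, so at most one entry can match
                }
            }
        }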
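A self-contained version of the encode and decode round trip might look like the sketch below. It is not the author's linked code: the class `CodecSketch` and its methods are hypothetical names, and the code table (A=0, C=100, B=101, F=1100, E=1101, D=111) is what tracing the six-character test by hand gives, stated here as an assumption rather than copied from the program's output. A `StringBuilder` buffer and a reverse lookup map replace the list scanning used in the fragment above.

```java
import java.util.HashMap;
import java.util.Map;

final class CodecSketch {

    // Assumed code table for the six-character test (derived by hand).
    static Map<Character, String> exampleTable() {
        Map<Character, String> table = new HashMap<>();
        table.put('A', "0");
        table.put('C', "100");
        table.put('B', "101");
        table.put('F', "1100");
        table.put('E', "1101");
        table.put('D', "111");
        return table;
    }

    // Concatenates the code of every character in the text.
    static String encode(String text, Map<Character, String> table) {
        StringBuilder bits = new StringBuilder();
        for (char c : text.toCharArray()) {
            String code = table.get(c);
            if (code == null) {
                throw new IllegalArgumentException("no code for character: " + c);
            }
            bits.append(code);
        }
        return bits.toString();
    }

    // Reads bits one at a time and emits a character as soon as the buffered
    // bits match a code. This is safe because the codes are prefix-free.
    static String decode(String bits, Map<Character, String> table) {
        Map<String, Character> reverse = new HashMap<>();
        for (Map.Entry<Character, String> e : table.entrySet()) {
            reverse.put(e.getValue(), e.getKey());
        }
        StringBuilder text = new StringBuilder();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < bits.length(); i++) {
            current.append(bits.charAt(i));
            Character ch = reverse.get(current.toString());
            if (ch != null) {
                text.append(ch);
                current.setLength(0); // start collecting the next code
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        Map<Character, String> table = exampleTable();
        String bits = encode("FACE", table);
        System.out.println(bits);                // prints 110001001101
        System.out.println(decode(bits, table)); // prints FACE
    }
}
```

The hash lookup makes each decoding step independent of the alphabet size, whereas scanning a list of codes for every bit grows with the number of distinct characters.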

Code link:
The results are as follows.
Original text:

In recent years, second-hand transactions have become quite common. Nowadays there are more and more secondhand goods in the market, such as secondhand books, furniture, appliances, cars, and so on. Why do so many people like to buy secondhand goods?
The following reasons can account for this phenomenon. Above all, secondhand goods are cheaper than new ones. This enables those people who have poor financial abilities to buy the things they want. Moreover, secondhand goods transactions make it possible for people to make good use of the goods which may be useless in their hands. Besides, Internet provides a more convenient and quicker transaction platform for secondhand goods.

Encoded:

01111000101111011101010101000101011000111000001101010011110110000110111101000010101000011011111110110000111111010011011111111100001111011001101110001001101000001101010011011100011011110100101111110101100111000101010000111100001011001100010011101101010001010110101000011110001110000011011000010110011110011000101111101001111111001000011100011000011111001011101010110100111101010110111000001111010101101001101111111110111000001111010101101000010101000011011111111111010011011111111100110100010011111110001101010110111100001111100101101110001001111010111101010000101101111010000111011010011110110100110001101000010101000011011111111111010011011111111100111000010010111101100001101111001100101110111101101110101000101110111101010011011110100111100111100100000101011001101110100010100001101111010100100111101100001101111010011011111111101000001110001101100001011001111001111111000001111011111001110100000111011100010011011000011110111001010001111001000000101100000010101011110101011000010011100111000111010000111101000010101000011011111111111010011011111111100110100010011111110000110001100110000011110010110011001001000000000000101111101010110110110101101110101010011000001101110001101010010011011110100110100101000010111011011000111001100100111101110000111110101011000110111001111100101011001111000010101100110110000101100110001110111000010111111010110100100000000000110111101000010101000011011111111111010011011111111100110100010011111110001101001111010101101010011110010100111100101011101110000111110100110111101011010011111011000110110101000000010110011000001111010101100011001010111001011100000000101000110000111110001100001011011100101000111100100000010110011111011110001110111101001011111101011011100100100111101110011001101011011100110111010010101100100000110100101110010101000001010100011010101010001100001001110011100011101000011110000111110010110000111110101011011011010100011000011111001000001111001111101001101100010000101100110000100011110101000101111110101110101101111010000101010000110111
1111111101001101111111110011010001001111111000110000111101100110111000100110100000110101001101110001101110001001011110101011010101000111011100100110001000101010111000000001011001100100111101110111001010001111001000000101100001001110111000100101111010101100110100010011111111001110110000101100010110011100001111100101100110100010011111110001100111110111101010110100111101101110001001000011110011100010110011101100001000000010100010001101010110111100001111100101010111101110111101001101111111100000001011001111001001010001010111111010100001101111001111000101100010101110110110100001110111001111010010111111101011111101010001101001110111000001111010101101010000110110111111010101110101010101100011101001101111111110011000100111011010110100011110101011101110000111101100110111000100110100000110101001101111011100100000100100010110010011110111100011001100100111101110100001010100001101111111111101001101111111110011010001001111111000000010
