Computer Networks, Part 8: Data Compression (English Version)

Data Compression

  • Coding and Decoding
    • Block Code
  • Huffman Code
  • Arithmetic Code

Coding and Decoding

Coding is a rule that assigns exactly one codeword to each source symbol.

binary coding
every codeword is a string over an alphabet of two symbols (usually ‘0’ and ‘1’).

unique (uniquely decodable) coding
decoding is unambiguous only when any two distinct source messages have distinct encodings.

block coding
uses pairwise distinct codewords of the same fixed length n.
e.g., hexadecimal code, even parity code, ASCII code, etc.

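instantaneous code
no codeword is a prefix of another codeword, so each codeword can be decoded as soon as its last symbol arrives.
every instantaneous code is uniquely decodable, but not all uniquely decodable codes are instantaneous.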
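The prefix condition can be checked mechanically. The following is a minimal sketch; the helper name `is_prefix_free` and the two sample codes are assumptions made for illustration, not taken from the original figures.

```python
def is_prefix_free(codewords):
    """Return True if no codeword is a prefix of a different codeword."""
    words = sorted(set(codewords))
    # After sorting, a codeword that is a prefix of any other codeword
    # is also a prefix of its immediate neighbour, so checking
    # adjacent pairs is enough.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


# Hypothetical sample codes.
instantaneous = ["0", "10", "110", "111"]
# Reversing each codeword above gives a code in which no codeword is a
# suffix of another. It is still uniquely decodable (decode from the
# right), but "0" is a prefix of "01", so it is not instantaneous.
delayed = ["0", "01", "011", "111"]

print(is_prefix_free(instantaneous))
print(is_prefix_free(delayed))
```

The second code illustrates the last point above: the decoder has to look ahead before it can be sure where a codeword ends.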

Block Code

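As a rough illustration of a block code, the sketch below assigns every symbol a binary codeword of the same length n. The six-letter alphabet `ABCDEF`, the function names, and the assignment of codewords in alphabetical order are all assumptions; the original figure is not available.

```python
def make_block_code(alphabet):
    """Give every symbol a distinct binary codeword of equal length n."""
    # Smallest n such that 2**n >= number of symbols (at least 1 bit).
    n = max(1, (len(alphabet) - 1).bit_length())
    code = {sym: format(i, f"0{n}b") for i, sym in enumerate(alphabet)}
    return code, n


def encode(message, code):
    return "".join(code[sym] for sym in message)


def decode(bits, code, n):
    # Fixed length makes decoding trivial: cut the stream every n bits.
    inverse = {word: sym for sym, word in code.items()}
    return "".join(inverse[bits[i:i + n]] for i in range(0, len(bits), n))


code, n = make_block_code("ABCDEF")  # assumed alphabet
bits = encode("FADDE", code)
print(n, bits, len(bits))
print(decode(bits, code, n))
```

With six symbols the codeword length is 3, so a five-symbol message costs 5 × 3 = 15 bits regardless of how frequent each symbol is.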

Huffman Code

  • instantaneous (prefix) code
  • optimal symbol code
    – it encodes individual source symbols into codewords of variable length
    – there is no other symbol code that achieves a shorter average codeword length
  • derived from the estimated probability of occurrence of individual source symbols

Construction of Huffman code (sketch):

  1. list all possible symbols with their probabilities, and locate the two symbols with the smallest probabilities.
  2. replace them with a single member containing both of them, whose probability is the sum of their probabilities.
  3. repeat steps 1 and 2 until the list contains only one member. (The result can be viewed as a binary tree with the original symbols at the leaves.)
  4. to form the codeword of a symbol, trace the tree from the root down to that symbol's leaf, labelling one branch of every node ‘0’ and the other ‘1’.
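The four steps above can be sketched with a priority queue. This is one possible implementation, not necessarily the one behind the original figure. The symbol weights (probabilities expressed in tenths) are an assumption, since the original probability table is not available, and ties between equal weights are broken by insertion order, which is an arbitrary choice.

```python
import heapq
from itertools import count


def huffman_code(weights):
    """Build a Huffman code from a {symbol: weight} mapping."""
    order = count()  # unique tie-breaker so dicts are never compared
    heap = [(w, next(order), {sym: ""}) for sym, w in weights.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single symbol still needs one bit
        return {sym: "0" for sym in weights}
    while len(heap) > 1:
        # Steps 1-2: take the two least probable members and merge them.
        w0, _, group0 = heapq.heappop(heap)
        w1, _, group1 = heapq.heappop(heap)
        # Step 4, done incrementally: prepend the branch label.
        merged = {s: "0" + word for s, word in group0.items()}
        merged.update({s: "1" + word for s, word in group1.items()})
        heapq.heappush(heap, (w0 + w1, next(order), merged))
    return heap[0][2]  # step 3: only one member is left


# Assumed weights in tenths (D = 0.4, F = 0.2, the rest 0.1 each).
weights = {"A": 1, "B": 1, "C": 1, "D": 4, "E": 1, "F": 2}
code = huffman_code(weights)
encoded = "".join(code[sym] for sym in "FADDE")
print(code)
print(encoded, len(encoded))
```

With these assumed weights the message FADDE takes 12 bits. A different tie-breaking rule may produce different codewords, but the average codeword length stays the same.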

Arithmetic Code

  • codewords are not assigned to individual symbols (i.e., it is not a symbol code)
  • symbols are represented by intervals
  • a stream of source symbols is encoded into a single fraction between 0 and 1
  • can be slightly more efficient than Huffman code, since it is not restricted to a whole number of bits per symbol
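The interval narrowing behind arithmetic coding can be sketched as follows. The probability table and the order of the symbols on the [0, 1) line are assumptions: the original table is not available, and these values were chosen only because they yield the interval 0.54256 to 0.54288 that the text quotes for the message FADDE. Symbols B and C are hypothetical fillers. Exact fractions are used to avoid rounding error.

```python
from fractions import Fraction


def interval_for(message, probs):
    """Return the [low, high) interval that represents the message."""
    # Each symbol owns a slice of [0, 1); slices follow the dict order.
    start, acc = {}, Fraction(0)
    for sym, p in probs.items():
        start[sym] = acc
        acc += p
    low, width = Fraction(0), Fraction(1)
    for sym in message:
        # Zoom into the slice of the current interval owned by sym.
        low += width * start[sym]
        width *= probs[sym]
    return low, low + width


# Assumed table (not from the original figure).
probs = {
    "D": Fraction("0.4"),
    "F": Fraction("0.2"),
    "B": Fraction("0.1"),
    "A": Fraction("0.1"),
    "E": Fraction("0.1"),
    "C": Fraction("0.1"),
}
low, high = interval_for("FADDE", probs)
print(float(low), float(high))
```

Any number inside the final interval identifies the whole message, provided the decoder knows the same table and the message length.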

Suppose we encode the message FADDE:

  • block code of length 3: 15 bits
  • Huffman code: 12 bits
  • arithmetic code: 12 bits
    – encode with any number between 0.54256 and 0.54288, e.g., 0.542724609375, whose binary expansion is 0.100010101111.
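The 12-bit figure for the arithmetic code can be checked by searching for the shortest binary fraction that falls inside the quoted interval. The sketch below uses only the two interval bounds given above; the function name is an assumption, and treating the upper bound as excluded is a convention chosen here.

```python
from fractions import Fraction
from math import ceil


def shortest_binary_fraction(low, high):
    """Return the fewest bits b1 b2 ... with low <= 0.b1b2... < high."""
    bits = 1
    while True:
        # Smallest multiple of 2**-bits that is not below low.
        numerator = ceil(low * 2**bits)
        if Fraction(numerator, 2**bits) < high:
            return format(numerator, f"0{bits}b")
        bits += 1


low, high = Fraction("0.54256"), Fraction("0.54288")
word = shortest_binary_fraction(low, high)
value = Fraction(int(word, 2), 2 ** len(word))
print(word, len(word), float(value))
```

The search stops at 12 bits with the word 100010101111, whose value is 0.542724609375, the same number as in the example above.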
