[Original] Parsing Progressive JPEG Image Data

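To improve the client experience, the client uses progressive JPEG when compressing images. The advantage of progressive JPEG is that a complete, if coarse, image can be decoded from only a small part of the data, and the image keeps sharpening as more data arrives. A second advantage is that every scan (SOS) gets its own optimized Huffman tables, so the average file size tends to be somewhat smaller.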


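For progressive JPEG encoding with libjpeg, calling jpeg_simple_progression(&cinfo); is enough; it installs libjpeg's default scan script. A custom scan script can be used instead, for example through a config_scan_param function like this one: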


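#include <cstdio>    // jpeglib.h needs FILE and size_t declared first
#include <cstring>   // memcpy
#include <jpeglib.h>

// Defined further down; declared here because config_scan_param calls it.
void fill_jpegcrush_scan_script(int & scanno, jpeg_scan_info * scans);

bool config_scan_param(j_compress_ptr cinfo)
{
    int scanno=0;
    jpeg_scan_info * scanptr=NULL;
    jpeg_scan_info scans[100];

    // Fill the array by hand, following the jpegcrush scan table
    fill_jpegcrush_scan_script(scanno, scans);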

    if (scanno > 0) {
        /* Stash completed scan list in cinfo structure.
        * NOTE: for cjpeg's use, JPOOL_IMAGE is the right lifetime for this data,
        * but if you want to compress multiple images you'd want JPOOL_PERMANENT.
        */
        scanptr = (jpeg_scan_info *)
            (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_IMAGE,
            scanno * sizeof(jpeg_scan_info));
        memcpy(scanptr, scans, scanno * sizeof(jpeg_scan_info));
        cinfo->scan_info = scanptr;
        cinfo->num_scans = scanno;
    }

    return true;
}

void fill_jpegcrush_scan_script(int & scanno, jpeg_scan_info * scans)
{
    scanno = 10;

    scans[0].comps_in_scan=1;
    scans[0].component_index[0]=0;
    scans[0].Ss=0;
    scans[0].Se=0;
    scans[0].Ah=0;
    scans[0].Al=0;

    scans[1].comps_in_scan=2;
    scans[1].component_index[0]=1;
    scans[1].component_index[1]=2;
    scans[1].Ss=0;
    scans[1].Se=0;
    scans[1].Ah=0;
    scans[1].Al=0;

    scans[2].comps_in_scan=1;
    scans[2].component_index[0]=0;
    scans[2].Ss=1;
    scans[2].Se=8;
    scans[2].Ah=0;
    scans[2].Al=2;

    scans[3].comps_in_scan=1;
    scans[3].component_index[0]=1;
    scans[3].Ss=1;
    scans[3].Se=8;
    scans[3].Ah=0;
    scans[3].Al=0;

    scans[4].comps_in_scan=1;
    scans[4].component_index[0]=2;
    scans[4].Ss=1;
    scans[4].Se=8;
    scans[4].Ah=0;
    scans[4].Al=0;

    scans[5].comps_in_scan=1;
    scans[5].component_index[0]=0;
    scans[5].Ss=9;
    scans[5].Se=63;
    scans[5].Ah=0;
    scans[5].Al=2;

    scans[6].comps_in_scan=1;
    scans[6].component_index[0]=0;
    scans[6].Ss=1;
    scans[6].Se=63;
    scans[6].Ah=2;
    scans[6].Al=1;

    scans[7].comps_in_scan=1;
    scans[7].component_index[0]=0;
    scans[7].Ss=1;
    scans[7].Se=63;
    scans[7].Ah=1;
    scans[7].Al=0;

    scans[8].comps_in_scan=1;
    scans[8].component_index[0]=1;
    scans[8].Ss=9;
    scans[8].Se=63;
    scans[8].Ah=0;
    scans[8].Al=0;

    scans[9].comps_in_scan=1;
    scans[9].component_index[0]=2;
    scans[9].Ss=9;
    scans[9].Se=63;
    scans[9].Ah=0;
    scans[9].Al=0;
}

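A scan script defines how many scans (SOS segments) the progressively encoded image contains, which components each scan carries, whether the scan holds DC or AC data, and, for AC data, which frequency band it covers.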
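To make the four parameters concrete, here is a minimal sketch that labels a scan as a DC or AC pass and as a first or refinement pass. `ScanParams` and `describe_scan` are hypothetical names that only mirror the Ss/Se/Ah/Al fields of `jpeg_scan_info`, and the checks cover just a few of the rules that libjpeg enforces.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the progression fields of libjpeg's
// jpeg_scan_info; it is not the libjpeg type itself.
struct ScanParams {
    int Ss;  // first zigzag index in the scan
    int Se;  // last zigzag index in the scan
    int Ah;  // 0 for a first pass, else the Al of the previous pass
    int Al;  // number of low-order bits still withheld after this scan
};

// Describe a progressive scan in words, or return "invalid" when the
// parameters break one of the basic rules checked here.
std::string describe_scan(const ScanParams& s) {
    if (s.Ss < 0 || s.Se > 63 || s.Ss > s.Se) return "invalid";
    // DC (index 0) and AC (1..63) may not share a progressive scan.
    if (s.Ss == 0 && s.Se != 0) return "invalid";
    // A refinement pass adds exactly one bit, so Ah must equal Al + 1.
    if (s.Ah != 0 && s.Ah != s.Al + 1) return "invalid";

    std::string kind = (s.Ss == 0) ? "DC" : "AC";
    kind += (s.Ah == 0) ? " first pass" : " refinement";
    return kind;
}
```

Applied to the script above, scans[0] comes out as a DC first pass, scans[2] as an AC first pass, and scans[6] as an AC refinement.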

libjpeg's wizard.txt defines the parameters as follows:

The progression parameters for each scan are:
    Ss    Zigzag index of first coefficient included in scan
    Se    Zigzag index of last coefficient included in scan
    Ah    Zero for first scan of a coefficient, else Al of prior scan
    Al    Successive approximation low bit position for scan
If the progression parameters are omitted, the values 0,63,0,0 are used,
producing a sequential JPEG file.  cjpeg automatically determines whether
the script represents a progressive or sequential file, by observing whether
Ss and Se values other than 0 and 63 appear.  (The -progressive switch is
not needed to specify this; in fact, it is ignored when -scans appears.)
The scan script must meet the JPEG restrictions on progression sequences.
(cjpeg checks that the spec's requirements are obeyed.)

Scan script files are free format, in that arbitrary whitespace can appear
between numbers and around punctuation.  Also, comments can be included: a
comment starts with '#' and extends to the end of the line.  For additional
legibility, commas or dashes can be placed between values.  (Actually, any
single punctuation character other than ':' or ';' can be inserted.)  For
example, the following two scan definitions are equivalent:
    0 1 2: 0 63 0 0;
    0,1,2 : 0-63, 0,0 ;

Here is an example of a scan script that generates a partially interleaved
sequential JPEG file:

    0;            # Y only in first scan
    1 2;          # Cb and Cr in second scan

Here is an example of a progressive scan script using only spectral selection
(no successive approximation):

    # Interleaved DC scan for Y,Cb,Cr:
    0,1,2: 0-0,   0, 0 ;
    # AC scans:
    0:     1-2,   0, 0 ;    # First two Y AC coefficients
    0:     3-5,   0, 0 ;    # Three more
    1:     1-63,  0, 0 ;    # All AC coefficients for Cb
    2:     1-63,  0, 0 ;    # All AC coefficients for Cr
    0:     6-9,   0, 0 ;    # More Y coefficients
    0:     10-63, 0, 0 ;    # Remaining Y coefficients

Here is an example of a successive-approximation script.  This is equivalent
to the default script used by "cjpeg -progressive" for YCbCr images:

    # Initial DC scan for Y,Cb,Cr (lowest bit not sent)
    0,1,2: 0-0,   0, 1 ;
    # First AC scan: send first 5 Y AC coefficients, minus 2 lowest bits:
    0:     1-5,   0, 2 ;
    # Send all Cr,Cb AC coefficients, minus lowest bit:
    # (chroma data is usually too small to be worth subdividing further;
    #  but note we send Cr first since eye is least sensitive to Cb)
    2:     1-63,  0, 1 ;
    1:     1-63,  0, 1 ;
    # Send remaining Y AC coefficients, minus 2 lowest bits:
    0:     6-63,  0, 2 ;
    # Send next-to-lowest bit of all Y AC coefficients:
    0:     1-63,  2, 1 ;
    # At this point we've sent all but the lowest bit of all coefficients.
    # Send lowest bit of DC coefficients
    0,1,2: 0-0,   1, 0 ;
    # Send lowest bit of AC coefficients
    2:     1-63,  1, 0 ;
    1:     1-63,  1, 0 ;
    # Y AC lowest bit scan is last; it's usually the largest scan
    0:     1-63,  1, 0 ;

It may be worth pointing out that this script is tuned for quality settings
of around 50 to 75.  For lower quality settings, you'd probably want to use
a script with fewer stages of successive approximation (otherwise the
initial scans will be really bad).  For higher quality settings, you might
want to use more stages of successive approximation (so that the initial
scans are not too large).

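In practice the default script from jpeg_simple_progression is usually good enough. So what does an image compressed this way look like, and what needs attention when decoding it? It helps to keep the common JPEG markers at hand while reading the dump.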
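As a quick reference, the marker codes that appear in this walkthrough can be collected in a small lookup. This is only a sketch: `marker_name` is a hypothetical helper, not a libjpeg function, and it covers just the markers discussed in this article.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Map the second byte of a marker (the first byte is always 0xFF)
// to its conventional name. Only markers used in this article are listed.
std::string marker_name(std::uint8_t code) {
    if (code >= 0xE0 && code <= 0xEF) return "APP" + std::to_string(code - 0xE0);
    if (code >= 0xD0 && code <= 0xD7) return "RST" + std::to_string(code - 0xD0);
    switch (code) {
        case 0xD8: return "SOI";   // start of image
        case 0xDB: return "DQT";   // quantization table
        case 0xC0: return "SOF0";  // baseline frame header
        case 0xC2: return "SOF2";  // progressive frame header
        case 0xC4: return "DHT";   // Huffman table
        case 0xDA: return "SOS";   // start of scan
        case 0xD9: return "EOI";   // end of image
        default:   return "other";
    }
}
```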

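The test sample is a progressive JPEG image with a compressed size of 25.5 KB.

Opening the image in JPEGsnoop shows the following segment order:

1, SOI, xFFD8, start-of-image marker

2, APP0-15, xFFE0-xFFEF, application segments, which may carry information about the camera, phone system, and so on

3, DQT0, xFFDB, Destination ID=0 (Luminance), i.e. the quantization table for the Y channel

4, DQT1, xFFDB, Destination ID=1 (Chrominance), i.e. the quantization table for the Cb and Cr channels

5, SOF2, xFFC2, start-of-frame marker for progressive encoding. Ordinary baseline encoding uses SOF0, i.e. xFFC0. Its contents are: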

Number of Lines = 960
  Samples per Line = 541
  Image Size = 541 x 960
  Raw Image Orientation = Portrait
  Number of Img components = 3
    Component[1]: ID=0x01, Samp Fac=0x22 (Subsamp 1 x 1), Quant Tbl Sel=0x00 (Lum: Y)
    Component[2]: ID=0x02, Samp Fac=0x11 (Subsamp 2 x 2), Quant Tbl Sel=0x01 (Chrom: Cb)
    Component[3]: ID=0x03, Samp Fac=0x11 (Subsamp 2 x 2), Quant Tbl Sel=0x01 (Chrom: Cr)

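The frame header thus carries the image width and height, the sampling factors, the number of components, the quantization table selected for each component, and so on. What follows is 10 rounds of SOS together with their DHT segments.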
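The Samp Fac byte packs two numbers, and together with the image size they determine the MCU grid of an interleaved scan. The sketch below shows that arithmetic; `split_sampling` and `mcu_count` are hypothetical helper names, not part of libjpeg or JPEGsnoop.

```cpp
#include <cassert>
#include <cstdint>

struct SamplingFactors {
    int h;  // horizontal sampling factor
    int v;  // vertical sampling factor
};

// High nibble = horizontal factor, low nibble = vertical factor.
SamplingFactors split_sampling(std::uint8_t samp_fac) {
    return { samp_fac >> 4, samp_fac & 0x0F };
}

// Number of MCUs along one axis of an interleaved scan: each MCU spans
// 8 * max_factor pixels, and a partial MCU at the edge counts as a whole one.
int mcu_count(int pixels, int max_factor) {
    const int mcu_size = 8 * max_factor;
    return (pixels + mcu_size - 1) / mcu_size;
}
```

For this image (541 x 960, with Y sampled at 2 x 2) an interleaved scan such as the first DC scan covers 34 x 60 MCUs of 16 x 16 pixels each.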

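6, DHT, xFFC4, Destination ID = 0, Class = 0 (DC / Lossless Table). This is DC table 0, normally used for the Y channel.

7, DHT, xFFC4, Destination ID = 1, Class = 0 (DC / Lossless Table). This is DC table 1, normally used for the Cb and Cr channels (some files use DC table 0 for Cb and Cr as well).

8, SOS1, xFFDA, with the following data: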

Scan header length = 12
  Number of img components = 3
    Component[1]: selector=0x01, table=0(DC),0(AC)
    Component[2]: selector=0x02, table=1(DC),0(AC)
    Component[3]: selector=0x03, table=1(DC),0(AC)
  Spectral selection = 0 .. 0
  Successive approximation = 0x01

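The first SOS therefore carries the DC information for all three channels of the image, i.e. the (scaled) mean of each 8*8 block. It uses the two Huffman tables above, DC table 0 and DC table 1. Only one frequency band is sent, namely coefficient 0. Note that Successive approximation = 0x01 splits into Ah=0 and Al=1: the lowest bit of each DC coefficient is not coded in this scan and is sent later, in SOS7. Decoding the data up to this point (3.06 KB) gives the first coarse image.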
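The Successive approximation byte that JPEGsnoop prints is simply Ah and Al packed into two nibbles. A minimal sketch of the split, with `Approximation` and `split_approximation` as hypothetical names:

```cpp
#include <cassert>
#include <cstdint>

struct Approximation {
    int Ah;  // high nibble: 0 for a first pass, else Al of the previous pass
    int Al;  // low nibble: low bit position for this scan
};

// Split the packed byte from the SOS header into its two fields.
Approximation split_approximation(std::uint8_t packed) {
    return { packed >> 4, packed & 0x0F };
}

// A scan is a first pass for its coefficients when Ah is zero.
bool is_first_pass(std::uint8_t packed) {
    return split_approximation(packed).Ah == 0;
}
```

So 0x01 is a first pass that withholds one bit, 0x21 is a refinement that delivers bit 1, and 0x10 is a refinement that delivers bit 0.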

9, DHT, xFFC4, Destination ID = 0, Class = 1 (AC Table). This is AC table 0.

10, SOS2, xFFDA, with the following data:

  Scan header length = 8
  Number of img components = 1
    Component[1]: selector=0x01, table=0(DC),0(AC)
  Spectral selection = 1 .. 5
  Successive approximation = 0x02

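This scan carries AC data for the Y channel, frequency band 1-5, using AC table 0. Successive approximation = 0x02 (Ah=0, Al=2) means the two lowest bits of each coefficient are left out. The image decoded at this point uses 7.67 KB of data.

11, DHT, xFFC4, Destination ID = 1, Class = 1 (AC Table). This is AC table 1.

12, SOS3, xFFDA, with the following data: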

 Scan header length = 8
  Number of img components = 1
    Component[1]: selector=0x03, table=0(DC),1(AC)
  Spectral selection = 1 .. 63
  Successive approximation = 0x01

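This scan carries AC data for component 3 (Cr), frequency band 1-63, using AC table 1, with the lowest bit of each coefficient left out. The image decoded at this point uses 8.19 KB of data.

13, DHT, xFFC4, Destination ID = 1, Class = 1 (AC Table). This is AC table 1.

14, SOS4, xFFDA, with the following data: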

 Scan header length = 8
  Number of img components = 1
    Component[1]: selector=0x02, table=0(DC),1(AC)
  Spectral selection = 1 .. 63
  Successive approximation = 0x01

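This scan carries AC data for component 2 (Cb), frequency band 1-63, using AC table 1, with the lowest bit of each coefficient left out. The image decoded at this point uses 8.92 KB of data.

15, DHT, xFFC4, Destination ID = 0, Class = 1 (AC Table). This is AC table 0.

16, SOS5, xFFDA, with the following data: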

  Scan header length = 8
  Number of img components = 1
    Component[1]: selector=0x01, table=0(DC),0(AC)
  Spectral selection = 6 .. 63
  Successive approximation = 0x02

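This scan carries AC data for component 1 (Y), frequency band 6-63, using AC table 0, with the two lowest bits of each coefficient left out. At this point every frequency band has been sent; only the low-order bits of the coefficients remain. The image decoded at this point uses 15.1 KB of data.

17, DHT, xFFC4, Destination ID = 0, Class = 1 (AC Table). This is AC table 0.

18, SOS6, xFFDA, with the following data: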

  Scan header length = 8
  Number of img components = 1
    Component[1]: selector=0x01, table=0(DC),0(AC)
  Spectral selection = 1 .. 63
  Successive approximation = 0x21

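This scan carries AC data for component 1 (Y), frequency band 1-63, using AC table 0. Successive approximation = 0x21 (Ah=2, Al=1) means only the second-lowest bit of each coefficient is sent. The image decoded at this point uses 22.9 KB of data.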
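The way a coefficient is rebuilt bit by bit across these scans can be illustrated with plain integer arithmetic. This is a simplified sketch, not libjpeg's decoder: it assumes a non-negative coefficient (real AC refinement also has to handle signs and newly non-zero coefficients), and both function names are hypothetical.

```cpp
#include <cassert>

// Value the decoder holds after the first pass of a coefficient: the low
// Al bits are withheld, so they read back as zero.
// Assumption: coef is non-negative.
int after_first_pass(int coef, int Al) {
    return (coef >> Al) << Al;
}

// A refinement pass with low bit position Al adds back exactly bit Al
// of the original coefficient to the partial value.
int after_refinement(int partial, int coef, int Al) {
    return partial | (coef & (1 << Al));
}
```

For a Y AC coefficient of 13 (binary 1101), the first pass with Al=2 leaves 12, the 0x21 refinement adds bit 1 (which is 0, so the value stays 12), and the final 0x10 refinement adds bit 0 and restores 13.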


19, SOS7, xFFDA, with the following data:

  Scan header length = 12
  Number of img components = 3
    Component[1]: selector=0x01, table=0(DC),0(AC)
    Component[2]: selector=0x02, table=0(DC),0(AC)
    Component[3]: selector=0x03, table=0(DC),0(AC)
  Spectral selection = 0 .. 0
  Successive approximation = 0x10

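This scan carries DC data for all three components, frequency band 0. Successive approximation = 0x10 (Ah=1, Al=0) means only the lowest bit of each coefficient is sent. DC refinement bits are written without Huffman coding, so the table selectors shown as 0 are placeholders rather than tables in use. The image decoded at this point uses 23.5 KB of data.

20, DHT, xFFC4, Destination ID = 1, Class = 1 (AC Table). This is AC table 1.

21, SOS8, xFFDA, with the following data: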

 Scan header length = 8
  Number of img components = 1
    Component[1]: selector=0x03, table=0(DC),1(AC)
  Spectral selection = 1 .. 63
  Successive approximation = 0x10

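This scan carries AC data for component 3 (Cr), frequency band 1-63, using AC table 1 (as the scan header shows). Successive approximation = 0x10 (Ah=1, Al=0) means only the lowest bit of each coefficient is sent. The image decoded at this point uses 24.5 KB of data.

22, DHT, xFFC4, Destination ID = 1, Class = 1 (AC Table). This is AC table 1.

23, SOS9, xFFDA, with the following data: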

 Scan header length = 8
  Number of img components = 1
    Component[1]: selector=0x02, table=0(DC),1(AC)
  Spectral selection = 1 .. 63
  Successive approximation = 0x10

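This scan carries AC data for component 2 (Cb), frequency band 1-63, using AC table 1 (as the scan header shows). Successive approximation = 0x10 (Ah=1, Al=0) means only the lowest bit of each coefficient is sent.

24, DHT, xFFC4, Destination ID = 0, Class = 1 (AC Table). This is AC table 0.

25, SOS10, xFFDA, with the following data: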

 Scan header length = 8
  Number of img components = 1
    Component[1]: selector=0x01, table=0(DC),0(AC)
  Spectral selection = 1 .. 63
  Successive approximation = 0x10


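This scan carries AC data for component 1 (Y), frequency band 1-63, using AC table 0. Successive approximation = 0x10 (Ah=1, Al=0) means only the lowest bit of each coefficient is sent.

26, EOI, xFFD9, end-of-image marker.

That is the marker layout of the whole image; the actual entropy-coded image data follows each SOS header. It follows that a progressive image can be decoded from whatever amount of data has been received, as long as it contains at least one complete scan, and the result always has the width and height of the original.

In practice, however, decoding with libjpeg works for most truncation lengths but fails for a few. Stepping through with a debugger showed that decoding fails when the data is cut off inside a marker segment. A marker segment has this format:

marker (e.g. 0xffda) + length (two bytes) + payload, where the length counts its own two bytes plus the payload. So it is enough to check that the marker segment at the cut is complete before decoding; if it is not, move the cut back to before that segment and decode from there.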
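The completeness check can be sketched as a single pass over the received bytes. This is one possible reading of the approach, not the code used in the client: `safe_prefix_length` is a hypothetical helper that works on a plain byte buffer without libjpeg, it returns how many leading bytes can be handed to the decoder without ending inside a marker segment, and it does not check that a complete scan is present.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Length of the longest prefix of `data` that does not end inside a marker
// segment. Cuts inside entropy-coded data are accepted as they are.
std::size_t safe_prefix_length(const std::vector<std::uint8_t>& data) {
    const std::size_t n = data.size();
    std::size_t pos = 0;
    bool in_entropy_data = false;  // true from the end of an SOS header to the next marker

    while (pos < n) {
        if (in_entropy_data) {
            if (data[pos] != 0xFF) { ++pos; continue; }
            if (pos + 1 >= n) return pos;  // lone 0xFF at the end: cut before it
            const std::uint8_t next = data[pos + 1];
            if (next == 0x00 || (next >= 0xD0 && next <= 0xD7)) {
                pos += 2;  // stuffed zero byte or restart marker, still scan data
            } else if (next == 0xFF) {
                ++pos;     // fill byte before a marker
            } else {
                in_entropy_data = false;  // a real marker starts at pos
            }
            continue;
        }
        // Outside scan data every segment must start with 0xFF + marker code.
        if (pos + 2 > n || data[pos] != 0xFF) return pos;
        const std::uint8_t code = data[pos + 1];
        if (code == 0xD8 || code == 0xD9) { pos += 2; continue; }  // SOI, EOI: no length
        if (pos + 4 > n) return pos;  // length field itself is cut off
        const std::size_t len =
            (static_cast<std::size_t>(data[pos + 2]) << 8) | data[pos + 3];
        if (len < 2 || pos + 2 + len > n) return pos;  // payload is cut off
        pos += 2 + len;
        if (code == 0xDA) in_entropy_data = true;  // scan data follows the SOS header
    }
    return pos;
}
```

A caller would pass only the first `safe_prefix_length(buffer)` bytes to the decoder and keep the rest until more data arrives.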


