Okio Stream Source Code Walkthrough, Part 3: GzipSink Compression

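Before reading any source code you need to be comfortable with how the API is used, so let's start there. Of Okio's two gzip classes, we look at GzipSink first.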

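import java.io.File;
import java.io.IOException;

import okio.BufferedSink;
import okio.GzipSink;
import okio.Okio;
import okio.Sink;

/**
 * Compress: write a UTF-8 string into a gzip file.
 */
private static void zipCompress() {
  String filePath = "D:/1.txt";

  try {
    // Innermost sink: writes bytes to the file.
    Sink sink = Okio.sink(new File(filePath));
    // GzipSink decorates the file sink, BufferedSink decorates GzipSink.
    BufferedSink gzipSink = Okio.buffer(new GzipSink(sink));
    gzipSink.writeUtf8("中国好男儿");
    gzipSink.flush();
    gzipSink.close();
  } catch (IOException e) {
    e.printStackTrace();
  }
}
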
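As the previous two parts explained, Okio is built around the decorator pattern: GzipSink decorates Sink, and BufferedSink in turn decorates GzipSink. A write therefore passes through BufferedSink first, is handed to GzipSink, and finally reaches Sink. Think of them as stations on a production line: every time data comes down the line, each station processes it according to its own responsibility and passes it on. The order is BufferedSink -> GzipSink -> Sink. So let's step into BufferedSink's writeUtf8 method: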
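To make that ordering concrete, here is a toy sketch in plain Java. MiniSink and Stage are hypothetical stand-ins invented for this illustration, not Okio types, and unlike the real BufferedSink the toy forwards every write immediately instead of buffering. All it is meant to show is that the outermost decorator sees the data first.

```java
import java.util.ArrayList;
import java.util.List;

public class DecoratorChainSketch {
    /** Hypothetical stand-in for okio.Sink; not the real interface. */
    interface MiniSink {
        void write(String data);
    }

    /** Records its own name when data passes through, then forwards to the wrapped sink. */
    static final class Stage implements MiniSink {
        private final String name;
        private final MiniSink delegate; // null for the innermost sink
        private final List<String> trace;

        Stage(String name, MiniSink delegate, List<String> trace) {
            this.name = name;
            this.delegate = delegate;
            this.trace = trace;
        }

        @Override
        public void write(String data) {
            trace.add(name);
            if (delegate != null) {
                delegate.write(data);
            }
        }
    }

    /** Builds the same nesting as the usage example and returns the order of the stages. */
    static List<String> traceWrite(String data) {
        List<String> trace = new ArrayList<>();
        MiniSink file = new Stage("Sink", null, trace);
        MiniSink gzip = new Stage("GzipSink", file, trace);
        MiniSink buffered = new Stage("BufferedSink", gzip, trace);
        buffered.write(data);
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(traceWrite("hello")); // prints [BufferedSink, GzipSink, Sink]
    }
}
```

With the ordering settled, back to the real thing, BufferedSink's writeUtf8: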

@Override public BufferedSink writeUtf8(String string) throws IOException {
    if (closed) throw new IllegalStateException("closed");
    buffer.writeUtf8(string);
    return emitCompleteSegments();
  }
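Buffer's writeUtf8 method was covered in detail in the previous two parts, so I won't repeat it here: it simply writes the data into the linked list of segments. What still needs explaining is emitCompleteSegments():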

 public BufferedSink emitCompleteSegments() throws IOException {
    if (closed) throw new IllegalStateException("closed");
    long byteCount = buffer.completeSegmentByteCount();
    if (byteCount > 0) sink.write(buffer, byteCount);
    return this;
  }

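After the data has been written into the linked list, completeSegmentByteCount() reports how many bytes sit in segments that are already full. If that number is greater than zero, those full segments are pushed down to the wrapped sink right away, and only the partially filled tail segment stays in the buffer. Next, flush: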
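As a rough illustration of that rule, here is a simplified model of the calculation. It assumes a segment size of 8192 bytes and ignores segment sharing, and it takes the buffer size and the tail segment's pos/limit as plain numbers instead of walking a real segment list, so treat it as a sketch of the idea rather than Okio's code.

```java
public class CompleteSegmentSketch {
    static final int SEGMENT_SIZE = 8192; // assumed segment size for this sketch

    /**
     * Bytes that live in full segments: the whole buffer, minus whatever
     * sits in the tail segment when that tail still has room left.
     */
    static long completeSegmentByteCount(long size, int tailPos, int tailLimit) {
        if (size == 0) {
            return 0;
        }
        long result = size;
        if (tailLimit < SEGMENT_SIZE) {
            result -= tailLimit - tailPos; // leave the partly filled tail in the buffer
        }
        return result;
    }

    public static void main(String[] args) {
        // 15 bytes in a single, partly filled segment: nothing to emit yet.
        System.out.println(completeSegmentByteCount(15, 0, 15)); // prints 0
        // One full segment plus a 100-byte tail: only the full segment is emitted.
        System.out.println(completeSegmentByteCount(8192 + 100, 0, 100)); // prints 8192
    }
}
```

This is why a short write like the 15-byte string in the usage example goes nowhere during writeUtf8: it only leaves the buffer when flush() or close() is called. Which brings us to flush():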

  @Override public void flush() throws IOException {
    if (closed) throw new IllegalStateException("closed");
    if (buffer.size > 0) {
      sink.write(buffer, buffer.size);
    }
    sink.flush();
  }

At this point sink is the GzipSink, so the data is handed over to GzipSink to be compressed. Step into GzipSink's write method:

 @Override public void write(Buffer source, long byteCount) throws IOException {
    if (byteCount < 0) throw new IllegalArgumentException("byteCount < 0: " + byteCount);
    if (byteCount == 0) return;
    // Update the running CRC32 over the uncompressed bytes (checked again on read).
    updateCrc(source, byteCount);
    deflaterSink.write(source, byteCount);
  }
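This method first runs the data in the linked list through CRC32 to maintain a running checksum. The checksum is verified when the file is read back: if the stored value and the recomputed value disagree, the data has been corrupted, and the reader refuses it and throws an exception.

Put simply, CRC (cyclic redundancy check) is one of the most widely used integrity checks for files. It is a checksum, not a cryptographic hash: it catches accidental corruption, but different inputs can share the same value, so it is no defense against deliberate tampering. If you are not familiar with it, it is worth looking up.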
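One property matters for what follows: a CRC32 can be fed piece by piece and still yield the same value as a single pass over all the bytes. A small sketch with java.util.zip.CRC32 shows this; the class and method names (Crc32Sketch, crcOneShot, crcInChunks) are made up for the illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class Crc32Sketch {
    /** CRC32 over the whole array in a single update call. */
    static long crcOneShot(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    /** CRC32 fed in fixed-size pieces, the way a segmented buffer would be walked. */
    static long crcInChunks(byte[] data, int chunkSize) {
        CRC32 crc = new CRC32();
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            crc.update(data, off, len);
        }
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        // Standard CRC-32 check value for the ASCII string "123456789".
        System.out.println(Long.toHexString(crcOneShot(data))); // prints cbf43926
        System.out.println(crcOneShot(data) == crcInChunks(data, 4)); // prints true
    }
}
```

GzipSink's updateCrc does the same kind of walk, one segment at a time: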

 private void updateCrc(Buffer buffer, long byteCount) {
    for (Segment head = buffer.head; byteCount > 0; head = head.next) {
      int segmentLength = (int) Math.min(byteCount, head.limit - head.pos);
      crc.update(head.data, head.pos, segmentLength);
      byteCount -= segmentLength;
    }
  }
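updateCrc walks the segments in the linked list and feeds each one to the CRC32 object. When is the resulting value used? Hold that thought and look at deflaterSink.write(source, byteCount) first. The deflaterSink it delegates to is created in the GzipSink constructor: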
 public GzipSink(Sink sink) {
    if (sink == null) throw new IllegalArgumentException("sink == null");
    this.deflater = new Deflater(DEFAULT_COMPRESSION, true /* No wrap */);
    // Wrap the downstream sink in a new BufferedSink.
    this.sink = Okio.buffer(sink);
    this.deflaterSink = new DeflaterSink(this.sink, deflater);

    writeHeader();
  }

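So the compression itself is implemented in DeflaterSink's write method, which drives the Deflater created above. Step in: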

 @Override public void write(Buffer source, long byteCount) throws IOException {
    checkOffsetAndCount(source.size, 0, byteCount);
    // Loop, feeding the data to be compressed into the deflater one segment at a time.
    while (byteCount > 0) {
      // Share bytes from the head segment of 'source' with the deflater.
      Segment head = source.head;
      int toDeflate = (int) Math.min(byteCount, head.limit - head.pos);
      // Hand this segment's bytes to the deflater as input.
      deflater.setInput(head.data, head.pos, toDeflate);

      // Deflate those bytes into sink.
      deflate(false);

      // Mark those bytes as read.
      source.size -= toDeflate;
      head.pos += toDeflate;
      // Recycle the segment once all of its bytes have been consumed.
      if (head.pos == head.limit) {
        source.head = head.pop();
        SegmentPool.recycle(head);
      }

      byteCount -= toDeflate;
    }
  }
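The gist of this method: it loops over the segments of the source buffer, hands each one to deflater as input, and compresses it. So what on earth is deflater? The field is declared as private final Deflater deflater, and java.util.zip.Deflater is a JDK class that implements DEFLATE compression, the algorithm gzip uses for its body. If this is the first time you have seen the class, no problem: have a look at its documentation, which is friendly enough to include a complete example: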

try {
 // Encode a String into bytes
 String inputString = "blahblahblah??";
 byte[] input = inputString.getBytes("UTF-8");

 // Compress the bytes
 byte[] output = new byte[100];
 Deflater compresser = new Deflater();
 compresser.setInput(input);
 compresser.finish();
 int compressedDataLength = compresser.deflate(output);

 // Decompress the bytes
 Inflater decompresser = new Inflater();
 decompresser.setInput(output, 0, compressedDataLength);
 byte[] result = new byte[100];
 int resultLength = decompresser.inflate(result);
 decompresser.end();

 // Decode the bytes into a String
 String outputString = new String(result, 0, resultLength, "UTF-8");
 } catch(java.io.UnsupportedEncodingException ex) {
     // handle
 } catch (java.util.zip.DataFormatException ex) {
     // handle
 }
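The Javadoc example uses the default zlib wrapper and assumes the output fits into a single 100-byte array. A variant closer to how GzipSink uses the class, sketched here with JDK classes only, passes nowrap = true (raw DEFLATE, no zlib header or trailer) and calls deflate() in a loop with a small output array, much as DeflaterSink fills one segment at a time. The names RawDeflateSketch, deflateRaw and inflateRaw and the 64-byte chunk size are made up for the illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class RawDeflateSketch {
    /** Compresses to raw DEFLATE (nowrap = true), draining the output in small chunks. */
    static byte[] deflateRaw(byte[] input) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[64]; // deliberately small so the loop runs more than once
            while (!deflater.finished()) {
                int n = deflater.deflate(chunk, 0, chunk.length);
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native resources
        }
    }

    /** Decompresses raw DEFLATE data produced by deflateRaw. */
    static byte[] inflateRaw(byte[] compressed) {
        Inflater inflater = new Inflater(true);
        try {
            // The Inflater Javadoc asks for one extra dummy byte of input in nowrap mode.
            inflater.setInput(Arrays.copyOf(compressed, compressed.length + 1));
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[64];
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                if (n == 0 && inflater.needsInput()) {
                    break; // input ended before the stream did
                }
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] original = "blahblahblah blahblahblah blahblahblah".getBytes(StandardCharsets.UTF_8);
        byte[] restored = inflateRaw(deflateRaw(original));
        System.out.println(Arrays.equals(original, restored)); // prints true
    }
}
```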
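So, roughly speaking, Deflater is the utility class that performs the DEFLATE compression behind gzip. Now for the key part, where the compression actually happens: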

 private void deflate(boolean syncFlush) throws IOException {
    Buffer buffer = sink.buffer();
    while (true) {
      Segment s = buffer.writableSegment(1);

      // The 4-parameter overload of deflate() doesn't exist in the RI until
      // Java 1.7, and is public (although with @hide) on Android since 2.3.
      // The @hide tag means that this code won't compile against the Android
      // 2.3 SDK, but it will run fine there.
      int deflated = syncFlush
          ? deflater.deflate(s.data, s.limit, Segment.SIZE - s.limit, Deflater.SYNC_FLUSH)
          : deflater.deflate(s.data, s.limit, Segment.SIZE - s.limit);

      if (deflated > 0) {
        s.limit += deflated;
        buffer.size += deflated;
        // Push any segments that are now full down to the underlying sink.
        sink.emitCompleteSegments();
      } else if (deflater.needsInput()) {
        // No output was produced and the deflater has consumed all of its input.
        if (s.pos == s.limit) {
          // We allocated a tail segment, but didn't end up needing it. Recycle!
          buffer.head = s.pop();
          SegmentPool.recycle(s);
        }
        return;
      }
    }
  }

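To sum up: DeflaterSink.write feeds the segments of the source buffer to the Deflater one at a time, and deflate() calls deflater.deflate(s.data, s.limit, Segment.SIZE - s.limit) to write the compressed output straight into the tail segment of the buffer that belongs to the BufferedSink created in GzipSink's constructor. In other words, the data moves from one linked list of segments into another and gets compressed on the way. And once that is done? Follow gzipSink.close(), which ends up in GzipSink's close method: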

 @Override public void close() throws IOException {
    if (closed) return;

    // This method delegates to the DeflaterSink for finishing the deflate process
    // but keeps responsibility for releasing the deflater's resources. This is
    // necessary because writeFooter needs to query the processed byte count which
    // only works when the deflater is still open.

    Throwable thrown = null;
    try {
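      // Finish the deflate stream and write out the remaining compressed bytes.
      deflaterSink.finishDeflate();
      // Write the gzip trailer (CRC32 and original length).
      writeFooter();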
    } catch (Throwable e) {
      thrown = e;
    }

    try {
      deflater.end();
    } catch (Throwable e) {
      if (thrown == null) thrown = e;
    }

    try {
      sink.close();
    } catch (Throwable e) {
      if (thrown == null) thrown = e;
    }
    closed = true;

    if (thrown != null) Util.sneakyRethrow(thrown);
  }
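The two most important calls in this method are these. The first is deflaterSink.finishDeflate(), which tells the deflater that no more input is coming and then runs deflate one last time to drain whatever compressed output is still pending: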

 void finishDeflate() throws IOException {
    deflater.finish();
    // One last pass to drain the remaining compressed output.
    deflate(false);
  }
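The interplay of write(), deflate() and finishDeflate() can be imitated with the JDK alone. The following is a rough sketch, not Okio's code: plain byte arrays stand in for segments, a ByteArrayOutputStream stands in for the downstream buffer, and the names ChunkedDeflateSketch, drain, deflateInChunks and inflateRaw are invented here. It feeds the input in pieces, drains the deflater after each piece until it asks for more input, and only then calls finish() and drains once more.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class ChunkedDeflateSketch {
    /** Rough analogue of deflate(): pull output until the deflater wants more input. */
    static void drain(Deflater deflater, ByteArrayOutputStream out) {
        byte[] chunk = new byte[256]; // stands in for the writable tail segment
        while (true) {
            int n = deflater.deflate(chunk, 0, chunk.length);
            if (n > 0) {
                out.write(chunk, 0, n);
            } else if (deflater.needsInput()) {
                return; // nothing produced and all input consumed
            }
        }
    }

    /** Rough analogue of write() per segment, followed by finishDeflate(). */
    static byte[] deflateInChunks(byte[] input, int chunkSize) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            for (int off = 0; off < input.length; off += chunkSize) {
                int len = Math.min(chunkSize, input.length - off);
                deflater.setInput(input, off, len); // like one segment of the source buffer
                drain(deflater, out);
            }
            deflater.finish();    // no more input is coming
            drain(deflater, out); // emits whatever the deflater was still holding back
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    /** Helper used only to check the result; inflates raw DEFLATE data. */
    static byte[] inflateRaw(byte[] compressed) {
        Inflater inflater = new Inflater(true);
        try {
            // The Inflater Javadoc asks for one extra dummy byte of input in nowrap mode.
            inflater.setInput(Arrays.copyOf(compressed, compressed.length + 1));
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[256];
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                if (n == 0 && inflater.needsInput()) {
                    break; // input ended before the stream did
                }
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] original = "hello hello hello hello hello".getBytes(StandardCharsets.UTF_8);
        byte[] restored = inflateRaw(deflateInChunks(original, 7));
        System.out.println(Arrays.equals(original, restored)); // prints true
    }
}
```

Without the finish() call the deflater may keep buffered data to itself, which is why close() has to go through finishDeflate() before anything else.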

The second is writeFooter():


  private void writeFooter() throws IOException {
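    // CRC32 of the uncompressed data; the reader recomputes it to detect corruption.
    sink.writeIntLe((int) crc.getValue()); // CRC of original data.
    // Number of uncompressed bytes fed to the deflater, truncated to 32 bits.
    sink.writeIntLe((int) deflater.getBytesRead()); // Length of original data.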
  }
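This method writes the gzip trailer: the CRC32 of the uncompressed data, followed by the length of the uncompressed data, which is what deflater.getBytesRead() reports (the number of input bytes the deflater has consumed). Both are little-endian 32-bit values, and both are checked when the file is read. Attentive readers may have noticed that GzipSink's constructor starts by writing a fixed header into the buffer. What is that for? As you might guess, it is also verified on read: it is the 10-byte gzip header, which opens with the magic number 0x1f8b and the compression method 8 (deflate).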

  private void writeHeader() {
    // Write the Gzip header directly into the buffer for the sink to avoid handling IOException.
    Buffer buffer = this.sink.buffer();
    buffer.writeShort(0x1f8b); // Two-byte Gzip ID.
    buffer.writeByte(0x08); // 8 == Deflate compression method.
    buffer.writeByte(0x00); // No flags.
    buffer.writeInt(0x00); // No modification time.
    buffer.writeByte(0x00); // No extra flags.
    buffer.writeByte(0x00); // No OS.
  }

What exactly the reader does with the header and trailer is covered in the next part, which looks at the read side.
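In the meantime, the layout described above (a 10-byte header, a raw DEFLATE body, then CRC32 and length in little-endian order) can be checked with nothing but the JDK. The sketch below assembles a gzip stream by hand in that layout and lets java.util.zip.GZIPInputStream read it back. ManualGzipSketch and its methods are names invented for this illustration; it mirrors the structure of GzipSink's output but is not Okio's implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.Deflater;
import java.util.zip.GZIPInputStream;

public class ManualGzipSketch {
    /** Writes header, raw DEFLATE body and trailer by hand. */
    static byte[] gzip(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Header: same ten bytes as writeHeader() in the article.
        out.write(0x1f);    // gzip ID, first byte
        out.write(0x8b);    // gzip ID, second byte
        out.write(0x08);    // compression method: deflate
        out.write(0x00);    // no flags
        writeIntLe(out, 0); // no modification time
        out.write(0x00);    // no extra flags
        out.write(0x00);    // OS field
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        try {
            // Body: raw DEFLATE data, no zlib wrapper.
            deflater.setInput(input);
            deflater.finish();
            byte[] chunk = new byte[256];
            while (!deflater.finished()) {
                int n = deflater.deflate(chunk, 0, chunk.length);
                out.write(chunk, 0, n);
            }
            // Trailer: CRC32 and length of the uncompressed data, little-endian.
            CRC32 crc = new CRC32();
            crc.update(input, 0, input.length);
            writeIntLe(out, (int) crc.getValue());
            writeIntLe(out, (int) deflater.getBytesRead());
        } finally {
            deflater.end();
        }
        return out.toByteArray();
    }

    /** Writes a 32-bit value with the least significant byte first. */
    static void writeIntLe(ByteArrayOutputStream out, int value) {
        out.write(value & 0xff);
        out.write((value >>> 8) & 0xff);
        out.write((value >>> 16) & 0xff);
        out.write((value >>> 24) & 0xff);
    }

    /** Reads the stream back with the JDK's own gzip reader, which checks the trailer. */
    static byte[] gunzip(byte[] gz) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gz))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[256];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] original = "hello gzip, hello gzip".getBytes(StandardCharsets.UTF_8);
        String restored = new String(gunzip(gzip(original)), StandardCharsets.UTF_8);
        System.out.println(restored); // prints hello gzip, hello gzip
    }
}
```

If a byte of the stored CRC is changed before calling gunzip, GZIPInputStream rejects the stream, which is the read-side check the trailer exists for.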