2019-02-12 Delta Updates -- Google archive-patcher (Delta File Generation)

Project: https://github.com/andrewhayden/archive-patcher

Project layout:
1. generator - the delta generation module
2. applier - the delta application module



I made a few small changes to the program entry point.


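For reference, a minimal generator entry point, adapted from the sample in the project README (a sketch; the file names are illustrative, and the README's sample compresses the patch on the fly with a Deflater):

import com.google.archivepatcher.generator.FileByFileV1DeltaGenerator;
import com.google.archivepatcher.shared.DefaultDeflateCompatibilityWindow;
import java.io.File;
import java.io.FileOutputStream;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

public class SamplePatchGenerator {
  public static void main(String... args) throws Exception {
    // The applier later recompresses with the JRE's zlib, so bail out early if
    // this JVM's zlib does not match the known-good configurations.
    if (!new DefaultDeflateCompatibilityWindow().isCompatible()) {
      System.err.println("zlib not compatible on this system");
      System.exit(-1);
    }
    Deflater compressor = new Deflater(9, true); // to compress the patch itself
    try (FileOutputStream patchOut = new FileOutputStream("patch.gz");
        DeflaterOutputStream compressedPatchOut =
            new DeflaterOutputStream(patchOut, compressor, 32768)) {
      new FileByFileV1DeltaGenerator()
          .generateDelta(new File("old.zip"), new File("new.zip"), compressedPatchOut);
      compressedPatchOut.finish();
      compressedPatchOut.flush();
    } finally {
      compressor.end();
    }
  }
}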

I created a few files in a directory on disk for the comparison.



1. Comparing new.zip against old.zip with Beyond Compare makes the differences between the two archives obvious.



2. archive-patcher is then used to produce the delta file patch1; opening it in Sublime shows that it is raw binary data.

3. Next, old.zip and patch1 are used to generate new1.zip, and the freshly generated new1.zip is compared against new.zip.



The two turn out to be byte-for-byte identical.
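The applying side (what produced new1.zip in step 3) is just as small; a sketch based on the README's sample applier, assuming the patch was Deflater-compressed as in the generator sketch above:

import com.google.archivepatcher.applier.FileByFileV1DeltaApplier;
import com.google.archivepatcher.shared.DefaultDeflateCompatibilityWindow;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.zip.Inflater;
import java.util.zip.InflaterInputStream;

public class SamplePatchApplier {
  public static void main(String... args) throws Exception {
    if (!new DefaultDeflateCompatibilityWindow().isCompatible()) {
      System.err.println("zlib not compatible on this system");
      System.exit(-1);
    }
    Inflater uncompressor = new Inflater(true); // to uncompress the patch itself
    try (FileInputStream compressedPatchIn = new FileInputStream("patch.gz");
        InflaterInputStream patchIn =
            new InflaterInputStream(compressedPatchIn, uncompressor, 32768);
        FileOutputStream newFileOut = new FileOutputStream("new1.zip")) {
      new FileByFileV1DeltaApplier().applyDelta(new File("old.zip"), patchIn, newFileOut);
    } finally {
      uncompressor.end();
    }
  }
}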

Now let's look at the source code that generates the delta patch.
archive-patcher first parses the old.zip and new.zip files, then uncompresses the entries inside and compares them one by one to produce delta-friendly files.

Delta file generation

/**
 * Generate a V1 patch for the specified input files and write the patch to the specified {@link
 * OutputStream}. The written patch is raw, i.e. it has not been compressed. Compression
 * should almost always be applied to the patch, either right in the specified {@link
 * OutputStream} or in a post-processing step, prior to transmitting the patch to the patch
 * applier.
 *
 * @param oldFile the original old file to read (will not be modified)
 * @param newFile the original new file to read (will not be modified)
 * @param patchOut the stream to write the patch to
 * @throws IOException if unable to complete the operation due to an I/O error
 * @throws InterruptedException if any thread has interrupted the current thread
 */
@Override
public void generateDelta(File oldFile, File newFile, OutputStream patchOut)
    throws IOException, InterruptedException {
  try (TempFileHolder deltaFriendlyOldFile = new TempFileHolder();
      TempFileHolder deltaFriendlyNewFile = new TempFileHolder();
      TempFileHolder deltaFile = new TempFileHolder();
      FileOutputStream deltaFileOut = new FileOutputStream(deltaFile.file);
      BufferedOutputStream bufferedDeltaOut = new BufferedOutputStream(deltaFileOut)) {
    PreDiffExecutor.Builder builder =
        new PreDiffExecutor.Builder()
            .readingOriginalFiles(oldFile, newFile)
            .writingDeltaFriendlyFiles(deltaFriendlyOldFile.file, deltaFriendlyNewFile.file);
    for (RecommendationModifier modifier : recommendationModifiers) {
      builder.withRecommendationModifier(modifier);
    }
    PreDiffExecutor executor = builder.build();
    PreDiffPlan preDiffPlan = executor.prepareForDiffing();
    DeltaGenerator deltaGenerator = getDeltaGenerator();
    deltaGenerator.generateDelta(
        deltaFriendlyOldFile.file, deltaFriendlyNewFile.file, bufferedDeltaOut);
    bufferedDeltaOut.close();
    PatchWriter patchWriter =
        new PatchWriter(
            preDiffPlan,
            deltaFriendlyOldFile.file.length(),
            deltaFriendlyNewFile.file.length(),
            deltaFile.file);
    patchWriter.writeV1Patch(patchOut);
  }
}

1. Create three temp files, automatically deleted when the JVM exits (the delta-friendly version of the old archive, the delta-friendly version of the new archive, and the raw delta file).

2. Build a PreDiffPlan object.

3. Write the patch file through PatchWriter.
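Step 1 relies on TempFileHolder. A minimal reconstruction of what such a holder must look like (a sketch, not the verbatim source): create a temp file, mark it deleteOnExit() as a safety net, and delete it eagerly when the try-with-resources block in generateDelta closes it.

import java.io.Closeable;
import java.io.File;
import java.io.IOException;

public class TempFileHolder implements Closeable {
  public final File file;

  public TempFileHolder() throws IOException {
    file = File.createTempFile("archive_patcher", "tmp"); // names are illustrative
    file.deleteOnExit(); // safety net: deleted at JVM exit even if close() is missed
  }

  @Override
  public void close() throws IOException {
    file.delete(); // eager cleanup when the try-with-resources block ends
  }
}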

First, look at how PreDiffExecutor builds the PreDiffPlan via prepareForDiffing():

/**
 * Prepare resources for diffing and returns the completed plan.
 *
 * @return the plan
 * @throws IOException if unable to complete the operation due to an I/O error
 */
public PreDiffPlan prepareForDiffing() throws IOException {
  PreDiffPlan preDiffPlan = generatePreDiffPlan();
  List<TypedRange<JreDeflateParameters>> deltaFriendlyNewFileRecompressionPlan = null;
  if (deltaFriendlyOldFile != null) {
    // Builder.writingDeltaFriendlyFiles() ensures old and new are non-null when called, so a
    // check on either is sufficient.
    deltaFriendlyNewFileRecompressionPlan =
        Collections.unmodifiableList(generateDeltaFriendlyFiles(preDiffPlan));
  }
  return new PreDiffPlan(
      preDiffPlan.getQualifiedRecommendations(),
      preDiffPlan.getOldFileUncompressionPlan(),
      preDiffPlan.getNewFileUncompressionPlan(),
      deltaFriendlyNewFileRecompressionPlan);
}

/**
 * Analyze the original old and new files and generate a plan to transform them into their
 * delta-friendly equivalents.
 *
 * @return the plan, which does not yet contain information for recompressing the delta-friendly
 *     new archive.
 * @throws IOException if anything goes wrong
 */
private PreDiffPlan generatePreDiffPlan() throws IOException {
  Map<ByteArrayHolder, MinimalZipEntry> originalOldArchiveZipEntriesByPath =
      new HashMap<ByteArrayHolder, MinimalZipEntry>();
  Map<ByteArrayHolder, MinimalZipEntry> originalNewArchiveZipEntriesByPath =
      new HashMap<ByteArrayHolder, MinimalZipEntry>();
  Map<ByteArrayHolder, JreDeflateParameters> originalNewArchiveJreDeflateParametersByPath =
      new HashMap<ByteArrayHolder, JreDeflateParameters>();

  for (MinimalZipEntry zipEntry : MinimalZipArchive.listEntries(originalOldFile)) {
    ByteArrayHolder key = new ByteArrayHolder(zipEntry.getFileNameBytes());
    originalOldArchiveZipEntriesByPath.put(key, zipEntry);
  }

  DefaultDeflateCompressionDiviner diviner = new DefaultDeflateCompressionDiviner();
  for (DivinationResult divinationResult : diviner.divineDeflateParameters(originalNewFile)) {
    ByteArrayHolder key =
        new ByteArrayHolder(divinationResult.minimalZipEntry.getFileNameBytes());
    originalNewArchiveZipEntriesByPath.put(key, divinationResult.minimalZipEntry);
    originalNewArchiveJreDeflateParametersByPath.put(key, divinationResult.divinedParameters);
  }

  PreDiffPlanner preDiffPlanner =
      new PreDiffPlanner(
          originalOldFile,
          originalOldArchiveZipEntriesByPath,
          originalNewFile,
          originalNewArchiveZipEntriesByPath,
          originalNewArchiveJreDeflateParametersByPath,
          recommendationModifiers.toArray(new RecommendationModifier[] {}));
  return preDiffPlanner.generatePreDiffPlan();
}

ByteArrayHolder overrides hashCode() and equals() so that raw file-name bytes can be used as map keys.
MinimalZipArchive.listEntries was analyzed in an earlier post.
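A minimal reconstruction of the relevant parts of ByteArrayHolder (a sketch, not the full class): plain arrays use identity-based equals/hashCode, which would break the path lookups above, so the bytes are wrapped.

import java.util.Arrays;

public final class ByteArrayHolder {
  private final byte[] data;

  public ByteArrayHolder(byte[] data) {
    this.data = data;
  }

  @Override
  public int hashCode() {
    // Content-based hash so equal file names land in the same bucket.
    return 31 + Arrays.hashCode(data);
  }

  @Override
  public boolean equals(Object obj) {
    if (this == obj) return true;
    if (!(obj instanceof ByteArrayHolder)) return false;
    return Arrays.equals(data, ((ByteArrayHolder) obj).data);
  }
}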
Next, look at diviner.divineDeflateParameters:

/**
 * Load the specified archive and attempt to divine deflate parameters for all entries within.
 * @param archiveFile the archive file to work on
 * @return a list of results for each entry in the archive, in file order (not central directory
 * order). There is exactly one result per entry, regardless of whether or not that entry is
 * compressed. Callers can filter results by checking
 * {@link MinimalZipEntry#getCompressionMethod()} to see if the result is or is not compressed,
 * and by checking whether a non-null {@link JreDeflateParameters} was obtained.
 * @throws IOException if unable to read or parse the file
 * @see DivinationResult 
 */
public List<DivinationResult> divineDeflateParameters(File archiveFile) throws IOException {
  List<DivinationResult> results = new ArrayList<>();
  for (MinimalZipEntry minimalZipEntry : MinimalZipArchive.listEntries(archiveFile)) {
    JreDeflateParameters divinedParameters = null;
    if (minimalZipEntry.isDeflateCompressed()) {
      // TODO(pasc): Reuse streams to avoid churning file descriptors
      MultiViewInputStreamFactory isFactory =
          new RandomAccessFileInputStreamFactory(
              archiveFile,
              minimalZipEntry.getFileOffsetOfCompressedData(),
              minimalZipEntry.getCompressedSize());

      // Keep small entries in memory to avoid unnecessary file I/O.
      if (minimalZipEntry.getCompressedSize() < (100 * 1024)) {
        try (InputStream is = isFactory.newStream()) {
          byte[] compressedBytes = new byte[(int) minimalZipEntry.getCompressedSize()];
          is.read(compressedBytes);
          divinedParameters =
              divineDeflateParameters(new ByteArrayInputStreamFactory(compressedBytes));
        } catch (Exception ignore) {
          divinedParameters = null;
        }
      } else {
        divinedParameters = divineDeflateParameters(isFactory);
      }
    }
    results.add(new DivinationResult(minimalZipEntry, divinedParameters));
  }
  return results;
}

1. Parse the zip file structure to obtain each MinimalZipEntry.

2. Check whether the MinimalZipEntry is deflate-compressed.

3. Create re-readable streams over the compressed bytes via MultiViewInputStreamFactory.

4. If the compressed data is smaller than 100 KB, read it straight into memory to avoid unnecessary file I/O.

5. Divine the compression parameters (JreDeflateParameters) from the compressed stream.
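The search space for step 5 is small. LEVELS_BY_STRATEGY, used by the divining loop below, presumably looks like this (a hedged reconstruction, as a fragment): strategies 0 and 1 try all levels 1-9, while strategy 2 (Huffman-only) ignores the level, so the worst case is 2 nowrap values x (9 + 9 + 1) = 38 deflate configurations to probe.

private static final Map<Integer, List<Integer>> LEVELS_BY_STRATEGY = getLevelsByStrategy();

private static Map<Integer, List<Integer>> getLevelsByStrategy() {
  Map<Integer, List<Integer>> levelsByStrategy = new HashMap<>();
  // All levels are meaningful for the default strategy (0) and filtered (1).
  levelsByStrategy.put(0, Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9));
  levelsByStrategy.put(1, Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9));
  // Strategy 2 has no concept of levels; a single probe stands in for all.
  levelsByStrategy.put(2, Collections.singletonList(1));
  return Collections.unmodifiableMap(levelsByStrategy);
}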

Now the divining of the JreDeflateParameters themselves:

/**
 * Determines the original {@link JreDeflateParameters} that were used to compress a given piece
 * of deflated data.
 *
 * @param compressedDataInputStreamFactory a {@link MultiViewInputStreamFactory} that can provide
 *     multiple independent {@link InputStream} instances for the compressed data.
 * @return the parameters that can be used to replicate the compressed data in the {@link
 *     DefaultDeflateCompatibilityWindow}, if any; otherwise null. Note that
 *     null is also returned in the case of corrupt zip data since, by definition,
 *     it cannot be replicated via any combination of normal deflate parameters.
 * @throws IOException if there is a problem reading the data, i.e. if the file contents are
 *     changed while reading
 */
public JreDeflateParameters divineDeflateParameters(
    MultiViewInputStreamFactory compressedDataInputStreamFactory) throws IOException {
  byte[] copyBuffer = new byte[32 * 1024];
  // Iterate over all relevant combinations of nowrap, strategy and level.
  for (boolean nowrap : new boolean[] {true, false}) {
    Inflater inflater = new Inflater(nowrap);
    Deflater deflater = new Deflater(0, nowrap);

    strategy_loop:
    for (int strategy : new int[] {0, 1, 2}) {
      deflater.setStrategy(strategy);
      for (int level : LEVELS_BY_STRATEGY.get(strategy)) {
        deflater.setLevel(level);
        inflater.reset();
        deflater.reset();
        try {
          if (matches(inflater, deflater, compressedDataInputStreamFactory, copyBuffer)) {
            end(inflater, deflater);
            return JreDeflateParameters.of(level, strategy, nowrap);
          }
        } catch (ZipException e) {
          // Parse error in input. The only possibilities are corruption or the wrong nowrap.
          // Skip all remaining levels and strategies.
          break strategy_loop;
        }
      }
    }
    end(inflater, deflater);
  }
  return null;
}

/**
 * Checks whether the specified deflater will produce the same compressed data as the byte
 * stream in the specified input stream factory.
 *
 * @param inflater the inflater for uncompressing the stream
 * @param deflater the deflater for recompressing the output of the inflater
 * @param copyBuffer buffer to use for copying bytes between the inflater and the deflater
 * @return true if the specified deflater reproduces the bytes in compressedDataIn, otherwise
 *     false
 * @throws IOException if anything goes wrong; in particular, {@link ZipException} is thrown if
 *     there is a problem parsing compressedDataIn
 */
private boolean matches(
    Inflater inflater,
    Deflater deflater,
    MultiViewInputStreamFactory compressedDataInputStreamFactory,
    byte[] copyBuffer)
    throws IOException {

  try (MatchingOutputStream matcher =
          new MatchingOutputStream(
              compressedDataInputStreamFactory.newStream(), copyBuffer.length);
      InflaterInputStream inflaterIn =
          new InflaterInputStream(
              compressedDataInputStreamFactory.newStream(), inflater, copyBuffer.length);
      DeflaterOutputStream out = new DeflaterOutputStream(matcher, deflater, copyBuffer.length)) {
    int numRead;
    while ((numRead = inflaterIn.read(copyBuffer)) >= 0) {
      out.write(copyBuffer, 0, numRead);
    }
    // When done, all bytes have been successfully recompressed. For sanity, check that
    // the matcher has consumed the same number of bytes and arrived at EOF as well.
    out.finish();
    out.flush();
    matcher.expectEof();
    // At this point the data in the compressed output stream was a perfect match for the
    // data in the compressed input stream; the answer has been found.
    return true;
  } catch (MismatchException e) {
    // Fast-fail case when the compressed output stream doesn't match the compressed input
    // stream. These are not the parameters you're looking for!
    return false;
  }
}

The loop enumerates the three parameters: nowrap, strategy and level.
The entry's data is uncompressed, then recompressed under each candidate combination, and the result is compared against the compressed bytes stored in the zip; when they match, the three values have been found.
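Once divined, those parameters are all that is needed to reproduce the compressed bytes. A hedged sketch of the idea (as a fragment; JreDeflateParameters from the library exposes level, strategy and nowrap as public fields):

// Sketch: recompress raw data with divined parameters. If the parameters are
// right, the output is byte-identical to the compressed data in the zip entry.
static byte[] recompress(byte[] uncompressedData, JreDeflateParameters params) {
  Deflater deflater = new Deflater(params.level, params.nowrap);
  deflater.setStrategy(params.strategy);
  deflater.setInput(uncompressedData);
  deflater.finish();
  ByteArrayOutputStream out = new ByteArrayOutputStream();
  byte[] buffer = new byte[32 * 1024];
  while (!deflater.finished()) {
    int numBytes = deflater.deflate(buffer);
    out.write(buffer, 0, numBytes);
  }
  deflater.end();
  return out.toByteArray();
}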

Back to preDiffPlanner.generatePreDiffPlan():

/**
 * Generates and returns the plan for archive transformations to be made prior to differencing.
 * The resulting {@link PreDiffPlan} has the old and new file uncompression plans set. The
 * delta-friendly new file recompression plan is not set at this time.
 * @return the plan
 * @throws IOException if there are any problems reading the input files
 */
PreDiffPlan generatePreDiffPlan() throws IOException {
  List<QualifiedRecommendation> recommendations = getDefaultRecommendations();
  for (RecommendationModifier modifier : recommendationModifiers) {
    // Allow changing the recommendations based on arbitrary criteria.
    recommendations = modifier.getModifiedRecommendations(oldFile, newFile, recommendations);
  }

  // Process recommendations to extract ranges for decompression & recompression
  Set<TypedRange<Void>> oldFilePlan = new HashSet<>();
  Set<TypedRange<JreDeflateParameters>> newFilePlan = new HashSet<>();
  for (QualifiedRecommendation recommendation : recommendations) {
    if (recommendation.getRecommendation().uncompressOldEntry) {
      long offset = recommendation.getOldEntry().getFileOffsetOfCompressedData();
      long length = recommendation.getOldEntry().getCompressedSize();
      TypedRange<Void> range = new TypedRange<Void>(offset, length, null);
      oldFilePlan.add(range);
    }
    if (recommendation.getRecommendation().uncompressNewEntry) {
      long offset = recommendation.getNewEntry().getFileOffsetOfCompressedData();
      long length = recommendation.getNewEntry().getCompressedSize();
      JreDeflateParameters newJreDeflateParameters =
          newArchiveJreDeflateParametersByPath.get(
              new ByteArrayHolder(recommendation.getNewEntry().getFileNameBytes()));
      TypedRange<JreDeflateParameters> range =
          new TypedRange<JreDeflateParameters>(offset, length, newJreDeflateParameters);
      newFilePlan.add(range);
    }
  }

  List<TypedRange<Void>> oldFilePlanList = new ArrayList<>(oldFilePlan);
  Collections.sort(oldFilePlanList);
  List<TypedRange<JreDeflateParameters>> newFilePlanList = new ArrayList<>(newFilePlan);
  Collections.sort(newFilePlanList);
  return new PreDiffPlan(
      Collections.unmodifiableList(recommendations),
      Collections.unmodifiableList(oldFilePlanList),
      Collections.unmodifiableList(newFilePlanList));
}

First, the default recommendations for what to uncompress:

/**
 * Analyzes the input files and returns the default recommendations for each entry in the new
 * archive.
 *
 * @return the recommendations
 * @throws IOException if anything goes wrong
 */
private List<QualifiedRecommendation> getDefaultRecommendations() throws IOException {
  List<QualifiedRecommendation> recommendations = new ArrayList<>();

  // This will be used to find files that have been renamed, but not modified. This is relatively
  // cheap to construct as it just requires indexing all entries by the uncompressed CRC32, and
  // the CRC32 is already available in the ZIP headers.
  SimilarityFinder trivialRenameFinder =
      new Crc32SimilarityFinder(oldFile, oldArchiveZipEntriesByPath.values());

  // Iterate over every pair of entries and get a recommendation for what to do.
  for (Map.Entry<ByteArrayHolder, MinimalZipEntry> newEntry :
      newArchiveZipEntriesByPath.entrySet()) {
    ByteArrayHolder newEntryPath = newEntry.getKey();
    MinimalZipEntry oldZipEntry = oldArchiveZipEntriesByPath.get(newEntryPath);
    if (oldZipEntry == null) {
      // The path is only present in the new archive, not in the old archive. Try to find a
      // similar file in the old archive that can serve as a diff base for the new file.
      List<MinimalZipEntry> identicalEntriesInOldArchive =
          trivialRenameFinder.findSimilarFiles(newFile, newEntry.getValue());
      if (!identicalEntriesInOldArchive.isEmpty()) {
        // An identical file exists in the old archive at a different path. Use it for the
        // recommendation and carry on with the normal logic.
        // All entries in the returned list are identical, so just pick the first one.
        // NB, in principle it would be optimal to select the file that required the least work
        // to apply the patch - in practice, it is unlikely that an archive will contain multiple
        // copies of the same file that are compressed differently, so don't bother with that
        // degenerate case.
        oldZipEntry = identicalEntriesInOldArchive.get(0);
      }
    }

    // If the attempt to find a suitable diff base for the new entry has failed, oldZipEntry is
    // null (nothing to do in that case). Otherwise, there is an old entry that is relevant, so
    // get a recommendation for what to do.
    if (oldZipEntry != null) {
      recommendations.add(getRecommendation(oldZipEntry, newEntry.getValue()));
    }
  }
  return recommendations;
}

This walks the entry structures of old.zip and new.zip and compares uncompressed CRC32 values (already present in the zip headers, so this is cheap) to find files that were renamed but not modified, so they can still serve as diff bases.
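A sketch of the idea behind Crc32SimilarityFinder (not the actual class; the method name is illustrative, and matching on CRC32 alone assumes no collisions):

// Index old entries by uncompressed CRC32, then look up each new entry by its
// CRC32 to detect renamed-but-unmodified files.
static List<MinimalZipEntry> findRenameCandidates(
    Collection<MinimalZipEntry> oldEntries, MinimalZipEntry newEntry) {
  Map<Long, List<MinimalZipEntry>> oldEntriesByCrc32 = new HashMap<>();
  for (MinimalZipEntry oldEntry : oldEntries) {
    oldEntriesByCrc32
        .computeIfAbsent(oldEntry.getCrc32OfUncompressedData(), k -> new ArrayList<>())
        .add(oldEntry);
  }
  // Any old entry with the same CRC32 can serve as the diff base.
  return oldEntriesByCrc32.getOrDefault(
      newEntry.getCrc32OfUncompressedData(), Collections.emptyList());
}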

/**
 * Determines the right {@link QualifiedRecommendation} for handling the (oldEntry, newEntry)
 * tuple.
 * @param oldEntry the entry in the old archive
 * @param newEntry the entry in the new archive
 * @return the recommendation
 * @throws IOException if there are any problems reading the input files
 */
private QualifiedRecommendation getRecommendation(MinimalZipEntry oldEntry, MinimalZipEntry newEntry)
    throws IOException {

  // Reject anything that is unsuitable for uncompressed diffing.
  // Reason singled out in order to monitor unsupported versions of zlib.
  if (unsuitableDeflate(newEntry)) {
    return new QualifiedRecommendation(
        oldEntry,
        newEntry,
        Recommendation.UNCOMPRESS_NEITHER,
        RecommendationReason.DEFLATE_UNSUITABLE);
  }

  // Reject anything that is unsuitable for uncompressed diffing.
  if (unsuitable(oldEntry, newEntry)) {
    return new QualifiedRecommendation(
        oldEntry,
        newEntry,
        Recommendation.UNCOMPRESS_NEITHER,
        RecommendationReason.UNSUITABLE);
  }

  // If both entries are already uncompressed there is nothing to do.
  if (bothEntriesUncompressed(oldEntry, newEntry)) {
    return new QualifiedRecommendation(
        oldEntry,
        newEntry,
        Recommendation.UNCOMPRESS_NEITHER,
        RecommendationReason.BOTH_ENTRIES_UNCOMPRESSED);
  }

  // The following are now true:
  // 1. At least one of the entries is compressed.
  // 2. The old entry is either uncompressed, or is compressed with deflate.
  // 3. The new entry is either uncompressed, or is reproducibly compressed with deflate.

  if (uncompressedChangedToCompressed(oldEntry, newEntry)) {
    return new QualifiedRecommendation(
        oldEntry,
        newEntry,
        Recommendation.UNCOMPRESS_NEW,
        RecommendationReason.UNCOMPRESSED_CHANGED_TO_COMPRESSED);
  }

  if (compressedChangedToUncompressed(oldEntry, newEntry)) {
    return new QualifiedRecommendation(
        oldEntry,
        newEntry,
        Recommendation.UNCOMPRESS_OLD,
        RecommendationReason.COMPRESSED_CHANGED_TO_UNCOMPRESSED);
  }

  // At this point, both entries must be compressed with deflate.
  if (compressedBytesChanged(oldEntry, newEntry)) {
    return new QualifiedRecommendation(
        oldEntry,
        newEntry,
        Recommendation.UNCOMPRESS_BOTH,
        RecommendationReason.COMPRESSED_BYTES_CHANGED);
  }

  // If the compressed bytes have not changed, there is no need to do anything.
  return new QualifiedRecommendation(
      oldEntry,
      newEntry,
      Recommendation.UNCOMPRESS_NEITHER,
      RecommendationReason.COMPRESSED_BYTES_IDENTICAL);
}

Different recommendations are made depending on whether oldEntry and newEntry are compressed.
There are seven possible outcomes:

1. The new entry is deflate-compressed, but its JreDeflateParameters could not be divined; that is, the compression level, encoding strategy and nowrap values are unknown. Without those three parameters the data cannot be recompressed, so the recommendation is to uncompress neither entry, with the reason that no suitable deflate parameters can reproduce the compressed data (DEFLATE_UNSUITABLE).

2. The old or new entry is compressed, but with an unsupported compression algorithm: uncompress neither, because an unsupported algorithm was used (UNSUITABLE).

3. Neither the old nor the new entry is compressed: uncompress neither, because there is nothing to uncompress (BOTH_ENTRIES_UNCOMPRESSED).

4. The old entry is uncompressed and the new entry is compressed: uncompress the new entry, because an uncompressed file became a compressed one (UNCOMPRESSED_CHANGED_TO_COMPRESSED).

5. The old entry is compressed and the new entry is uncompressed: uncompress the old entry, because a compressed file became an uncompressed one (COMPRESSED_CHANGED_TO_UNCOMPRESSED).

6. Both entries are compressed and the compressed bytes changed: uncompress both, because the file changed (COMPRESSED_BYTES_CHANGED).

7. Both entries are compressed and nothing changed: uncompress neither, because the file did not change (COMPRESSED_BYTES_IDENTICAL).

(A question I had: does a zip file have one overall compression level, encoding strategy and nowrap, or does each entry carry its own? Each entry in a zip is compressed independently, so these are per-entry parameters, which is why they are divined entry by entry.)

Next, how the delta-friendly files are generated:

/**
 * Generate the delta-friendly files and return the plan for recompressing the delta-friendly new
 * file back into the original new file.
 *
 * @param preDiffPlan the plan to execute
 * @return as described
 * @throws IOException if anything goes wrong
 */
private List<TypedRange<JreDeflateParameters>> generateDeltaFriendlyFiles(PreDiffPlan preDiffPlan)
    throws IOException {
  try (FileOutputStream out = new FileOutputStream(deltaFriendlyOldFile);
      BufferedOutputStream bufferedOut = new BufferedOutputStream(out)) {
    DeltaFriendlyFile.generateDeltaFriendlyFile(
        preDiffPlan.getOldFileUncompressionPlan(), originalOldFile, bufferedOut);
  }
  try (FileOutputStream out = new FileOutputStream(deltaFriendlyNewFile);
      BufferedOutputStream bufferedOut = new BufferedOutputStream(out)) {
    return DeltaFriendlyFile.generateDeltaFriendlyFile(
        preDiffPlan.getNewFileUncompressionPlan(), originalNewFile, bufferedOut);
  }
}

/**
 * Invoke {@link #generateDeltaFriendlyFile(List, File, OutputStream, boolean, int)} with 
 * generateInverse set to true and a copy buffer size of {@link
 * #DEFAULT_COPY_BUFFER_SIZE}.
 *
 * @param <T> the type of the data associated with the ranges
 * @param rangesToUncompress the ranges to be uncompressed during transformation to a
 *     delta-friendly form
 * @param file the file to read from
 * @param deltaFriendlyOut a stream to write the delta-friendly file to
 * @return the ranges in the delta-friendly file that correspond to the ranges in the original
 *     file, with identical metadata and in the same order
 * @throws IOException if anything goes wrong
 */
public static <T> List<TypedRange<T>> generateDeltaFriendlyFile(
    List<TypedRange<T>> rangesToUncompress, File file, OutputStream deltaFriendlyOut)
    throws IOException {
  return generateDeltaFriendlyFile(
      rangesToUncompress, file, deltaFriendlyOut, true, DEFAULT_COPY_BUFFER_SIZE);
}

/**
 * Generate one delta-friendly file and (optionally) return the ranges necessary to invert the
 * transform, in file order. There is a 1:1 correspondence between the ranges in the input list
 * and the returned list, but the offsets and lengths will be different (the input list represents
 * compressed data, the output list represents uncompressed data). The ability to suppress
 * generation of the inverse range and to specify the size of the copy buffer are provided for
 * clients that desire a minimal memory footprint.
 *
 * @param <T> the type of the data associated with the ranges
 * @param rangesToUncompress the ranges to be uncompressed during transformation to a
 *     delta-friendly form
 * @param file the file to read from
 * @param deltaFriendlyOut a stream to write the delta-friendly file to
 * @param generateInverse if true, generate and return a list of inverse ranges in
 *     file order; otherwise, do all the normal work but return null instead of the inverse ranges
 * @param copyBufferSize the size of the buffer to use for copying bytes between streams
 * @return if generateInverse was true, returns the ranges in the delta-friendly file
 *     that correspond to the ranges in the original file, with identical metadata and in the same
 *     order; otherwise, return null
 * @throws IOException if anything goes wrong
 */
public static <T> List<TypedRange<T>> generateDeltaFriendlyFile(
    List<TypedRange<T>> rangesToUncompress,
    File file,
    OutputStream deltaFriendlyOut,
    boolean generateInverse,
    int copyBufferSize)
    throws IOException {
  List<TypedRange<T>> inverseRanges = null;
  if (generateInverse) {
    inverseRanges = new ArrayList<TypedRange<T>>(rangesToUncompress.size());
  }
  long lastReadOffset = 0;
  RandomAccessFileInputStream oldFileRafis = null;
  PartiallyUncompressingPipe filteredOut =
      new PartiallyUncompressingPipe(deltaFriendlyOut, copyBufferSize);
  try {
    oldFileRafis = new RandomAccessFileInputStream(file);
    for (TypedRange<T> rangeToUncompress : rangesToUncompress) {
      long gap = rangeToUncompress.getOffset() - lastReadOffset;
      if (gap > 0) {
        // Copy bytes up to the range start point
        oldFileRafis.setRange(lastReadOffset, gap);
        filteredOut.pipe(oldFileRafis, PartiallyUncompressingPipe.Mode.COPY);
      }

      // Now uncompress the range.
      oldFileRafis.setRange(rangeToUncompress.getOffset(), rangeToUncompress.getLength());
      long inverseRangeStart = filteredOut.getNumBytesWritten();
      // TODO(andrewhayden): Support nowrap=false here? Never encountered in practice.
      // This would involve catching the ZipException, checking if numBytesWritten is still zero,
      // resetting the stream and trying again.
      filteredOut.pipe(oldFileRafis, PartiallyUncompressingPipe.Mode.UNCOMPRESS_NOWRAP);
      lastReadOffset = rangeToUncompress.getOffset() + rangeToUncompress.getLength();

      if (generateInverse) {
        long inverseRangeEnd = filteredOut.getNumBytesWritten();
        long inverseRangeLength = inverseRangeEnd - inverseRangeStart;
        TypedRange<T> inverseRange =
            new TypedRange<T>(
                inverseRangeStart, inverseRangeLength, rangeToUncompress.getMetadata());
        inverseRanges.add(inverseRange);
      }
    }
    // Finish the final bytes of the file
    long bytesLeft = oldFileRafis.length() - lastReadOffset;
    if (bytesLeft > 0) {
      oldFileRafis.setRange(lastReadOffset, bytesLeft);
      filteredOut.pipe(oldFileRafis, PartiallyUncompressingPipe.Mode.COPY);
    }
  } finally {
    try {
      oldFileRafis.close();
    } catch (Exception ignored) {
      // Nothing
    }
    try {
      filteredOut.close();
    } catch (Exception ignored) {
      // Nothing
    }
  }
  return inverseRanges;
}

/**
 * Pipes the entire contents of the specified {@link InputStream} to the configured
 * {@link OutputStream}, optionally uncompressing on-the-fly.
 * @param in the stream to read from
 * @param mode the mode to use for reading and writing
 * @return the number of bytes written to the output stream
 * @throws IOException if anything goes wrong
 */
public long pipe(InputStream in, Mode mode) throws IOException {
  long bytesWrittenBefore = out.getNumBytesWritten();
  if (mode == Mode.COPY) {
    int numRead = 0;
    while ((numRead = in.read(copyBuffer)) >= 0) {
      out.write(copyBuffer, 0, numRead);
    }
  } else {
    uncompressor.setNowrap(mode == Mode.UNCOMPRESS_NOWRAP);
    uncompressor.uncompress(in, out);
  }
  out.flush();
  return out.getNumBytesWritten() - bytesWrittenBefore;
}

This function is relatively easy to follow.
"Delta-friendly" means that everything that can be uncompressed is uncompressed and written out to the delta-friendly file.
Concretely: as described in the earlier post on the zip format, each entry begins with a local header (LOCAL_ENTRY_SIGNATURE plus a set of descriptive fields) that is not compressed; these are the bytes before rangeToUncompress.getOffset() and are copied verbatim. The compressed data region is what needs uncompressing, so it is inflated and the uncompressed bytes are written out. The remaining data (the central directory and the end of central directory record) is simply copied as well.
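Schematically, for a single deflated entry (every range selected by the plan is treated the same way):

old.zip:             [local header][deflated data ][central directory + EOCD]
delta-friendly old:  [local header][inflated data.......][central directory + EOCD]
                                   |<--- inverse range --->|

The inverse ranges returned by generateDeltaFriendlyFile record where each inflated region landed in the delta-friendly file; for the new file these become the recompression plan that is written into the patch, so the applier can later rebuild new.zip exactly.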

The flow is the same for generating the delta-friendly versions of old.zip and new.zip.

With the delta-friendly files for old.zip and new.zip in hand, the BsDiff algorithm (covered later) produces the delta file, patch.
First, the structure of the patch file:

Offset        Bytes  Description                          Remarks
0             8      Versioned Identifier                 Header magic, the fixed UTF-8 string "GFbFv1_0"
8             4      Flags (currently unused, reserved)   Reserved flag bits
12            8      Delta-friendly old archive size      Size of the delta-friendly old file, 64-bit unsigned
20            4      Num old archive uncompression ops    Number of old-archive entries to uncompress, 32-bit unsigned
24            i      Old archive uncompression op 1…n     Offset and size of each old-archive entry to uncompress, n in total
24+i          4      Num new archive recompression ops    Number of new-archive entries to recompress, 32-bit unsigned
24+i+4        j      New archive recompression op 1…n     Offset and size of each new-archive entry to recompress, n in total
24+i+4+j      4      Num delta descriptor records         Number of delta descriptors, 32-bit unsigned
24+i+4+j+4    k      Delta descriptor record 1…n          Delta algorithm descriptor records, n in total
24+i+4+j+4+k  l      Delta 1…n                            The delta data itself
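Since every multi-byte field is written with a DataOutputStream (see writeV1Patch below), everything is big-endian, so a DataInputStream reads it back directly. A sketch of parsing the fixed part of the header (the class and variable names are illustrative):

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class PatchHeaderDump {
  public static void main(String[] args) throws IOException {
    try (DataInputStream in = new DataInputStream(new FileInputStream(args[0]))) {
      byte[] identifier = new byte[8];
      in.readFully(identifier); // "GFbFv1_0" in US-ASCII
      int flags = in.readInt(); // reserved, currently 0
      long deltaFriendlyOldFileSize = in.readLong();
      int numOldUncompressionOps = in.readInt();
      System.out.println("id=" + new String(identifier, "US-ASCII")
          + " flags=" + flags
          + " deltaFriendlyOldFileSize=" + deltaFriendlyOldFileSize);
      for (int i = 0; i < numOldUncompressionOps; i++) {
        System.out.println("uncompress offset=" + in.readLong()
            + " length=" + in.readLong());
      }
      // ... followed by the recompression ops, delta descriptors and the delta.
    }
  }
}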

The structure of an Old Archive Uncompression Op:

Bytes  Description                         Remarks
8      Offset of first byte to uncompress  Offset at which to start uncompressing, 64-bit unsigned
8      Number of bytes to uncompress       Number of bytes to uncompress, 64-bit unsigned

The structure of a New Archive Recompression Op:

Bytes  Description                       Remarks
8      Offset of first byte to compress  Offset at which to start recompressing, 64-bit unsigned
8      Number of bytes to compress       Number of bytes to recompress, 64-bit unsigned
4      Compression settings              Compression parameters: level, encoding strategy and nowrap

The structure of the Compression Settings:

Bytes  Description              Remarks
1      Compatibility window ID  Compatibility window; currently 0, the default window
1      Deflate level            Compression level, in [1, 9]
1      Deflate strategy         Encoding strategy, in [0, 2]
1      Wrap mode                0 = wrap, 1 = nowrap

It gets a bit dizzying: one structure nested inside the next...
The Compatibility Window: the default window has ID 0 and uses the following configuration:
- the deflate algorithm (zlib) for compression
- a 32768-byte buffer
- the verified compression levels 1-9
- the verified encoding strategies 0-2
- the verified wrap modes, wrap and nowrap
The default compatibility window is compatible with Android 4.0 and later.
Where does this window come from? The class DefaultDeflateCompatibilityWindow can report the incompatible parameter combinations via getIncompatibleValues(), as JreDeflateParameters (the carrier of level, strategy and nowrap). Internally it enumerates all combinations of those three parameters, compresses a fixed piece of content with each, hex-encodes the compressed output and compares it against built-in expected data; a match means that combination is compatible, a mismatch means it is not.
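In practice the check is a couple of lines before generating or applying a patch; a sketch using the two methods just described:

// Refuse to proceed on a JVM whose zlib output differs from the expected
// corpus baked into the default compatibility window.
DefaultDeflateCompatibilityWindow window = new DefaultDeflateCompatibilityWindow();
if (!window.isCompatible()) {
  System.err.println("zlib incompatible; offending parameters: "
      + window.getIncompatibleValues());
  System.exit(-1);
}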

A Delta Descriptor Record describes a delta algorithm. In the current V1 patch format the only algorithm is BsDiff, so there is exactly one such record. Its structure:

Bytes  Description                       Remarks
1      Delta format ID                   Enum id of the delta algorithm; bsdiff is 0
8      Old delta-friendly region start   Offset in the old delta-friendly file the delta applies to
8      Old delta-friendly region length  Length of that region in the old delta-friendly file
8      New delta-friendly region start   Offset in the new delta-friendly file the delta applies to
8      New delta-friendly region length  Length of that region in the new delta-friendly file
8      Delta length                      Length of the generated delta

Finally, here is how the output patch is written:

/**
 * Write a v1-style patch to the specified output stream.
 * @param out the stream to write the patch to
 * @throws IOException if anything goes wrong
 */
public void writeV1Patch(OutputStream out) throws IOException {
  // Use DataOutputStream for ease of writing. This is deliberately left open, as closing it would
  // close the output stream that was passed in and that is not part of the method's documented
  // behavior.
  @SuppressWarnings("resource")
  DataOutputStream dataOut = new DataOutputStream(out);

  dataOut.write(PatchConstants.IDENTIFIER.getBytes("US-ASCII"));
  dataOut.writeInt(0); // Flags (reserved)
  dataOut.writeLong(deltaFriendlyOldFileSize);

  // Write out all the delta-friendly old file uncompression instructions
  dataOut.writeInt(plan.getOldFileUncompressionPlan().size());
  for (TypedRange<Void> range : plan.getOldFileUncompressionPlan()) {
    dataOut.writeLong(range.getOffset());
    dataOut.writeLong(range.getLength());
  }

  // Write out all the delta-friendly new file recompression instructions
  dataOut.writeInt(plan.getDeltaFriendlyNewFileRecompressionPlan().size());
  for (TypedRange<JreDeflateParameters> range : plan.getDeltaFriendlyNewFileRecompressionPlan()) {
    dataOut.writeLong(range.getOffset());
    dataOut.writeLong(range.getLength());
    // Write the deflate information
    dataOut.write(PatchConstants.CompatibilityWindowId.DEFAULT_DEFLATE.patchValue);
    dataOut.write(range.getMetadata().level);
    dataOut.write(range.getMetadata().strategy);
    dataOut.write(range.getMetadata().nowrap ? 1 : 0);
  }

  // Now the delta section
  // First write the number of deltas present in the patch. In v1, there is always exactly one
  // delta, and it is for the entire input; in future versions there may be multiple deltas, of
  // arbitrary types.
  dataOut.writeInt(1);
  // In v1 the delta format is always bsdiff, so write it unconditionally.
  dataOut.write(PatchConstants.DeltaFormat.BSDIFF.patchValue);

  // Write the working ranges. In v1 these are always the entire contents of the delta-friendly
  // old file and the delta-friendly new file. These are for forward compatibility with future
  // versions that may allow deltas of arbitrary formats to be mapped to arbitrary ranges.
  dataOut.writeLong(0); // i.e., start of the working range in the delta-friendly old file
  dataOut.writeLong(deltaFriendlyOldFileSize); // i.e., length of the working range in old
  dataOut.writeLong(0); // i.e., start of the working range in the delta-friendly new file
  dataOut.writeLong(deltaFriendlyNewFileSize); // i.e., length of the working range in new

  // Finally, the length of the delta and the delta itself.
  dataOut.writeLong(deltaFile.length());
  try (FileInputStream deltaFileIn = new FileInputStream(deltaFile);
      BufferedInputStream deltaIn = new BufferedInputStream(deltaFileIn)) {
    byte[] buffer = new byte[32768];
    int numRead = 0;
    while ((numRead = deltaIn.read(buffer)) >= 0) {
      dataOut.write(buffer, 0, numRead);
    }
  }
  dataOut.flush();
}
  1. 8 bytes: the header magic "GFbFv1_0".
  2. 4-byte int: the flags field (reserved).
  3. 8-byte long: the delta-friendly old archive size.
  4. 4-byte int: the number of old-archive uncompression ops (32-bit unsigned), i.e. the i in step 5 below.
  5. i * (8+8) bytes: a loop writing old archive uncompression ops 1…i, the offset and size of each old-archive entry to uncompress.
  6. 4-byte int: the number of new-archive recompression ops, j (32-bit unsigned).
  7. j * (8+8+1+1+1+1) bytes: new archive recompression ops 1…j, each carrying the entry's offset and size plus the compatibility-window patchValue, level, strategy and nowrap (one byte each).
  8. 4 bytes: the number of delta descriptor records; always exactly 1 in v1.
  9. 1 byte: the delta format (BSDIFF).
  10. 8 bytes: start of the working range in the delta-friendly old file.
  11. 8 bytes: length of the working range in the old file.
  12. 8 bytes: start of the working range in the delta-friendly new file.
  13. 8 bytes: length of the working range in the new file.
  14. 8 bytes: the length of the delta.
  15. Finally, the delta data in deltaFile is copied into the patch.
