Compressing and Decompressing Byte Arrays in Java with Zstandard

"Without accumulating small steps, one cannot reach a thousand miles."

Introduction

I recently worked on a project that is deployed overseas on the public internet, where traffic is expensive, so I needed Netty to compress the data in transit. I decided to try Facebook's Zstandard, which is said to compress well and lets you tune the compression level to your needs.

However, after searching online for a while, I found very few Java examples of compressing and decompressing a byte array with zstd, and the handful that exist use the wrong APIs, let alone actually run. Even the samples from GPT-3.5 and GPT-4 were wrong, so I decided to write this article as a record.

Adding the dependency

First, add the dependency to the pom file. I am using a fairly recent release (2023/7/17) that has plenty of users, so it should be safe to use.

<dependency>
  <groupId>com.github.luben</groupId>
  <artifactId>zstd-jni</artifactId>
  <version>1.5.5-5</version>
</dependency>

Writing the example code

import com.github.luben.zstd.Zstd;

public class ZstdTest {

    // Compression level: the default is 3 and the maximum is 22; higher
    // levels give a better compression ratio at the cost of speed.
    public static final int ZSTD_COMPRESSION_LEVEL = 13;

    public static void main(String[] args) {
        byte[] originData = "Zstandard's format is stable and documented in RFC8878. Multiple independent implementations are already available. This repository represents the reference implementation, provided as an open-source dual BSD OR GPLv2 licensed C library, and a command line utility producing and decoding .zst, .gz, .xz and .lz4 files. Should your project require another programming language, a list of known ports and bindings is provided on Zstandard homepage.".getBytes();
        System.out.println("Origin size : " + originData.length);

        // Compress with zstd.
        byte[] compressedData = Zstd.compress(originData, ZSTD_COMPRESSION_LEVEL);
        System.out.println("Compressed size : " + compressedData.length);

        // Decompress with zstd. Zstd.decompressedSize reads the original
        // size from the frame header, which Zstd.compress records.
        byte[] decompressedData = new byte[(int) Zstd.decompressedSize(compressedData)];
        Zstd.decompress(decompressedData, compressedData);
        System.out.println("Origin data : " + new String(decompressedData));

        System.out.println("Max compression level: " + Zstd.maxCompressionLevel());
        System.out.println("Min compression level: " + Zstd.minCompressionLevel());
    }
}
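One caveat about the example above: Zstd.decompressedSize only works because Zstd.compress writes the original size into the frame header; frames produced by streaming writers may not carry it. When the size is unknown up front, zstd-jni also ships stream wrappers. The following is a minimal sketch of that approach, assuming the ZstdOutputStream(OutputStream, int) and ZstdInputStream(InputStream) constructors; treat it as illustrative rather than authoritative.

import com.github.luben.zstd.ZstdInputStream;
import com.github.luben.zstd.ZstdOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

public class ZstdStreamTest {

    public static void main(String[] args) throws IOException {
        byte[] originData = "Some payload whose size we pretend not to know.".getBytes();

        // Compress: wrap any OutputStream; the level goes to the constructor.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ZstdOutputStream zos = new ZstdOutputStream(sink, 13)) {
            zos.write(originData);
        }
        byte[] compressedData = sink.toByteArray();

        // Decompress: wrap any InputStream and read until EOF.
        // No decompressed-size lookup is needed.
        ByteArrayOutputStream restored = new ByteArrayOutputStream();
        try (ZstdInputStream zis = new ZstdInputStream(new ByteArrayInputStream(compressedData))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = zis.read(buffer)) != -1) {
                restored.write(buffer, 0, n);
            }
        }
        System.out.println("Origin data : " + restored.toString());
    }
}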

Running it

Running the test code shows that the compressed output is less than 70% of the original size. Because the input here is small the effect is modest; with a larger input and a higher compression level the ratio improves considerably (see the sketch after the output below).

Origin size : 446
Compressed size : 306
Origin data : Zstandard's format is stable and documented in RFC8878. Multiple independent implementations are already available. This repository represents the reference implementation, provided as an open-source dual BSD OR GPLv2 licensed C library, and a command line utility producing and decoding .zst, .gz, .xz and .lz4 files. Should your project require another programming language, a list of known ports and bindings is provided on Zstandard homepage.
Max compression level: 22
Min compression level: -131072

Process finished with exit code 0
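To make that claim concrete, here is a small, illustrative sketch that compresses a larger, repetitive payload at several levels; the exact byte counts will vary with your data, but the ratio should improve as the level rises.

import com.github.luben.zstd.Zstd;

public class ZstdLevelTest {

    public static void main(String[] args) {
        // Build a larger, highly repetitive payload; redundancy is what zstd exploits.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1_000; i++) {
            sb.append("Zstandard's format is stable and documented in RFC8878. ");
        }
        byte[] data = sb.toString().getBytes();
        System.out.println("Origin size : " + data.length);

        // Compare a low level, a middle level, and the maximum level 22.
        for (int level : new int[] {3, 13, 22}) {
            byte[] compressed = Zstd.compress(data, level);
            System.out.println("Level " + level + " : " + compressed.length + " bytes");
        }
    }
}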

To close, here are the official benchmark numbers:

For reference, several fast compression algorithms were tested and compared on a desktop running Ubuntu 20.04 (Linux 5.11.0-41-generic), with a Core i7-9700K CPU @ 4.9GHz, using lzbench, an open-source in-memory benchmark by @inikep compiled with gcc 9.3.0, on the Silesia compression corpus.

Compressor name        Ratio   Compression   Decompress.
zstd 1.5.1 -1          2.887   530 MB/s      1700 MB/s
zlib 1.2.11 -1         2.743   95 MB/s       400 MB/s
brotli 1.0.9 -0        2.702   395 MB/s      450 MB/s
zstd 1.5.1 --fast=1    2.437   600 MB/s      2150 MB/s
zstd 1.5.1 --fast=3    2.239   670 MB/s      2250 MB/s
quicklz 1.5.0 -1       2.238   540 MB/s      760 MB/s
zstd 1.5.1 --fast=4    2.148   710 MB/s      2300 MB/s
lzo1x 2.10 -1          2.106   660 MB/s      845 MB/s
lz4 1.9.3              2.101   740 MB/s      4500 MB/s
lzf 3.6 -1             2.077   410 MB/s      830 MB/s
snappy 1.1.9           2.073   550 MB/s      1750 MB/s

The negative compression levels, specified with --fast=#, offer faster compression and decompression speed at the cost of compression ratio (compared to level 1).
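The negative levels are not CLI-only. The "Min compression level: -131072" printed earlier comes from Zstd.minCompressionLevel(), and as far as I can tell Zstd.compress passes negative levels straight through to the native library as well; a hedged sketch:

import com.github.luben.zstd.Zstd;

public class ZstdFastLevelTest {

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1_000; i++) {
            sb.append("Zstandard's format is stable and documented in RFC8878. ");
        }
        byte[] data = sb.toString().getBytes();

        // A negative level corresponds to the CLI's --fast=# mode:
        // faster compression in exchange for a lower ratio.
        byte[] fast = Zstd.compress(data, -3);
        byte[] normal = Zstd.compress(data, 3);

        System.out.println("Level -3 : " + fast.length + " bytes");
        System.out.println("Level  3 : " + normal.length + " bytes");
    }
}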

Zstd can also offer stronger compression ratios at the cost of compression speed. Speed vs Compression trade-off is configurable by small increments. Decompression speed is preserved and remains roughly the same at all settings, a property shared by most LZ compression algorithms, such as zlib or lzma.
