I was especially interested in how well LZMA compression would fit in
In both uses the files are compressed on one computer and decompressed many times by users around the world. In practice the most important factors are:
Less important:
Despite the many common factors, the contents of binary packages and source tarballs are quite different. Binary packages primarily contain executables and libraries, while source tarballs contain mostly ASCII text in some programming language. Naturally, both contain data files used by the program and (hopefully) some documentation.
Tests were run on a laptop:
bzip2 has two compression modes, one for normal use and another designed for small memory footprint (which can be invoked with 'bzip2 --small'). Only the normal mode was tested because it's faster.
Times are from the output of the command 'time' (line 'user') and rounded. Because of this, the compression and decompression time and speed tables should be taken as suggestive and not as the absolute truth. In practice, the bigger test files should be more reliable in terms of speed comparison.
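As a rough illustration of what the 'user' line of 'time' measures, the following Python sketch reads the user-mode CPU time accumulated by a child process. The helper name user_cpu_seconds and the busy-loop child command are made up for this example; the benchmark itself used the shell's 'time' command, not this script. On platforms that do not report child CPU times (for example Windows), the result is simply zero.

```python
import os
import subprocess
import sys


def user_cpu_seconds(cmd):
    """Run cmd to completion and return the user-mode CPU seconds it used.

    os.times().children_user only grows once a child has been waited for,
    which subprocess.run() does before returning.
    """
    before = os.times().children_user
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    after = os.times().children_user
    return after - before


# Hypothetical workload: a short busy loop in a separate Python process.
busy_loop = [sys.executable, "-c", "sum(i * i for i in range(10**6))"]
print(f"user: {user_cpu_seconds(busy_loop):.2f}s")
```

Because the operating system counts CPU time in coarse ticks, very short runs come out as 0.0s or 0.1s, which is the same limitation that makes the timings of the smaller test files unreliable.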
When reading the tables, it is important to keep in mind which settings are the default in each program:
Note: The first column with numbers 1..9 indicates the compression setting passed to gzip, bzip2 and lzmash (e.g. "gzip -9").
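The idea of sweeping the compression setting can be sketched with the gzip, bz2 and lzma modules of the Python standard library. This is only an approximation of the benchmark setup: the function name sweep and the sample data are invented for this example, and the presets of Python's lzma module are assumed here to be merely analogous to the lzmash levels, not identical to them (lzmash is a wrapper script around the original "lzma" tool). FORMAT_ALONE selects the legacy .lzma container.

```python
import bz2
import gzip
import lzma


def sweep(data, levels=range(1, 10)):
    """Return (level, gzip_size, bzip2_size, lzma_size) for each level."""
    rows = []
    for level in levels:
        rows.append((
            level,
            len(gzip.compress(data, compresslevel=level)),
            len(bz2.compress(data, compresslevel=level)),
            # Assumption: Python's LZMA presets stand in for lzmash levels.
            len(lzma.compress(data, format=lzma.FORMAT_ALONE, preset=level)),
        ))
    return rows


# Made-up sample input; only low levels are used to keep memory use small.
sample = b"int main(void) { return 0; }\n" * 2000
for level, gz_size, bz_size, lz_size in sweep(sample, levels=(1, 3, 6)):
    print(level, gz_size, bz_size, lz_size)
```

The high LZMA presets need a lot of memory for compression, so the example call deliberately stops at level 6.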
Uncompressed size: 212664320 bytes (203 MB)
Compressed file size in bytes

     gzip       bzip2      lzmash     lzmash -e
1    86322815   76147880   67456213   -
2    84858575   74320824   62085798   -
3    83561997   73467586   59547691   59278372
4    81312776   73044026   58245872   57964166
5    79798262   72762041   56694215   56411631
6    79179298   72540199   56182079   55859514
7    78995264   72512833   55535273   55269226
8    78816280   72314472   54678948   54405078
9    78768334   72223858   54068819   53769958

Compressed size / Uncompressed size * 100%

     gzip       bzip2      lzmash     lzmash -e
1    40.6%      35.8%      31.7%      -
2    39.9%      34.9%      29.2%      -
3    39.3%      34.5%      28.0%      27.9%
4    38.2%      34.3%      27.4%      27.3%
5    37.5%      34.2%      26.7%      26.5%
6    37.2%      34.1%      26.4%      26.3%
7    37.1%      34.1%      26.1%      26.0%
8    37.1%      34.0%      25.7%      25.6%
9    37.0%      34.0%      25.4%      25.3%

Compression time

     gzip       bzip2      lzmash     lzmash -e
1    11.5s      1m 26s     0m 58s     -
2    12.0s      1m 40s     2m 7s      -
3    13.7s      1m 54s     4m 58s     7m 37s
4    15.1s      2m 5s      5m 26s     8m 2s
5    18.4s      2m 11s     6m 47s     11m 18s
6    24.5s      2m 18s     7m 30s     12m 4s
7    29.4s      2m 25s     8m 24s     12m 59s
8    45.5s      2m 32s     10m 59s    20m 17s
9    66.9s      2m 37s     12m 20s    21m 56s

Decompression time

     gzip       bzip2      lzmash     lzmash -e
1    3.3s       16.5s      11.3s      -
2    3.3s       24.2s      10.5s      -
3    3.3s       29.2s      10.5s      10.4s
4    3.3s       32.1s      10.4s      10.3s
5    3.2s       34.2s      10.2s      10.2s
6    3.2s       35.4s      10.2s      10.1s
7    3.2s       36.5s      10.1s      10.0s
8    3.2s       37.5s      10.0s      9.9s
9    3.1s       38.2s      10.0s      9.9s

Compression speed, MB/s of uncompressed data (1 MB = 1024 * 1024 bytes)

     gzip       bzip2      lzmash     lzmash -e
1    18         2.4        3.5        -
2    17         2.0        1.6        -
3    15         1.8        0.68       0.44
4    13         1.6        0.62       0.42
5    11         1.5        0.50       0.30
6    8.3        1.5        0.45       0.28
7    6.9        1.4        0.40       0.26
8    4.5        1.3        0.31       0.17
9    3.0        1.3        0.27       0.15

Decompression speed, MB/s of uncompressed data (1 MB = 1024 * 1024 bytes)

     gzip       bzip2      lzmash     lzmash -e
1    61         12         18         -
2    61         8.4        19         -
3    61         6.9        19         20
4    61         6.3        20         20
5    63         5.9        20         20
6    63         5.7        20         20
7    63         5.6        20         20
8    63         5.4        20         20
9    65         5.3        20         20
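The ratio and speed columns follow directly from the sizes and times. As a small worked example, the Python snippet below recomputes the "gzip -1" row of the tables above from its raw figures; the function and constant names are chosen for this illustration only.

```python
MB = 1024 * 1024  # the tables define 1 MB as 1024 * 1024 bytes


def ratio_percent(compressed_bytes, uncompressed_bytes):
    """Compressed size as a percentage of the uncompressed size."""
    return compressed_bytes / uncompressed_bytes * 100


def speed_mb_per_s(uncompressed_bytes, seconds):
    """Throughput in MB of uncompressed data per second."""
    return uncompressed_bytes / MB / seconds


# Figures copied from the "gzip -1" row of the tables above.
UNCOMPRESSED = 212664320
GZIP_1_SIZE = 86322815
GZIP_1_COMPRESS_S = 11.5
GZIP_1_DECOMPRESS_S = 3.3

print(f"ratio:         {ratio_percent(GZIP_1_SIZE, UNCOMPRESSED):.1f}%")
print(f"compression:   {speed_mb_per_s(UNCOMPRESSED, GZIP_1_COMPRESS_S):.0f} MB/s")
print(f"decompression: {speed_mb_per_s(UNCOMPRESSED, GZIP_1_DECOMPRESS_S):.0f} MB/s")
```

With these inputs the snippet reproduces the table entries for "gzip -1": a ratio of 40.6%, about 18 MB/s for compression and about 61 MB/s for decompression.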
Uncompressed size: 208250880 bytes (199 MB)
Compressed file size in bytes

     gzip       bzip2      lzmash     lzmash -e
1    57860603   43873922   43933138   -
2    55274813   41108704   38871392   -
3    53416918   39791569   34863499   34823465
4    49695438   39040694   33545762   33513509
5    47775348   38395197   32481024   32445716
6    47004031   37975094   31686173   31661947
7    46797152   37676593   30881464   30841602
8    46578138   37365408   30295730   30261027
9    46578138   37075679   29809336   29780803

Compressed size / Uncompressed size * 100%

     gzip       bzip2      lzmash     lzmash -e
1    27.8%      21.1%      21.1%      -
2    26.5%      19.7%      18.7%      -
3    25.7%      19.1%      16.7%      16.7%
4    23.9%      18.7%      16.1%      16.1%
5    22.9%      18.4%      15.6%      15.6%
6    22.6%      18.2%      15.2%      15.2%
7    22.5%      18.1%      14.8%      14.8%
8    22.4%      17.9%      14.5%      14.5%
9    22.4%      17.8%      14.3%      14.3%

Compression time

     gzip       bzip2      lzmash     lzmash -e
1    8.3s       1m 9s      0m 45s     -
2    8.7s       1m 22s     1m 45s     -
3    9.8s       1m 34s     5m 10s     8m 43s
4    11.1s      1m 45s     5m 43s     9m 41s
5    13.8s      1m 57s     7m 39s     14m 38s
6    17.8s      2m 2s      8m 23s     15m 32s
7    20.7s      2m 11s     9m 11s     16m 23s
8    29.7s      2m 21s     11m 34s    24m 47s
9    40.9s      2m 26s     12m 31s    25m 53s

Decompression time

     gzip       bzip2      lzmash     lzmash -e
1    2.8s       12.8s      7.7s       -
2    2.7s       19.4s      6.9s       -
3    2.6s       23.8s      6.4s       6.6s
4    2.5s       26.4s      6.3s       6.3s
5    2.5s       28.3s      6.3s       6.3s
6    2.4s       29.6s      6.2s       6.3s
7    2.4s       30.6s      6.2s       6.2s
8    2.4s       31.3s      6.1s       6.1s
9    2.4s       32.1s      6.1s       6.1s

Compression speed, MB/s of uncompressed data (1 MB = 1024 * 1024 bytes)

     gzip       bzip2      lzmash     lzmash -e
1    24         2.9        4.4        -
2    23         2.4        1.9        -
3    20         2.1        0.64       0.38
4    18         1.9        0.58       0.34
5    14         1.7        0.43       0.23
6    11         1.6        0.39       0.21
7    9.6        1.5        0.36       0.20
8    6.7        1.4        0.29       0.13
9    4.9        1.4        0.26       0.13

Decompression speed, MB/s of uncompressed data (1 MB = 1024 * 1024 bytes)

     gzip       bzip2      lzmash     lzmash -e
1    71         16         26         -
2    74         10         29         -
3    76         8.3        31         30
4    79         7.5        32         32
5    79         7.0        32         32
6    83         6.7        32         32
7    83         6.5        32         32
8    83         6.3        33         33
9    83         6.2        33         33
In this test bzip2 is a tough adversary to lzmash in the fast modes. "lzmash -e" makes files a few kB smaller at the expense of a much longer compression time.
XMMS 1.2.10 binary package (xmms-1.2.10-i486-2.tgz) from Slackware 10.1. The file was first gunzipped, resulting in an uncompressed size of 5498880 bytes (5.2 MB).
Compressed file size in bytes

     gzip       bzip2      lzmash     lzmash -e
1    2160102    1803573    1431699    -
2    2112332    1611408    1140030    -
3    2072044    1539083    1034903    1038615
4    2031519    1487237    1004176    1007692
5    1992713    1464332    987189     988758
6    1979068    1433617    983305     983198
7    1973404    1431276    982125     983240
8    1972424    1414142    980836     983582
9    1970643    1385112    980836     983582

Compressed size / Uncompressed size * 100%

     gzip       bzip2      lzmash     lzmash -e
1    39.3%      32.8%      26.0%      -
2    38.4%      29.3%      20.7%      -
3    37.7%      28.0%      18.8%      18.9%
4    36.9%      27.0%      18.3%      18.3%
5    36.2%      26.6%      18.0%      18.0%
6    36.0%      26.1%      17.9%      17.9%
7    35.9%      26.0%      17.9%      17.9%
8    35.9%      25.7%      17.8%      17.9%
9    35.8%      25.2%      17.8%      17.9%

Compression time

     gzip       bzip2      lzmash     lzmash -e
1    0.3s       2.4s       1.4s       -
2    0.3s       2.9s       2.7s       -
3    0.4s       3.2s       6.2s       8.9s
4    0.4s       3.3s       6.6s       9.3s
5    0.5s       4.6s       8.2s       13.3s
6    0.7s       5.6s       8.5s       13.7s
7    0.8s       4.7s       8.6s       13.6s
8    1.1s       4.9s       10.5s      21.5s
9    1.8s       5.1s       10.5s      21.5s

Decompression time

     gzip       bzip2      lzmash     lzmash -e
1    0.1s       0.4s       0.3s       -
2    0.1s       0.6s       0.2s       -
3    0.1s       0.7s       0.2s       0.2s
4    0.1s       0.8s       0.2s       0.2s
5    0.1s       0.9s       0.2s       0.2s
6    0.1s       0.9s       0.2s       0.2s
7    0.1s       0.9s       0.2s       0.2s
8    0.1s       1.0s       0.2s       0.2s
9    0.1s       1.0s       0.2s       0.2s
For some reason, "bzip2 -6" took more time than even "bzip2 -9". The result didn't change when the test was repeated. The extreme mode of lzmash creates files that are a few bytes bigger; it seems that using "lzmash -e" makes compression both slower and less efficient with smaller files. Speed tables are omitted because the smaller test file makes measuring the elapsed time with the 'time' command too inaccurate.
Uncompressed size: 15964160 bytes (15.2 MB)
Compressed file size in bytes

     gzip       bzip2      lzmash     lzmash -e
1    4705710    3702465    3390291    -
2    4560441    3172615    2117511    -
3    4460478    2914692    1921894    1929077
4    4213705    2748562    1803104    1808532
5    4095300    2670185    1721301    1723689
6    4060060    2591439    1642013    1643645
7    4046707    2500735    1540827    1541735
8    4035433    2464688    1533283    1531514
9    4034855    2418265    1533283    1531514

Compressed size / Uncompressed size * 100%

     gzip       bzip2      lzmash     lzmash -e
1    29.5%      23.2%      21.2%      -
2    28.6%      19.9%      13.3%      -
3    27.9%      18.3%      12.0%      12.1%
4    26.4%      17.2%      11.3%      11.3%
5    25.7%      16.7%      10.8%      10.8%
6    25.4%      16.2%      10.3%      10.3%
7    25.3%      15.7%      9.7%       9.7%
8    25.3%      15.4%      9.6%       9.6%
9    25.3%      15.1%      9.6%       9.6%

Compression time

     gzip       bzip2      lzmash     lzmash -e
1    0.7s       6.1s       3.5s       -
2    0.7s       7.3s       6.0s       -
3    0.8s       8.5s       19.0s      30.8s
4    0.9s       9.9s       19.9s      31.2s
5    1.1s       11.2s      28.9s      1m 1s
6    1.4s       11.0s      30.1s      1m 2s
7    1.7s       12.5s      30.9s      1m 4s
8    2.5s       15.9s      41.7s      1m 56s
9    2.9s       17.5s      41.7s      1m 56s

Decompression time

     gzip       bzip2      lzmash     lzmash -e
1    0.2s       1.0s       0.6s       -
2    0.2s       1.5s       0.4s       -
3    0.2s       1.9s       0.4s       0.4s
4    0.2s       2.1s       0.4s       0.4s
5    0.2s       2.3s       0.4s       0.4s
6    0.2s       2.5s       0.4s       0.4s
7    0.2s       2.6s       0.4s       0.4s
8    0.2s       2.7s       0.4s       0.4s
9    0.2s       2.8s       0.4s       0.4s
For some reason, "bzip2 -6" compressed a little faster than "bzip2 -5", yet "bzip2 -6" still created a smaller file. Speed tables are omitted because the smaller test file makes measuring the elapsed time with the 'time' command too inaccurate.
The memory requirements depend only on the compression mode used (-1 .. -9). bzip2 also has a mode that uses less memory but is slower. This small memory mode hasn't been tested.
RAM usage on compression

     gzip       bzip2      lzmash     lzmash -e
1    <1 MB      2 MB       2 MB       -
2    <1 MB      2 MB       12 MB      -
3    <1 MB      3 MB       12 MB      12 MB
4    <1 MB      4 MB       16 MB      16 MB
5    <1 MB      5 MB       26 MB      26 MB
6    <1 MB      5 MB       45 MB      45 MB
7    <1 MB      6 MB       83 MB      83 MB
8    <1 MB      7 MB       159 MB     159 MB
9    <1 MB      7 MB       311 MB     311 MB

RAM usage on decompression

     gzip       bzip2      lzmash     lzmash -e
1    <1 MB      1 MB       1 MB       -
2    <1 MB      2 MB       2 MB       -
3    <1 MB      2 MB       1 MB       1 MB
4    <1 MB      2 MB       2 MB       2 MB
5    <1 MB      3 MB       3 MB       3 MB
6    <1 MB      3 MB       5 MB       5 MB
7    <1 MB      3 MB       9 MB       9 MB
8    <1 MB      4 MB       17 MB      17 MB
9    <1 MB      4 MB       33 MB      33 MB
When very fast compression is needed, gzip is the clear winner. It also has a very small memory footprint, making it ideal for systems with limited memory.
bzip2 creates about 15% smaller files than gzip. bzip2 compresses somewhat slower than gzip, but that does not seem to have prevented bzip2 from becoming popular. Nowadays most source code is available as both gzip and bzip2 compressed tar archives.
"lzmash -3" and "lzmash -4" seem to be almost as fast (or slow) as each other; the same can be said for "lzmash -5", "lzmash -6" and "lzmash -7". However, the memory requirements increase with every option, meaning that "lzmash -3", "lzmash -5" and "lzmash -6" are usually useful only if you (or the recipient) do not have enough memory for "lzmash -4" or "lzmash -7".
"lzmash -8" and "lzmash -9" require lots of memory and are practical only on newer computers; the files compressed with them are probably a pain to decompress on systems with less than 32 MB or 64 MB of memory.
The extreme mode ("lzmash -e") roughly doubles the compression time, but especially with small files it can lead to an even worse compression ratio than the normal mode. The extreme mode might be worth trying if you want to make files as small as possible, but in that case forgetting the lzmash wrapper script and playing with the command line options of "lzma" directly can lead to better results.
In terms of speed, gzip is the winner again. lzma comes right behind it, two to three times slower than gzip. bzip2 is a lot slower, usually taking two to six times more time than lzma, that is, four to twelve times more than gzip. One interesting thing is that gzip and lzma decompress faster the smaller the compressed size is, while bzip2 gets slower as the compression ratio gets better.
The memory usage of lzma stays competitive with bzip2 when files have been compressed with "lzmash -6" or a smaller option. Files compressed with the default "lzmash -7" can still be decompressed even on machines with only 16 MB of RAM, but sometimes you don't have even that much memory available. If you compress with "lzmash -8" or "lzmash -9", you should consider whether users need to be able to decompress your files on "ancient" computers too.
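One way to apply this is to pick the highest lzmash level whose decompression memory still fits the weakest machine you expect to serve. The Python sketch below does that lookup using the lzmash figures from the "RAM usage on decompression" table; the function name and the idea of a single memory budget are assumptions made for the example, and a real machine also needs memory for everything else it is running.

```python
# lzmash decompression memory in MB per level, copied from the
# "RAM usage on decompression" table.
LZMASH_DECOMPRESS_MB = {1: 1, 2: 2, 3: 1, 4: 2, 5: 3, 6: 5, 7: 9, 8: 17, 9: 33}


def highest_level_within(budget_mb, table=LZMASH_DECOMPRESS_MB):
    """Return the highest level that decompresses within budget_mb, or None."""
    fitting = [level for level, mb in table.items() if mb <= budget_mb]
    return max(fitting) if fitting else None


for budget in (4, 16, 64):
    print(f"{budget} MB budget -> lzmash -{highest_level_within(budget)}")
```

With a 16 MB budget the lookup selects level 7, which matches the remark above about the default "lzmash -7".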
Of course, it depends on the intended application. gzip is very fast and has a small memory footprint. According to this benchmark, neither bzip2 nor lzma can compete with gzip in terms of speed or memory usage. bzip2 has a notably better compression ratio than gzip, which has to be the reason for the popularity of bzip2; it is slower than gzip, especially in decompression, and uses more memory. However, the memory requirements of bzip2 should nowadays be no problem even on older hardware.
Both gzip and bzip2 are bundled with practically all GNU/*/Linux distributions and *BSDs. Because everybody has the tools to handle gzip and bzip2 compressed files, they are by far the most commonly used formats to distribute e.g. source code of free software. However, the situation might change because better free (as in freedom) alternatives have become available.
LZMA clearly has the potential to become the third commonly used general purpose compression format on *NIX systems. It mainly competes with bzip2 by offering a significantly better compression ratio while still keeping decompression speed relatively close to that of gzip. Its excellence has already been seen in the Tukaani Linux package management system, and in software installers such as Nullsoft Scriptable Install System (NSIS), Inno Setup and the installers of the MS-Windows versions of Mozilla products, including Firefox and Thunderbird.