Zstd

Lossless compression algorithm

Zstandard is a lossless data compression algorithm developed by Yann Collet at Facebook. Zstd is the corresponding reference implementation in C, released as open-source software on 31 August 2016.[3][4]

Zstandard
Original author: Yann Collet
Developers: Yann Collet, Nick Terrell, Przemysław Skibiński[1]
Initial release: 23 January 2015
Stable release: 1.5.7[2] (20 February 2025)
Written in: C
Operating system: Cross-platform
Platform: Portable
Type: Data compression
License: BSD-3-Clause or GPL-2.0-or-later (dual-licensed)
Website: facebook.github.io/zstd/
Repository: github.com/facebook/zstd

The algorithm was published in 2018 as RFC 8478 (since obsoleted by RFC 8878), which also defines the associated media type "application/zstd", the filename extension "zst", and the HTTP content encoding "zstd".[5]

Features

Zstandard was designed to give a compression ratio comparable to that of the DEFLATE algorithm (developed in 1991 and used in the original ZIP and gzip programs), but faster, especially for decompression. It is tunable with compression levels ranging from negative 7 (fastest)[6] to 22 (slowest in compression speed, but best compression ratio).

Starting from version 1.3.2 (October 2017), zstd optionally implements very-long-range search and deduplication (--long, 128 MiB window) similar to rzip or lrzip.[7]

Compression speed can vary by a factor of 20 or more between the fastest and slowest levels, while decompression is uniformly fast, varying by less than 20% between the fastest and slowest levels.[8] The Zstandard command-line tool has an "adaptive" (--adapt) mode that varies the compression level depending on I/O conditions, mainly how fast it can write the output.

Zstandard at its maximum compression level gives a compression ratio close to lzma, lzham, and ppmx.[9][10] As of 2019, Zstandard reaches the Pareto frontier as it decompresses faster than any other open source algorithm with a similar or better compression ratio.[11]

Dictionaries can have a large impact on the compression ratio of small files, so Zstandard can use a user-provided compression dictionary. It also offers a training mode, able to generate a dictionary from a set of samples.[12][13] In particular, one dictionary can be loaded to process large sets of files with redundancy between files, but not necessarily within each file, such as for log files.
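The effect of a shared dictionary on small inputs can be illustrated with a minimal sketch. Note the substitution: this example does not use Zstandard at all. It uses the preset-dictionary feature (zdict) of DEFLATE from Python's standard zlib module, only because that is available without extra packages and shows the same principle. The sample log lines and the helper names compress and decompress are assumptions made for illustration.

```python
import zlib

# Hypothetical shared dictionary: text that is typical of the files to be
# compressed. A trained Zstandard dictionary plays a similar role, but this
# sketch uses DEFLATE's preset dictionary (zdict) instead.
SHARED_DICT = (
    b"2024-01-01T00:00:00Z INFO request served path=/index.html "
    b"status=200 bytes=\n"
)


def compress(data: bytes, zdict: bytes = b"") -> bytes:
    """Compress data with zlib, optionally primed with a preset dictionary."""
    if zdict:
        comp = zlib.compressobj(level=9, zdict=zdict)
    else:
        comp = zlib.compressobj(level=9)
    return comp.compress(data) + comp.flush()


def decompress(blob: bytes, zdict: bytes = b"") -> bytes:
    """Inverse of compress(); the same dictionary must be supplied."""
    if zdict:
        dec = zlib.decompressobj(zdict=zdict)
    else:
        dec = zlib.decompressobj()
    return dec.decompress(blob) + dec.flush()


# One short log line: little redundancy inside it, much in common with the
# dictionary.
line = b"2024-03-09T12:34:56Z INFO request served path=/about.html status=200 bytes=512\n"

plain = compress(line)
primed = compress(line, SHARED_DICT)
print(len(line), len(plain), len(primed))
```

The single line has almost no internal repetition, so compressing it alone gains little, while the dictionary-primed stream can refer back to the shared text. The same dictionary has to be available at decompression time.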

Design

Zstandard combines a dictionary-matching stage (LZ77) with a large search window and a fast entropy-coding stage. It uses Huffman coding alongside finite-state entropy (FSE), a variant of tANS.[14]
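The dictionary-matching idea behind LZ77 can be sketched in a few lines. The following is a toy greedy parser and decoder, not Zstandard's match finder or bitstream format: it uses a brute-force search, and it omits the entropy-coding stage (Huffman and FSE) entirely. The function names, the token layout, and the window and minimum-match parameters are assumptions chosen for illustration.

```python
def lz77_tokens(data: bytes, window: int = 1 << 15, min_match: int = 3):
    """Greedy LZ77 parse of data.

    Returns a list of tokens: ("lit", byte_value) for a literal byte, or
    ("match", offset, length) meaning "copy `length` bytes starting
    `offset` bytes back in the output produced so far".
    """
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_off = 0, 0
        # Brute-force search of the window for the longest match.
        for j in range(max(0, i - window), i):
            length = 0
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_off = length, i - j
        if best_len >= min_match:
            tokens.append(("match", best_off, best_len))
            i += best_len
        else:
            tokens.append(("lit", data[i]))
            i += 1
    return tokens


def lz77_decode(tokens) -> bytes:
    """Rebuild the original bytes from the token list."""
    out = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            out.append(tok[1])
        else:
            _, offset, length = tok
            # Copy byte by byte so that overlapping matches work.
            for _ in range(length):
                out.append(out[-offset])
    return bytes(out)


tokens = lz77_tokens(b"abcabcabcabc")
print(tokens)
print(lz77_decode(tokens))
```

In a real compressor the literals, offsets, and lengths produced by this stage are then passed to the entropy coder, which is where Huffman coding and FSE come in.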

Usage

Zstandard
Filename extension: .zst[15]
Internet media type: application/zstd[15]
Magic number: 28 B5 2F FD[15]
Type of format: Data compression
Standard: RFC 8878
Website: github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md
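The magic number listed above gives a cheap way to guess whether a byte string starts with a Zstandard frame. The sketch below only compares the first four bytes; it does not validate the rest of the frame, and the function name looks_like_zstd is an assumption for illustration.

```python
# Magic number of a Zstandard frame, as listed above (bytes in file order).
ZSTD_MAGIC = bytes.fromhex("28b52ffd")


def looks_like_zstd(data: bytes) -> bool:
    """Return True if data begins with the Zstandard frame magic number.

    This is a heuristic check only: it does not parse or validate the frame.
    """
    return data[:4] == ZSTD_MAGIC


print(looks_like_zstd(bytes.fromhex("28b52ffd") + b"\x00" * 8))
print(looks_like_zstd(b"\x1f\x8b\x08\x00"))  # gzip header, not Zstandard
```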
Figure: Zstandard used as the compression method for zram.

The Linux kernel has included Zstandard since November 2017 (version 4.14) as a compression method for the btrfs and squashfs filesystems,[16][17][18] as well as for loadable kernel modules.

In 2017, Allan Jude integrated Zstandard into the FreeBSD kernel,[19] and it was subsequently integrated as a compressor option for core dumps (both user programs and kernel panics). It was also used to create a proof-of-concept OpenZFS compression method[8] which was integrated in 2020.[20]

The AWS Redshift and RocksDB databases include support for field compression using Zstandard.[21]

In March 2018, Canonical tested[22] the use of zstd as a deb package compression method by default for the Ubuntu Linux distribution. Compared with xz compression of deb packages, zstd at level 19 decompresses significantly faster, but at the cost of 6% larger package files. Support was added to Debian (and subsequently, Ubuntu) in April 2018 (in version 1.6~rc1).[23][22][24]

Fedora added Zstandard support to RPM in May 2018 (Fedora release 28) and used it for packaging the release in October 2019 (Fedora 31).[25] In Fedora 33, the filesystem is compressed by default with zstd.[26][27]

Arch Linux added support for zstd as a package compression method in October 2019 with the release of the pacman 5.2 package manager,[28] and in January 2020 switched from xz to zstd for the packages in the official repository. Arch uses zstd -c -T0 --ultra -20 -. Compared to xz, the combined size of all compressed packages increased by 0.8%, decompression is 14 times faster, decompression memory increased by 50 MiB when using multiple threads, and compression memory increased but scales with the number of threads used.[29][30][31] Arch Linux later also switched to zstd as the default compression algorithm for the mkinitcpio initial ramdisk generator.[32]
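The command line quoted above can be read flag by flag: -c writes to standard output, -T0 lets zstd pick the number of worker threads from the detected cores, --ultra unlocks levels above 19, -20 selects level 20, and the final - reads from standard input. A minimal sketch of driving the same command from a script follows; the helper name compress_like_arch is an assumption, and the function simply returns None when no zstd executable is found on the PATH.

```python
import shutil
import subprocess

# The argument vector quoted in the text above.
ARCH_ZSTD_ARGS = ["zstd", "-c", "-T0", "--ultra", "-20", "-"]


def compress_like_arch(data: bytes):
    """Pipe data through the zstd CLI using the flags quoted above.

    Returns the compressed bytes, or None if the zstd executable is not
    installed on this machine.
    """
    if shutil.which("zstd") is None:
        return None
    result = subprocess.run(
        ARCH_ZSTD_ARGS,
        input=data,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,
    )
    return result.stdout


compressed = compress_like_arch(b"example payload " * 64)
print("zstd not installed" if compressed is None else len(compressed))
```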

On 15 June 2020, Zstandard was added to version 6.3.8 of the zip file format with codec number 93, deprecating the codec number 20 that had been assigned to it in version 6.3.7, released on 1 June.[33][34]
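A tool that inspects zip archives can recognise Zstandard members from the codec numbers given above. The sketch below uses Python's standard zipfile module, which exposes each member's compression method as the integer ZipInfo.compress_type. It only lists such members and does not decompress them; the constant and function names are assumptions for illustration.

```python
import io
import zipfile

ZIP_METHOD_ZSTD = 93             # current codec number, from the text above
ZIP_METHOD_ZSTD_DEPRECATED = 20  # deprecated codec number, from the text above


def is_zstd_member(info: zipfile.ZipInfo) -> bool:
    """Return True if the member is marked as Zstandard-compressed."""
    return info.compress_type in (ZIP_METHOD_ZSTD, ZIP_METHOD_ZSTD_DEPRECATED)


def zstd_member_names(archive) -> list:
    """List the names of Zstandard-compressed members in a zip archive.

    `archive` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        return [info.filename for info in zf.infolist() if is_zstd_member(info)]


# Demonstration on an in-memory archive whose only member is stored
# uncompressed, so no Zstandard members are reported.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("hello.txt", "hello")
buffer.seek(0)
print(zstd_member_names(buffer))
```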

On 31 October 2023, official Zstd support for compression and decompression was added to Windows Explorer in Windows 11 (via update package KB5031455).

In March 2024, Google Chrome version 123 (and Chromium-based browsers such as Brave or Microsoft Edge) added zstd support in the HTTP header Content-Encoding.[35] In May 2024, Firefox release 126.0 added zstd support in the HTTP header Content-Encoding.[36]
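On the server side, a response should only be sent with Content-Encoding: zstd when the client has advertised support for it, which browsers do through the Accept-Encoding request header. The following is a simplified sketch of that check. It handles the q parameter but deliberately ignores the "*" wildcard and other details of the HTTP specification; the function name accepts_zstd is an assumption for illustration.

```python
def accepts_zstd(accept_encoding: str) -> bool:
    """Return True if an Accept-Encoding header value lists zstd with q > 0.

    Simplified: the "*" wildcard is not handled, and a malformed q value is
    treated as a refusal.
    """
    for item in accept_encoding.split(","):
        parts = [part.strip() for part in item.split(";")]
        if parts[0].lower() != "zstd":
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        return quality > 0
    return False


print(accepts_zstd("gzip, deflate, br, zstd"))
print(accepts_zstd("gzip, zstd;q=0"))
```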

License

The reference implementation is licensed under the BSD license and published on GitHub.[37] From version 1.0, published 31 August 2016,[38] it carried an additional Grant of Patent Rights.[39]

From version 1.3.1, released 20 August 2017,[40] this patent grant was dropped and the license was changed to a BSD + GPLv2 dual license.[41]

See also

  • LZ4 (compression algorithm) – a fast member of the LZ77 family
  • LZFSE – a similar algorithm by Apple used since iOS 9 and OS X 10.11 and made open source on 1 June 2016
  • Zlib
  • Brotli – also integrated into browsers
  • Gzip – one of the most widely used compression tools

References
