Add a section about Zstandard compression to the ZFS handbook

Reviewed by:	emaste, ygy, bcr, debdrup, pauamma@gundo.com
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D27715
Allan Jude 2020-12-22 03:23:36 +00:00
parent e9a6e6f104
commit b5882a38ef


@ -2926,6 +2926,72 @@ mypool/compressed_dataset logicalused 496G -</screen>
However since quotas do not consider compression, more data
may be written than would fit with uncompressed
backups.</para>
<sect3 xml:id="zfs-zfs-compression-zstd">
<title>Zstandard Compression</title>
<para>In <acronym>OpenZFS</acronym> 2.0, a new compression
algorithm was added. Zstandard (<acronym>Zstd</acronym>)
offers higher compression ratios than the default
<acronym>LZ4</acronym> while offering much greater speeds
than the alternative, <acronym>gzip</acronym>.
<acronym>OpenZFS</acronym> 2.0 is available starting with
&os;&nbsp;12.1-RELEASE via
<package>sysutils/openzfs</package> and has been the
default in &os;&nbsp;13-CURRENT since September 2020, and
will be in &os;&nbsp;13.0-RELEASE.</para>
<para><acronym>Zstd</acronym> provides a large selection of
compression levels, giving fine-grained control over
performance versus compression ratio. One of the main
advantages of <acronym>Zstd</acronym> is that the
decompression speed is independent of the compression
level. For data that is written once but read many times,
<acronym>Zstd</acronym> allows the use of the highest
compression levels without a read performance
penalty.</para>
<para>Even when data is updated frequently, there are often
performance gains that come from enabling compression. One
of the biggest advantages comes from the compressed ARC
feature. <acronym>ZFS</acronym>'s Adaptive Replacement
Cache (<acronym>ARC</acronym>) caches the compressed version
of the data in <acronym>RAM</acronym>, decompressing it each
time it is needed. This allows the same amount of
<acronym>RAM</acronym> to store more data and metadata,
increasing the cache hit ratio.</para>
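<para>As a rough way to observe this effect, the compressed
and uncompressed sizes of the <acronym>ARC</acronym> can be
compared. This is only a sketch: it assumes that the
<literal>kstat.zfs.misc.arcstats</literal> sysctl tree
exposes the <literal>compressed_size</literal> and
<literal>uncompressed_size</literal> counters, and the names
may differ between <acronym>OpenZFS</acronym>
versions.</para>
<screen>&prompt.user; <userinput>sysctl kstat.zfs.misc.arcstats.compressed_size kstat.zfs.misc.arcstats.uncompressed_size</userinput></screen>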
<para><acronym>ZFS</acronym> offers 19 levels of
<acronym>Zstd</acronym> compression, each offering
incrementally more space savings in exchange for slower
compression. The default level is
<literal>zstd-3</literal> and offers greater compression
than <acronym>LZ4</acronym> without being significantly
slower. Levels above 10 require significant amounts of
memory to compress each block, so they are discouraged on
systems with less than 16&nbsp;GB of <acronym>RAM</acronym>.
<acronym>ZFS</acronym> also implements a selection of the
<acronym>Zstd</acronym> <emphasis>fast</emphasis> levels,
which get correspondingly faster but offer lower
compression ratios. <acronym>ZFS</acronym> supports
<literal>zstd-fast-1</literal> through
<literal>zstd-fast-10</literal>,
<literal>zstd-fast-20</literal> through
<literal>zstd-fast-100</literal> in increments of 10, and
finally <literal>zstd-fast-500</literal> and
<literal>zstd-fast-1000</literal> which provide minimal
compression, but offer very high performance.</para>
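<para>As an illustration, the level is selected through the
<literal>compression</literal> property. The dataset name
below is only an example. A changed setting applies only to
data written after the change; existing blocks are not
recompressed.</para>
<screen>&prompt.root; <userinput>zfs set compression=zstd mypool/compressed_dataset</userinput>
&prompt.root; <userinput>zfs get compression,compressratio mypool/compressed_dataset</userinput></screen>
<para>Here <literal>zstd</literal> selects the default level.
A specific level such as <literal>zstd-9</literal> or
<literal>zstd-fast-10</literal> can be given in the same
way.</para>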
<para>If <acronym>ZFS</acronym> is not able to allocate the
required memory to compress a block with
<acronym>Zstd</acronym>, it will fall
back to storing the block uncompressed. This is unlikely
to happen outside of the highest levels of
<acronym>Zstd</acronym> on systems that are memory
constrained. The sysctl
<literal>kstat.zfs.misc.zstd.compress_alloc_fail</literal>
counts how many times this has occurred since the
<acronym>ZFS</acronym> module was loaded.</para>
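<para>For example, the counter can be read with
<command>sysctl</command>. A value that keeps growing may
indicate that the chosen level needs more memory than the
system can spare.</para>
<screen>&prompt.user; <userinput>sysctl kstat.zfs.misc.zstd.compress_alloc_fail</userinput></screen>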
</sect3>
</sect2>
<sect2 xml:id="zfs-zfs-deduplication">