An absolutely killer feature of ZFS is the ability to add compression with little hassle. As we turn into 2018, there is an obvious new year’s resolution: use ZFS compression. Combined with sparse volumes (ZFS thin provisioning) this is a must-do option to get more performance and better disk space utilization. Many workloads work really well with ZFS compression. Overall, this is something we see far too many users overlook when it can be an enormous benefit.
How to find if you have ZFS compression enabled
For many ZFS environments, lz4 compression is the go-to solution. It is fast and gives a decent amount of benefit. Look at it as the minimal tradeoff to get substantial gains. While there are some ZFS environments that default to lz4 compression, most will not have compression enabled by default. We looked back on questions we received in 2017, and a common one was how to find if you have ZFS compression enabled.
There are many ways you can do this, but the easiest is with “zfs get compression.” Here is an example of using that command:
Along with “zfs get compression” another useful command is “zfs get compressratio” which shows the compression ratio. Note it is “compress” not “compression” ratio here. These are two attributes that you will see on zpools/ zvols. You can also see that we have the zvol’s (p3600R1/vm-203-disk-1) compression inherited from the zpool (p3600R1.) We suggest setting compression at the zpool ratio for general purpose storage and then alter explicitly from there.
ZFS has a lot of attribute information that you can use “zfs get all” to lookup. Here is an example:
If you have zfs compression showing as “on”, and want to see if you are using lz4 already, then you can do a zpool get all and look for/ grep feature@lz4_compress which should be active if you are using lz4 as the default:
Either way, our resolution is to turn on zfs compression if at all possible.
How To Set ZFS Compression
We are going to suggest simply setting ZFS compression at the zpool level. That allows subsequent datasets to inherit compression making it easy to maintain. To set the compression to lz4, we can use “zfs set compression=lz4”. Here is an example:
In the first zfs get compression command we see that compression is off by default. We use zfs set compression=lz4 on our zpool (bulksata2) to turn compression on. We then verify that the compression is now set to lz4.
You will notice that the compression ratio is 1.00x which is essentially nothing. That is simply because we have not copied any data to the new volume.
You can also use different algorithms such as gzip (e.g. gzip-7 that we are using in our next example.) If you have a pool that is for incremental backups on a dedicated backup server, going the gzip route can make a lot of sense to save space if you do not want to turn on deduplication.
ZFS Compression Impact
We took a 40GB Ubuntu 16.04.3 LTS VM volume (about 32GB of 40GB in-use) used for image caching and cloned a snapshot to “bulk storage”. For us, that means the source was on two Intel Optane 900p 280GB NVMe drives and the destination was two Samsung PM963 960GB SSDs. These SSDs were configured as ZFS mirrors. Given the size of the VM, we wanted to create a bottleneck at the destination while ensuring the source was many times faster than the destination.
First off, we wanted to see compression ratios. We know that compression=off gives us 1.00x compression since it is not compressed. Here is what we saw with the lz4 compressed pool:
We got a 1.93x compression ratio with lz4 compression. That is good for near transparent performance. We took the exact same volume and used gzip-7 just to show the compression ratio difference:
As you can see, we got a 2.27x compression ratio which is significantly better. We used gzip-7 since it is biased towards higher compression ratios versus lower levels offered by gzip that are faster.
There was a cost, however.
With lz4 compression, the entire snapshot clone operation (NVMe to SATA) took about 10% of the dual Intel Xeon E5-2698 V4 system’s CPU which was already running at 52% utilization. Using gzip-7 we saw utilization spike over 16% during the operation. As an additional data point, this copy with compression=off pushed utilization up 8-9%. Here are the incremental average CPU utilization of these three operations:
In terms of the actual clone performance, the timings were close but there was a noticeable difference between these three options:
Not only did lz4 use less CPU, but it did so over a shorter period of time.
We also were logging iowait while we were doing these operations. Since we were using a system with a ~52% base CPU load, which is more akin to a running virtualization server, we wanted to see what the impact was on iowait since that is usually a parameter we want to be minimized.
If you remember, we are doing this transfer from mirrored Intel Optane SSDs to mirrored Samsung SATA3 SSDs much as you would do if you were cloning development VMs to lower-cost storage. We were debugging the impact of that cloning operation at the system level which is why we were watching the iowait number. Through the ten runs we did, each time we had fairly consistent max iowait on a system which was near 0% on the system otherwise.
To some compression=off may seem like the obvious choice for the highest performance, it is not. While we would prefer to use gzip for better compression, lz4 provides “good enough” compression ratios at relatively lower performance impacts making it our current recommendation.
If you are starting the year and looking for a project, ensure that your ZFS storage is using lz4 compression. The fact that lz4 is a good compromise between compression ratio and performance is well-known at this point. Using lz4 can provide both a performance and space-saving benefit during some operations which makes it a great choice.
Even though lz4 ZFS compression is a well-known solution, over the 2017 holiday season we logged into no less than half a dozen servers from other folks who were not using it on their zpools. Set compression=lz4 at the zpool level and allow data sets to inherit the compression. You will be happy for this new year’s resolution that takes a few seconds and has tangible benefits.
Test Configuration Notes
Here is a quick overview of the test configuration for the above. We are becoming ZFS on Linux fans so we are using ZFS 0.7.2 on Debian.
- System: Supermicro 2U Ultra
- CPUs: 2x Intel Xeon E5-2698 V4
- RAM: 256GB (8x 32GB) DDR4-2400 RDIMMS
- OS SSDs: ZFS Mirror Intel DC S3610 480GB
- Source SSD: ZFS Mirror Intel Optane 900p 280GB
- Destination SSDs: ZFS Mirror Samsung PM963 960GB
- Proxmox VE 5.1
- Based on Debian Stretch 9.2
- Kernel 4.13.3
- QEMU 2.9.1
- ZFS 0.7.2