The “Gotchas” of QAT
Performance is very good with Intel QAT. That is frankly what we would expect from any acceleration technology, just as we would from AI accelerators in their own domain. Still, all of this comes with a few “gotchas”.
- QAT is not static. New ciphers are added over time, which means a cipher you want to use may require a specific QAT generation. WireGuard is a good example. It has become massively popular, and its bulk encryption transform is ChaCha20/Poly1305. That was not supported by QAT until what we call the QAT v3 generation, so hardware offload for the main cipher folks use did not arrive until recent hardware. If you have a remote endpoint on an Atom C3000, for example, it is a QAT platform that is supported for several more years, but it does not have the correct cipher support.
- QAT still requires enablement. Today, we think of things like AES-NI as being “free” because they “just work” with most software. QAT is a technology we first started seeing at STH in 2013, but it is still nowhere near that transparent today. Work needs to be done to enable it, and it is still very much an Intel technology.
- If you do very little encryption/decryption or compression/decompression, then you will not get a large benefit from this. For example, if you are running computational fluid dynamics farms or rendering farms, then this is going to have a relatively minimal impact on your workloads.
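To put a rough number on that last point, here is a minimal sketch using Amdahl's law. The function name, the workload fractions, and the 10x acceleration factor are all hypothetical values chosen for illustration. They are not measurements from our testing.

```python
def overall_speedup(offload_fraction: float, accel_factor: float) -> float:
    """Amdahl's law: whole-workload speedup when only part of it is accelerated.

    offload_fraction: share of total run time spent in crypto/compression (0..1).
    accel_factor: how much faster that share runs once offloaded (assumed value).
    """
    if not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("offload_fraction must be between 0 and 1")
    if accel_factor <= 0.0:
        raise ValueError("accel_factor must be positive")
    return 1.0 / ((1.0 - offload_fraction) + offload_fraction / accel_factor)


# Hypothetical workloads: share of time spent on crypto/compression.
workloads = {
    "CFD or rendering farm": 0.02,
    "IPsec gateway": 0.80,
}

for name, fraction in workloads.items():
    # Assume the accelerated portion runs 10x faster (illustrative only).
    print(f"{name}: {overall_speedup(fraction, 10.0):.2f}x overall")
```

With these assumed inputs, the compute-heavy farm gains about 2% while the gateway gains roughly 3.6x. The accelerator can only help with the portion of the work that is actually crypto or compression.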
These points are actually quite important for the overall discussion. The more Intel pushes QAT into its product line, the better product enablement we would expect to see. At the same time, that enablement has been slow thus far. The caveat is that adoption is actually fairly good in the markets that use a lot of crypto/compression acceleration. Moving beyond today’s adoption levels requires a step function in accessibility.
So after a lot of testing, I walked away with several key takeaways:
- Intel QAT hardware acceleration offers a huge boost to performance. Put another way, it can either increase the capacity of the system to do crypto/compression in some cases, or simply free cores to do other tasks in others.
- The Intel QAT Engine is a software platform that is severely under-hyped. We saw huge performance gains just using the QAT Engine's software acceleration path, without the hardware accelerator. This is something Ice Lake platforms can use without hardware QAT accelerators, so it is “free” performance that is often not discussed.
- This is a major thrust for Intel going forward, but it is going to take software adoption to utilize it. We recently covered “More Cores More Better AMD Arm and Intel Server CPUs in 2022-2023” and “Intel Accelerates Messaging on Acceleration Ahead of Sapphire Rapids Xeon”. QAT is an area that Intel is expanding, so it is worth looking at whether your software can utilize it.
As folks know, at STH, we have been around since QAT Gen1 was launched, and we were very early in showing the technology when we tried it in 2016. It has come a long way since then. Part of that is the adoption by companies building infrastructure and storage. There is still room for that to expand further.
I know that many people want many ciphers tested for IPsec and the TLS handshakes, along with different payloads and so forth. Even though these are fairly well-known use cases, putting this together took many days, and just running through the tests and doing final validation checks took about a day. This is a massive effort.
I just wanted to say thank you to the Intel team for their support when I said I wanted to do this one. For companies that want to build an IPsec gateway or an enterprise storage platform, a few days to get an accelerator to work or to use new libraries is not a huge effort. For STH, going across disciplines and showing all of this takes a lot.
As you may have seen from the photos in the test setup, we did not run through all of this with just the dual-socket Ice Lake-SP Xeon platform and dedicated cards. We ran through everything with embedded parts as well. The story arc I wanted to show folks is Intel QAT first as an add-in PCIe accelerator, then as integrated into embedded parts. The reason is that this is clearly where Intel is going with its roadmap. Stay tuned for the embedded piece in a few weeks as we ramp up our Ice Lake-D series on STH.