Cavium ThunderX2 Context: The Most Important Arm Data Center Release
Every organization has its own evaluation criteria for adopting a new platform. We wanted to focus on four key lenses that potential buyers evaluating the technology may use: the ecosystem, socket performance, platform performance, and the competitive landscape. There are other lenses we see folks in the industry using, but these four are going to come up in every conversation.
Although there is a lot of focus on the raw performance and general impressiveness of the ThunderX2 platform, very few systems are installed in a greenfield environment today. Instead, systems are installed alongside existing applications and compete with other products in the market, so there is much more than raw performance to look into.
Cavium ThunderX2 Ecosystem
When we look at the Cavium ThunderX2 ecosystem, we see a much broader set of products and logos than we saw for the ThunderX launch. At launch, the Cavium ThunderX2 has a number of key customers, especially in the HPC space.
The launch customer list is made up of large HPC vendors and labs, along with companies from the US, Asia, and Europe. That should give you some idea of where the chip is seeing the most traction at this point. It is also intriguing because this is the same market that Intel targeted with Xeon Scalable and AVX-512.
Looking at the broader ecosystem, one can see that a large number of companies are now involved in the Arm ecosystem. This is well beyond what we saw in 2016 and shows how far the ecosystem has come in the past two years.
Cavium highlighted a few OEM platforms from HPE, Atos, and Cray. All three companies spoke at the launch event in San Francisco.
Gigabyte has been a major ODM partner for Cavium since the original ThunderX generation. We are in the process of reviewing a number of Gigabyte servers and the overall build quality has improved a great deal. Gigabyte now ODMs servers for other brands such as the HPE Cloudline 2200 and 2100 series.
Overall, the number of parties involved has increased with ThunderX2’s launch. The investment all of these companies are making in the alternative architecture has a tangible impact: using Arm servers has become more accessible to a broader swath of the server market.
Using the Cavium ThunderX2 Ecosystem
The launch of Cavium ThunderX2 coincides with a completely different ecosystem than we had in 2016 with ThunderX. With ThunderX, the world had a dual-socket Arm platform, but the software side needed a lot of work. By April 2016, with the Ubuntu 16.04 LTS release, the Arm ecosystem had improved over previous generations, but at that time we felt that we needed to publish a maturity model to explain our experience. Here is that maturity model:
During our early Arm server testing, even simple tools were not readily available. On the x86 side, one was accustomed to running “sudo apt install -y iperf3”, but in our ThunderX days a tool as basic as iperf3 had to be compiled from source. Now, iperf3 and more complex packages such as MariaDB/MySQL, nginx, and redis install directly from package managers.
At STH, we use Docker extensively and containers are a big deal in today’s infrastructure. aarch64 is now fully supported on Ubuntu 16.04 and 18.04, so it was easy to install and use Docker as we would on an x86 system, with a single exception.
In some cases our x86 containers used x86 base layers, so we had to rebuild our infrastructure from the original Dockerfiles. That process takes at most a few minutes.
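To find which containers need that rebuild, one option is to scan each Dockerfile for base images that are pinned to x86. The following is a minimal sketch, not the process we used. It assumes x86-only bases can be recognized by markers such as "amd64" or "x86_64" in the image reference or in a --platform flag. The function name x86_pinned_bases and the marker list are hypothetical.

```python
import re

# Assumption: x86-only base images are recognizable by these substrings.
# The list is illustrative and not exhaustive.
X86_MARKERS = ("amd64", "x86_64", "x86-64", "i386")

# Matches "FROM [--platform=<value>] <image>" at the start of a line.
FROM_RE = re.compile(r"^\s*FROM\s+(?:--platform=(\S+)\s+)?(\S+)", re.IGNORECASE)


def x86_pinned_bases(dockerfile_text):
    """Return the base image references that look x86-specific."""
    flagged = []
    for line in dockerfile_text.splitlines():
        match = FROM_RE.match(line)
        if not match:
            continue  # not a FROM instruction
        platform_flag = match.group(1) or ""
        image = match.group(2)
        haystack = (platform_flag + " " + image).lower()
        if any(marker in haystack for marker in X86_MARKERS):
            flagged.append(image)
    return flagged
```

A Dockerfile that starts from a multi-architecture image such as ubuntu:18.04 returns an empty list and can usually be rebuilt on aarch64 as-is, while one that starts from amd64/ubuntu is flagged and needs its base swapped first.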
Beyond Docker, items like KVM virtualization work, as do many other tools. Virtualization is a good feature, but there is at least one difference.
If you shut down a VM on an Intel Xeon E5-2600 V3 system and then migrate it to Intel Xeon Scalable or AMD EPYC, it will start up and work fine because all three are x86. Moving to the Cavium ThunderX2 (or other Arm or Power systems) means changing architectures, so you cannot simply boot that existing x86 VM there. Although the ecosystem has evolved on a monumental scale, this is still not the solution we are going to recommend to enterprises running VMware or Windows Server clusters and enjoying features like live migration.
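The migration rule above can be expressed as a simple pre-flight check: a powered-off VM can only be booted on a host with the same instruction set. This is a minimal sketch under stated assumptions. The alias table and the function names normalize_arch and can_cold_migrate are hypothetical, and a real check would also look at CPU feature flags and hypervisor support.

```python
# Assumption: illustrative aliases for common architecture names as
# reported by different tools; not an exhaustive table.
ARCH_ALIASES = {
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "aarch64": "aarch64",
    "arm64": "aarch64",
    "ppc64le": "ppc64le",
}


def normalize_arch(name):
    """Map an architecture name to a canonical form (unknown names pass through)."""
    key = name.strip().lower()
    return ARCH_ALIASES.get(key, key)


def can_cold_migrate(source_arch, target_arch):
    """Return True only when both hosts share an instruction set."""
    return normalize_arch(source_arch) == normalize_arch(target_arch)
```

Under this check, a Xeon E5-2600 V3 to AMD EPYC move passes because both report x86_64, while an x86_64 to aarch64 move fails. The host's own value can be read with platform.machine() from the Python standard library.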
On the hardware side, we installed additional NICs, a (LSI) Broadcom 9305-24i SAS3 controller and several NVMe SSDs, including Intel Optane, and they worked out of the box. This is not the same experience as we saw two years ago and is a vast improvement.
Taking a step back, this is an important point. ThunderX in 2016 was really a developer platform. If you wanted to compile software on Arm, you could do so with some work. ThunderX2 in 2018 is a completely different story. We now have an Arm ecosystem that can support open source and closed source projects at a higher level. For example, if you use nginx for your web server, it is trivial to add an aarch64 version and add ThunderX2 to your Kubernetes cluster or Swarm. What that means is that the Cavium ThunderX2 is now suitable for your broader DevOps and applications team usage, instead of being a developer novelty for many organizations.
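As an illustration of the mixed-architecture cluster case, the sketch below builds a minimal Kubernetes Pod manifest as a Python dictionary and pins it to nodes of one architecture using the kubernetes.io/arch node label. It rests on assumptions: the function name pod_spec_for_arch is hypothetical, and the image naming assumes per-architecture images published under the amd64/ and arm64v8/ Docker Hub namespaces. Older clusters may expose the label as beta.kubernetes.io/arch instead.

```python
# Assumption: per-architecture images live under these Docker Hub namespaces.
NAMESPACE_BY_ARCH = {"amd64": "amd64", "arm64": "arm64v8"}


def pod_spec_for_arch(name, image, arch):
    """Build a Pod manifest that only schedules onto nodes of the given arch."""
    if arch not in NAMESPACE_BY_ARCH:
        raise ValueError(f"unsupported architecture: {arch}")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": f"{name}-{arch}"},
        "spec": {
            # Well-known node label; values follow Go naming (amd64, arm64).
            "nodeSelector": {"kubernetes.io/arch": arch},
            "containers": [
                {"name": name, "image": f"{NAMESPACE_BY_ARCH[arch]}/{image}"}
            ],
        },
    }
```

Calling pod_spec_for_arch("web", "nginx:stable", "arm64") yields a manifest that would land only on aarch64 nodes such as a ThunderX2 system. With a multi-architecture image, the nodeSelector can be dropped and the same image reference used on both node types.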
At the same time, there is a gap between adopting an alternative x86 platform such as AMD EPYC and adopting an Arm architecture, both in virtualized environments and in environments with paid, supported, and pre-compiled applications. The next hurdle Arm architectures need to clear is getting enough market share that the big software vendors see the commercial interest in supporting non-x86. At the time of this writing, an ISV deciding where to spend investment dollars has a larger TAM supporting x86 than supporting Arm. It is an evolutionary process to get there and the journey is underway.