Cavium ThunderX2 Gaining Steam with 4096 Core HPE Deployment

Cavium ThunderX2 In Microsoft OCP Project Olympus Server

Cavium ThunderX2 is an HPC-oriented 64-bit ARM chip. We know from what we saw at Supercomputing 2017 that HPE is one of Cavium’s major partners for this second-generation product. For example, we saw Cavium ThunderX2 being used in HPE Apollo systems and HPE The Machine.

HPE Apollo Cavium ThunderX2 Deployment with 4096 Cores

Recently, EPCC announced that it is teaming up with HPE, ARM, Cavium, and others as part of the Catalyst UK program to deploy an Apollo 70 system containing 4096 ThunderX2 cores. That is a massive number of cores. To estimate the number of systems this entails, we can assume 32 cores per CPU and 2 CPUs per system, or 64 cores per system. On that basis, HPE can deliver 4096 cores in just 64 systems. This is a significant density advantage over deploying Qualcomm Centriq 2400 or Ampere eMAG, as Cavium ThunderX2 supports up to dual-socket configurations for real workloads.
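The system-count estimate above can be written out as a quick calculation. This is a minimal sketch: the 32 cores per CPU and 2 CPUs per system are the same assumptions used in the estimate, not confirmed configuration details, and the function name `systems_needed` is ours. The single-socket case is a hypothetical comparison point, not a specific product.

```python
def systems_needed(total_cores: int, cores_per_cpu: int, cpus_per_system: int) -> int:
    """Return the number of systems required to reach total_cores."""
    cores_per_system = cores_per_cpu * cpus_per_system
    # Ceiling division, so a partially filled system still counts as one.
    return -(-total_cores // cores_per_system)


# Assumed configuration from the estimate: 32-core CPUs, dual socket.
dual_socket = systems_needed(4096, cores_per_cpu=32, cpus_per_system=2)

# Hypothetical single-socket system with the same 32-core CPU.
single_socket = systems_needed(4096, cores_per_cpu=32, cpus_per_system=1)

print(dual_socket)    # 64
print(single_socket)  # 128
```

With the same core count per CPU, a dual-socket design halves the number of chassis needed, which is the density point being made here.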

ARM is based in the UK, so seeing UK investment in ARM for HPC is not overly surprising. EPCC described its goal for the new HPE Apollo 70 system:

“If ARM is to become a serious contender in the HPC world, it’s crucial that there is a fully optimised and well-tested software stack to support users and their application codes. EPCC’s focus will be on porting many of the UK’s key computational science applications – many of the applications that run on the National HPC Service, ARCHER, today – to the Apollo 70 system to explore its performance and identify how best to compile and optimise codes for this new processor. There is of course huge expertise in writing highly-optimised software for the ARM core today, but most of this experience is in mobile applications rather than numerically intensive simulation codes.”

(Source: EPCC)

We think this is important for two reasons. First, it is another deployment of an ARM HPC cluster that is more akin to the standard operating models used today. Second, the goal of the research cluster is to port applications to ARM. Most of the code in this space is written for x86 architectures along with CUDA accelerators, so there is a lot of porting work to be done.

Cavium ThunderX2 Background

We learned quite a bit about the Cavium ThunderX2 and its OCP platform details in 2017, and even heard of partners beyond HPE shipping servers, e.g. Cavium ThunderX2 servers from Gigabyte and Ingrasys. Here is a diagram we saw last year showing an 8-channel memory configuration.

Cavium ThunderX2 OCP Motherboard DIMM Layout

At OCP Summit 2018, we learned more about ThunderX2 with the release of a Gigabyte workstation called the ThunderXStation.

Cavium ThunderXStation At OCP Summit 2018

All signs are pointing to more ThunderX2 availability and momentum in the market.
