At ISC21 we saw a slight uptick in the number of new systems on the Top500 list. Although the relative increase in systems still leaves the market far short of pre-pandemic levels, there were certainly some interesting changes with this new edition of the list in terms of architecture. As is our custom, we are going to break down the 58 net-new systems listed on the June 2021 list since those show where the market is going.
Top500 New System CPU Architecture Trends
In this section, we look at CPU architecture trends by examining which new systems entered the Top500 and the CPUs that they use. Let us start by looking at the vendor breakdown.
In June 2021, Arm took the top spot, and the A64FX added another win, but it is not adding systems at the pace of AMD and Intel. For NEC, the NEC Vector Engine is tallied as a CPU and thus included here as well. To us, though, the big change is on the x86 side, where there have been some massive shifts. Here is a breakdown by architecture:
In our November 2020 analysis, we noted that there were a relatively high number of older CPUs. Now, Skylake is not common and Xeon E5 / Naples are not on the list. In early 2021, the AMD EPYC 7002 series “Rome” and Intel Xeon Scalable “Cascade Lake” series CPUs were the current generation; they were replaced midway through the June 2021 list’s cycle by Milan and Ice Lake.
In November 2020, when we excluded pre-2018 x86 “new” systems, Intel was in almost twice as many new systems as AMD. Now, even without removing those older Skylake CPUs, AMD is slightly ahead. Although the number of systems may look similar between Intel and AMD, it is fairly clear that AMD has sold more cores into the June 2021 Top500 list’s new systems. Here is the AMD v. Intel mix:
This is a massive shift in the market. June 2021 is the Top500 list where the market officially shifted away from Intel dominance. Fewer new Intel-based systems and more AMD cores are a large change from the previous list. Here is the November 2020 version of the above chart:
Intel focused heavily on AVX-512 and AI instruction support in its Ice Lake Xeon launch, where it discussed its HPC dominance. At the same time, it appears as though Intel is losing in the HPC market. The actual figures are much worse than portrayed above. Lenovo has been using a tactic where it runs Linpack on portions of Chinese service provider deployments that are not traditional HPC clusters. These 10GbE systems account for 9 of Intel’s 26 systems, or around 35% of Intel’s new systems on the list. They also use Cascade Lake processors and account for around 20% of Intel’s cores.
Although we have called out this Lenovo practice for some time as a way for Lenovo to claim the #1 spot, right now it is also helping Intel remain somewhat close to AMD on the new Top500 systems.
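For readers who want to check the arithmetic behind the Lenovo point, here is a minimal sketch that uses only the two counts quoted above (26 new Intel systems, 9 of them Lenovo 10GbE systems). The variable and function names are our own for illustration and do not come from the Top500 data files.

```python
# Counts quoted in the article for the June 2021 list's new systems.
INTEL_NEW_SYSTEMS = 26     # new systems using Intel CPUs
LENOVO_10GBE_SYSTEMS = 9   # of those, Lenovo 10GbE service provider systems


def percent(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Share of Intel's new systems that are the Lenovo 10GbE entries.
lenovo_share = percent(LENOVO_10GBE_SYSTEMS, INTEL_NEW_SYSTEMS)

# Intel's new-system count if those entries are set aside.
intel_without_lenovo = INTEL_NEW_SYSTEMS - LENOVO_10GBE_SYSTEMS

print(f"10GbE share of Intel's new systems: {lenovo_share:.1f}%")
print(f"Intel new systems without them: {intel_without_lenovo}")
```

This works out to roughly 34.6%, which is the "around 35%" figure, and leaves Intel with 17 new systems if the 10GbE entries are excluded.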
CPU Cores Per Socket
Here is an intriguing chart, looking at the new systems and the number of cores they have per socket.
The new Intel Xeon Scalable Ice Lake generation only scales to 40 cores. There is a single Xeon Platinum 9200 series system on the list that has 48 cores. 48% of the new systems have 48 or 64 cores per socket and Intel’s mainstream CPUs do not have core counts that can reach those levels. Intel has only the previous-generation niche Platinum 9200 series to service that market.
Here are the actual SKUs used:
The AMD EPYC 7V12 is Microsoft Azure’s custom SKU. Both AWS and Microsoft Azure had entries on this list. The Azure team is building a dedicated HPC infrastructure in the cloud.
We do see a single Intel Xeon Platinum 9242 system, but not the higher-end Intel Xeon Platinum 9282 56-core part. Interestingly, the AMD EPYC 7742 was the most common SKU, but the AMD EPYC 7H12 we reviewed took the number 3 spot after not appearing in the November 2020 systems.
Accelerators or Just NVIDIA?
As with many of the most recent lists, NVIDIA is the only accelerator vendor for the new systems. Here is a breakdown in our accelerator by vendor chart:
In June 2021, 22 of the 58 new systems used NVIDIA accelerators (down from 28 of the 58 new systems in June 2020). Here is a breakdown of the new accelerated systems by accelerator:
In the November 2020 Top500 list, the V100 was significantly more common than the NVIDIA A100. Now, that has flipped with only two new V100 systems being reported.
Acceleration is still an NVIDIA game. With Exascale systems coming soon, though, we know about AMD with Frontier and El Capitan 2 along with Intel Xe HPC GPUs for that era, so we may see a change over the next few lists as high-end systems get more diverse with accelerators. While the AMD EPYC seems to be making inroads, the AMD Instinct MI100 was notably absent.
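To put the accelerated-system counts from this section on the same scale, here is a small sketch that converts them to percentages. It assumes only the figures quoted above (22 of 58 new systems in June 2021, 28 of 58 in June 2020); the dictionary layout and names are illustrative.

```python
# New systems per list, as quoted in the article for both editions.
NEW_SYSTEMS = 58

# New systems using NVIDIA accelerators on each list.
accelerated = {"June 2020": 28, "June 2021": 22}

# Convert each count to a share of that list's new systems.
shares = {
    edition: 100.0 * count / NEW_SYSTEMS
    for edition, count in accelerated.items()
}

# Change in share, expressed in percentage points.
change_points = shares["June 2021"] - shares["June 2020"]

for edition, share in shares.items():
    print(f"{edition}: {share:.1f}% of new systems accelerated")
print(f"Change: {change_points:+.1f} percentage points")
```

On these numbers, the accelerated share of new systems went from about 48.3% to about 37.9%, a drop of roughly 10 percentage points.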
Fabric and Networking Trends
A really interesting change in this list is the reversal of the trend toward more Ethernet systems being added to the Top500.
In June 2020, we saw Ethernet at 53% of the new systems. Now, only 19% of new systems use Ethernet. In November 2021 we had
While Omni-Path had some uptake on the November 2020 list, we do not have any Intel OPA or Cornelis Networks Omni-Path systems here.
When we look at a breakdown by generation, here is what we get:
As a quick note, there were two NVIDIA DGX-based solutions that were marked as “Infiniband” but without a generation. We are using HDR here since the DGX systems usually have HDR cards. We just wanted to point that out here as it was part of the data cleanup effort but it may have been done incorrectly.
If we drill into which manufacturers are using 10GbE, 25GbE, and 40GbE to be consistent with our previous analysis, here is what we get:
As one can see, Lenovo maintains 10GbE supremacy. Lenovo’s 10GbE systems are Chinese service provider installations that are benchmarked with Linpack and added to this list. Many of these systems are installed at “Service Provider T” and “Service Provider K” in China and do not use accelerators. Somewhat interesting is that we do not see any 100GbE systems here.
When we look at the vendor picture, we can get a sense of what is happening in the market:
Lenovo is #1 again, but the majority of its new systems are the Chinese service provider systems. Without those, it would be behind Atos and HPE on this list, and again Intel would look very meek. Inspur is notably absent from this list. Also, we know Supermicro is selling clusters, as we saw in Tesla Supercomputer with NVIDIA A100 80GB and Supermicro shown, but those are not being submitted to this list.
Even though we know the list is not a perfect representation of what is going on, it is still fun to do a bit of analysis around the changes happening in the industry.
It’s possible the HPL benchmark is finally too boring for systems used in engineering and science such as Tesla’s supercomputer. Another aspect of this is the fact that HPL is trivial enough it can be scammed using cloud instances deployed with 10 GB Ethernet. This leads to political numbers like those from Lenovo that are primarily intended for advertising purposes.
Is it true there are real exascale systems now running in various countries not on the top500?
That Fujitsu contraption seems to have some huge advantages, namely simple ARM SVE SIMD programming (as opposed to shuffling things to/from less flexible GPUs), a simpler topology, and a fast interconnect.
And it STILL hits the top spot, which is really just a measure of raw power… that’s rather impressive.
Now picture something next gen. Fat 2048-bit SVE2 units, 6 HBM stacks, tiles… if I were AMD/Intel, I would be very, very worried about future ARM-based installations.
Interesting reading. But one remark: you describe it as Intel’s armageddon, but it’s more about historic data as I see it from your second graph. When we look at current archs, we can see that ICL is deployed 2x as much as Zen3. To me it looks like Intel’s PCIe 3.0-only support was a showstopper for HPC for some time, hence the migration to AMD, but now that even Intel supports PCIe 4.0, the trend seems to go in the reverse direction again. Also I would guess the AVX-512 “virus” may hold something here.
EJ, indeed, what Fujitsu did with Arm is truly remarkable. Although it’s just their update and reiteration of what was in the past done by them using SPARC64 fx cores.
Love these articles.
The 7V12 is a 64-core SKU 😉
@KarelG – whether it is v3 or v4 is secondary to the total amount of io lanes. That’s where amd wins, double rates are added bonus. Zen 3 is still ramping up, whereas intel happened to still score a few major contracts. Ultimately, performance stands for one of the letters in HPC, which I am willing to assume makes it kinda important, and intel’s cores are still weaker, fewer, more power hungry and more expensive, plus the inferior io. Intel is still years from being competitive, for now their lag behind is indefinite, and the only card they can play now is becoming the value option – not by lowering their prices as much as amd’s growing business giving it confidence to boost its own margins higher.
Intel went from laughing at amd outsourcing their primary product manufacturing to outsourcing theirs, even if only to steal manufacturing capacity from amd. Intel went from effectively paying off oem’s to not sell amd to effectively paying off foundries to not make amd… Because it literally doesn’t have anything with which it can compete constructively, all it can is be anti-competitive.