The new NVIDIA Grace Superchip is one of our most highly anticipated new products of 2023. At SC22, I told an unnamed NVIDIA executive that I was excited about it but wanted to see it, and this individual let me know they had chips working. That was November 2022, so now in 2023, we are starting to hear more about Grace, NVIDIA’s push into completing its CPU-GPU-Interconnect trifecta.
NVIDIA Grace Superchip Features 144 Cores, 960GB of RAM, and 128 PCIe Gen5 Lanes
The NVIDIA Grace line can put both the NVIDIA CPU and GPU on the same package, but NVIDIA is aiming beyond that. Armed with Arm Neoverse-V2 cores, the NVIDIA Grace Superchip has two 72-core CPUs, integrated memory, and an NVLink-C2C interconnect between them. Here is the diagram:
The cache hierarchy for the Arm Neoverse-V2 cores is:
- L1: 64 KB I-cache + 64 KB D-cache per core
- L2: 1 MB per core
- L3: 234 MB per superchip
Since NVIDIA lists “234MB per superchip” as its L3 spec, we assume that figure covers both Grace CPUs in the package, which puts L3 cache at 1.625 MB per core.
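The per-core figure is simple division. Here is a minimal sketch of that arithmetic, assuming the 234 MB is the total across both 72-core CPUs; the variable names are ours for illustration, not NVIDIA's.

```python
# Assumption: NVIDIA's 234 MB L3 figure covers both Grace CPUs in the Superchip.
L3_TOTAL_MB = 234
CORES_PER_CPU = 72
CPUS_PER_SUPERCHIP = 2

total_cores = CORES_PER_CPU * CPUS_PER_SUPERCHIP
l3_per_cpu_mb = L3_TOTAL_MB / CPUS_PER_SUPERCHIP
l3_per_core_mb = L3_TOTAL_MB / total_cores

print(f"{total_cores} cores, {l3_per_cpu_mb:.0f} MB L3 per CPU, "
      f"{l3_per_core_mb:.3f} MB L3 per core")
```

If the 234 MB turned out to be a per-CPU number instead, the per-core figure would double, so this assumption matters.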
Onboard, each Grace CPU gets 500GB/s of LPDDR5X memory bandwidth. Since the spec is now up to 960GB of LPDDR5X per Superchip, that is 480GB per Grace CPU. NVIDIA will have the ability to vary this capacity.
NVIDIA is also saying that it has eight PCIe Gen5 x16 roots, or 128 lanes, putting it around the same as a single AMD EPYC 9004 processor.
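For reference, the memory and PCIe numbers above work out as follows. This is a rough sketch with our own variable names. It assumes the eight x16 roots are counted per Superchip, as the headline's 128 lanes suggests, and that a single-socket EPYC 9004 exposes 128 PCIe Gen5 lanes.

```python
# Figures quoted in the article.
SUPERCHIP_MEMORY_GB = 960        # maximum LPDDR5X capacity per Superchip
BANDWIDTH_PER_CPU_GBPS = 500     # LPDDR5X bandwidth per Grace CPU
CPUS_PER_SUPERCHIP = 2
PCIE_ROOTS = 8                   # assumption: counted per Superchip
LANES_PER_ROOT = 16

# Assumption: lane count for a single-socket AMD EPYC 9004 platform.
EPYC_9004_1P_LANES = 128

memory_per_cpu_gb = SUPERCHIP_MEMORY_GB // CPUS_PER_SUPERCHIP
superchip_bandwidth_gbps = BANDWIDTH_PER_CPU_GBPS * CPUS_PER_SUPERCHIP
pcie_lanes = PCIE_ROOTS * LANES_PER_ROOT

print(f"{memory_per_cpu_gb} GB per CPU, {superchip_bandwidth_gbps} GB/s per Superchip")
print(f"{pcie_lanes} PCIe Gen5 lanes vs {EPYC_9004_1P_LANES} on a single EPYC 9004")
```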
The company has started to share benchmarks of the new processor beyond the 7.1 TFLOPS FP64 peak performance:
This was noted as being relative to a dual-socket AMD EPYC 7763 system, so NVIDIA is comparing its parts to previous-gen AMD. If you want to learn more about the new AMD EPYC Genoa, see that article or our video:
If you want to learn more about last week’s Intel Xeon Sapphire Rapids, here is the video for that.
NVIDIA’s Grace Superchip is a 500W part, but that includes the LPDDR5X memory. We have been seeing roughly 5W per DDR5 RDIMM. So for AMD EPYC 9004 Genoa with 12 channels of memory, we would add ~60W to the CPU’s TDP. That makes NVIDIA’s chip slightly higher in power consumption, but with what should be a bit over twice the memory bandwidth of a single Genoa CPU.
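To put rough numbers on that comparison, here is a back-of-the-envelope sketch. The Genoa figures are our assumptions rather than anything in NVIDIA's material: a 360W default TDP for a top-bin EPYC 9004, one RDIMM per channel, and DDR5-4800 with 8 bytes transferred per channel per transfer. The bandwidth values are theoretical peaks, not measured results.

```python
# Grace Superchip figures quoted in the article.
GRACE_SUPERCHIP_W = 500            # includes LPDDR5X memory
GRACE_SUPERCHIP_BW_GBPS = 2 * 500  # 500 GB/s per Grace CPU, two CPUs

# Assumptions for a single AMD EPYC 9004 "Genoa" socket (not from NVIDIA).
GENOA_TDP_W = 360                  # assumed top-bin default TDP
DDR5_CHANNELS = 12
WATTS_PER_RDIMM = 5                # rough figure from the article
DDR5_MTPS = 4800                   # assumed DDR5-4800
BYTES_PER_TRANSFER = 8             # 64-bit data path per channel

# CPU plus one RDIMM per channel.
genoa_platform_w = GENOA_TDP_W + DDR5_CHANNELS * WATTS_PER_RDIMM
# Theoretical peak bandwidth in GB/s (MT/s * bytes / 1000).
genoa_bw_gbps = DDR5_CHANNELS * DDR5_MTPS * BYTES_PER_TRANSFER / 1000
bw_ratio = GRACE_SUPERCHIP_BW_GBPS / genoa_bw_gbps

print(f"Genoa CPU + memory: ~{genoa_platform_w} W vs Grace Superchip: {GRACE_SUPERCHIP_W} W")
print(f"Genoa peak bandwidth: {genoa_bw_gbps:.1f} GB/s, Grace/Genoa ratio: {bw_ratio:.2f}x")
```

Swapping in a lower-TDP Genoa SKU or a different DIMM power estimate changes the power gap, but the bandwidth ratio stays a bit over 2x under these assumptions.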
That single-CPU comparison will become important. The AMD EPYC Genoa more than doubled memory bandwidth over the previous generation, and it has much higher compute performance thanks to more cores and a faster microarchitecture. Our sense is that in HPC workloads, Grace will compete with dual-socket Genoa. On the integer side, AMD should be ahead based on what we have seen with existing Arm architectures.
Chalk this one up to excitement for a new part. NVIDIA GTC 2023 is coming up in March, and this is perhaps the part I am most excited to see. NVIDIA has not set a Grace or Grace Hopper release date, but this feels like a big enough product bringing NVIDIA into a new market that it should launch on stage at GTC, especially if we are starting to see benchmark results already.
The server world is starting to get more exciting! If you want to learn more about some of the notable features of new servers, I recently did a video outside of Intel’s HQ with Supermicro on why this new generation of servers is such a big step.
In the meantime, expect another Ampere Arm CPU plus NVIDIA GPU piece soon as well as a BlueField demo coming on STH.