Inspur NF5488M5 Review: A Unique 8x NVIDIA Tesla V100 Server

Inspur NF5488M5 GPU Baseboard Assembly

The GPU baseboard assembly slides out from the main 4U server chassis.

Inspur NF5488M5 Front IO And Storage With GPU Tray Partially Out

It actually has its own cover and its own internal sliding rail system. There are even side latches to keep the entire assembly secure. In effect, this is like a smaller version of a server rail kit, just found inside this single-node 4U server.

Inspur NF5488M5 HGX 2 Tray Rails And Release

Taking the cover off, we can see the large, rigid airflow guide that runs through this section. Airflow is a key design consideration in this chassis, which is why this airflow guide is so heavy-duty.

Inspur NF5488M5 HGX 2 Tray Airflow Guide

With that cover removed, let us work our way through the GPU baseboard. PCIe passes from the CPUs to the motherboard, over PCIe cables to the Broadcom PEX9797 PCIe switches, then through high-density PCIe connectors to the GPU baseboard, where it is distributed to each GPU.
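If you want to verify that path from software, NVML can report the common ancestor between any two GPUs. Below is a minimal sketch of ours (not from the review) using the pynvml Python bindings; running nvidia-smi topo -m on the system prints a similar matrix.

import pynvml

# Map NVML topology levels to readable descriptions of where two GPUs meet.
LEVELS = {
    pynvml.NVML_TOPOLOGY_INTERNAL: "same board",
    pynvml.NVML_TOPOLOGY_SINGLE: "single PCIe switch",
    pynvml.NVML_TOPOLOGY_MULTIPLE: "multiple PCIe switches",
    pynvml.NVML_TOPOLOGY_HOSTBRIDGE: "same host bridge",
    pynvml.NVML_TOPOLOGY_NODE: "same NUMA node",
    pynvml.NVML_TOPOLOGY_SYSTEM: "across NUMA nodes",
}

pynvml.nvmlInit()
count = pynvml.nvmlDeviceGetCount()
handles = [pynvml.nvmlDeviceGetHandleByIndex(i) for i in range(count)]
for i in range(count):
    for j in range(i + 1, count):
        level = pynvml.nvmlDeviceGetTopologyCommonAncestor(handles[i], handles[j])
        print(f"GPU{i} <-> GPU{j}: {LEVELS.get(level, level)}")
pynvml.nvmlShutdown()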

Inspur NF5488M5 HGX 2 Board PCIe Side Heatsink

There are a total of eight NVIDIA Tesla V100 32GB SXM3 GPUs in our system, in a design that is ready for “Volta Next” GPUs. SXM3 GPUs like these are designed to run in this 54V DC system and have a 350-400W TDP. Our test system had caps set for 350W, and we saw idle power on each SXM3 GPU of around 50W as measured by nvidia-smi.

Inspur NF5488M5 Nvidia Smi
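For readers who want to script the same readings rather than parse nvidia-smi output, here is a minimal sketch (ours, not Inspur's or NVIDIA's tooling) that pulls the per-GPU power draw and the enforced cap through NVML with pynvml:

import pynvml

pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    name = pynvml.nvmlDeviceGetName(handle)
    if isinstance(name, bytes):  # older pynvml releases return bytes
        name = name.decode()
    draw_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0         # NVML reports milliwatts
    cap_w = pynvml.nvmlDeviceGetEnforcedPowerLimit(handle) / 1000.0  # the 350W cap on our system
    print(f"GPU{i} {name}: {draw_w:.0f}W of {cap_w:.0f}W cap")
pynvml.nvmlShutdown()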

That is higher than the 250W PCIe and 300W SXM2 versions of the Tesla V100. While all are called “Tesla V100” GPUs, there is a significant gap in capabilities between the variants.

Inspur NF5488M5 HGX 2 Board PCIe Side

Each GPU has its own heatsink covered in an NVIDIA shroud. The whole assembly looks very impressive.

NVIDIA HGX 2 GPU Tray Coolers On Tesla V100 SXM3 GPUs

The other key feature of the Inspur Systems NF5488M5 is the interconnect on this board. Years ago, NVIDIA innovated well beyond simply using PCIe for inter-GPU communication. With the Pascal (Tesla P100) generation, NVIDIA introduced NVLink in the SXM2 modules. We actually have a guide on How to Install NVIDIA Tesla SXM2 GPUs using Tesla P100’s. SXM2 systems generally rely on direct-attach GPU-to-GPU topologies, which limits their scale. The NF5488M5 is an SXM3 system with NVSwitch. At STH, we covered NVIDIA NVSwitch details during Hot Chips 30 when the company went into detail about how these switches work.

Inspur NF5488M5 HGX 2 Board NVSwitch Heatsink Right

There are a total of six NVSwitches on the GPU PCB. By connecting GPUs into this switched fabric, NVIDIA can provide the full 300GB/s of NVLink bandwidth from any GPU to any other GPU: each V100 has six NVLink2 links at 25GB/s per direction, for 300GB/s of bidirectional bandwidth in total. With eight GPUs making memory transactions over NVLink, that effectively turns this into a large GPU set with 256GB (8x 32GB) of HBM2.
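As a quick sanity check that all of a GPU's NVLink links are actually up, NVML exposes per-link state. Here is a minimal sketch along those lines (our own, with the usual pynvml assumption); on these V100 SXM3 GPUs each device should report six active links:

import pynvml

pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    active = 0
    for link in range(pynvml.NVML_NVLINK_MAX_LINKS):
        try:
            if pynvml.nvmlDeviceGetNvLinkState(handle, link) == pynvml.NVML_FEATURE_ENABLED:
                active += 1
        except pynvml.NVMLError:
            break  # this GPU has no more NVLink links
    print(f"GPU{i}: {active} NVLink links active")
pynvml.nvmlShutdown()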

Inspur NF5488M5 HGX 2 Board NVSwitch Heatsink

These NVSwitch modules require their own heat pipe coolers, which you can see in these photos. In the Inspur NF5488M5, they are not being used to their full 16-port capacity (each NVSwitch has 18 ports, with two reserved in the design).

Inspur NF5488M5 HGX 2 Board NVSwitch Bridge Connectors

One may notice the large high-density connectors on the right side of the photo above. These face out toward the front of the chassis and are not being used here. With some investigation, we found out why. Looking into the forest of GPUs, we found an NVIDIA logo screened onto the GPU baseboard PCB.

NVIDIA HGX 2 SXM3 Board PCB NVIDIA Logo

We also found this label. The GPU baseboard is actually an NVIDIA HGX-2 baseboard. While NVIDIA sells its DGX-2 16-GPU machine, partners such as Inspur have their own takes on the partner-oriented NVIDIA HGX-2; Inspur calls its 16-GPU offering the Inspur AGX-5. NVIDIA can bundle the HGX-2 baseboard along with the GPUs and NVSwitches for partners, who can then innovate around that platform. While most have used the HGX-2 to provide DGX-2 alternatives with sixteen GPUs, the NF5488M5 is something different on the market with a single HGX-2 baseboard.

NVIDIA HGX 2 SXM3 Board PCB NVIDIA HGX 2 PN

Those high-density connectors at the front of the board are designed for bridges that extend the NVSwitch fabric between two HGX-2 baseboards in the sixteen-GPU designs. Building a system with only a single HGX-2 baseboard is very innovative, as the full HGX-2 is too dense for many data center rack environments.

Next, we are going to look at some final chassis bits and show the system topologies which are important in a server like this.

4 COMMENTS

  1. That’s a nice (and I bet really expensive) server for AI workloads!
    The idle consumption, as shown in the nvidia-smi terminal, of the V100s is a bit higher than what I’d have expected. It seems weird that the cards stay at the p0 power state (max freq.). In my experience (which is not with v100s, to be fair), just about half a minute after setting the persistence mode to on, the power state reaches p8 and the consumption is way lower (~10W). It may very well be the default power management for these cards, IDK. I don’t think that’s a concern for any purchaser of that server, though, since I don’t think they will keep it idling for just a second…

  2. Thank you for the great review Patrick! Is there any chance that you’d at some point be able to test some non-standard AI accelerators such as Groq’s tensor streaming processor, Habana’s Gaudi etc. in the same fashion?

  3. What’re the advantages (if any) of this Inspur server vs Supermicro 9029GP-TNVRT (which is expandable to 16GPU and even then cost under 250K$ fully configured – and price is <150K$ with 8 V100 32GB SXM3 GPUs, RAM, NVMe etc)?
    While usually 4U is much better than 10U I don't think it's really important in this case.

  4. Igor – different companies supporting them, so you would next look to the software and services portfolio beyond the box itself. You are right that this would be 8 GPU in 4U while you are discussing 8 GPU in 10U for the Supermicro half-configured box. Inspur’s alternative to the Supermicro 9029GP-TNVRT is the 16x GPU AGX-5 which fits in 8U if you wanted 16x GPUs in a HGX-2 platform in a denser configuration.

    L.P. – hopefully, that will start late this year.
