The big component inside the chassis that does not sit on a tray is the midplane. Here is a look through the system from the front. The top portion is for the GPU tray, and the bottom is for the CPU tray.

Here is a quick look at the midplane as the NVIDIA HGX H200 8-GPU tray would see it sliding in.

If you are wondering, the top of the midplane has handles, so if you ever have to remove it, at least it rides on a track and the handles help.

Here is the CPU side of the midplane. You can see the NICs on the other side.

This midplane avoids a lot of cabling, but it is also key to making the system very easy to service.
Supermicro SYS-821GE-TNHR Topology
Supermicro makes custom motherboards for its AI servers, something that only some vendors do. For example, the Supermicro X13DEG-OAD in this system is really designed to be installed in this server, and not even in the Supermicro 4U Universal GPU System for Liquid Cooled NVIDIA HGX H100 and HGX H200 platform. As a result, the focus here is on providing MCIO PCIe connectivity to the PCIe switch architecture.

When we refer to modern NVIDIA HGX 8-GPU platforms, we often refer to NICs that belong to the CPU side or the GPU side. This block diagram explains that in great detail. We can see the PCH with the two M.2 SSDs and the ASPEED BMC on the CPU side, but all of the PCIe lanes from the CPUs go to PCIe switches.
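As a rough illustration of how that CPU-side versus GPU-side split shows up in software (our own sketch, not Supermicro or NVIDIA tooling), here is a minimal Python script that walks Linux sysfs and groups NVIDIA GPUs and Mellanox/NVIDIA NICs by their upstream PCIe bridge. The vendor IDs are standard, but the path-depth heuristic is an assumption and may need adjusting per platform; on a live system, nvidia-smi topo -m gives a similar view.

```python
#!/usr/bin/env python3
"""Sketch: group GPUs and NICs by their upstream PCIe bridge via Linux sysfs."""
import os
from collections import defaultdict

SYS_PCI = "/sys/bus/pci/devices"
# PCI vendor IDs: 0x10de = NVIDIA (GPUs), 0x15b3 = Mellanox/NVIDIA networking (NICs)
VENDORS_OF_INTEREST = {"0x10de": "GPU (NVIDIA)", "0x15b3": "NIC (Mellanox)"}

def read(path):
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return ""

groups = defaultdict(list)
for dev in os.listdir(SYS_PCI):
    dev_path = os.path.join(SYS_PCI, dev)
    if read(os.path.join(dev_path, "vendor")) not in VENDORS_OF_INTEREST:
        continue
    # The resolved sysfs path encodes the bridge hierarchy. For an endpoint
    # behind a PCIe switch, the parent is the switch downstream port and the
    # grandparent is the switch upstream port. This depth is a heuristic and
    # varies by platform (an assumption for illustration only).
    parts = os.path.realpath(dev_path).split("/")
    key = parts[-3] if len(parts) >= 3 else parts[-2]
    groups[key].append((dev, VENDORS_OF_INTEREST[read(os.path.join(dev_path, "vendor"))]))

for upstream, devices in sorted(groups.items()):
    print(f"Upstream bridge {upstream}:")
    for bdf, kind in devices:
        print(f"  {bdf}  {kind}")
```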

We showed all of these components in our hardware overview and the video, but it is fairly common to have one GPU paired with one NIC and one SSD in these systems, while also giving the CPUs their own NICs and SSDs. If you have been reading STH for years, you will probably notice that the GPU PCIe switches have two x16 connections to the CPU. This is a big change from older GPU platforms, where switches would have a single PCIe x16 link to the CPU, and it is something NVIDIA has been promoting heavily.
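To put those dual uplinks in perspective, here is a quick back-of-the-envelope calculation (our own illustration, not from the block diagram) of the CPU-facing bandwidth per PCIe switch with one versus two Gen5 x16 links. It only accounts for line encoding, not protocol overhead.

```python
# Back-of-the-envelope PCIe Gen5 uplink math for one GPU-board PCIe switch.
# Raw rate: 32 GT/s per lane with 128b/130b encoding ~= 3.94 GB/s per lane, per direction.
GTS_PER_LANE = 32
ENCODING = 128 / 130
GBS_PER_LANE = GTS_PER_LANE * ENCODING / 8  # GB/s per lane, per direction

def uplink_gbs(links, lanes_per_link=16):
    """Aggregate switch-to-CPU bandwidth for a given number of x16 uplinks."""
    return links * lanes_per_link * GBS_PER_LANE

print(f"Single x16 uplink: {uplink_gbs(1):.1f} GB/s per direction")
print(f"Dual x16 uplinks : {uplink_gbs(2):.1f} GB/s per direction")
```

The practical takeaway is that GPU-to-CPU traffic and NIC/SSD traffic behind the same switch no longer have to contend for a single x16 link.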
A Word on Management and Performance
Normally when we go through servers, we do big sections on management and performance. We have reviewed a ton of Supermicro servers and have shown their IPMI management perhaps over a hundred times at this point. Likewise, this is the third NVIDIA HGX H200 8-GPU platform we have looked at in a 30-day period to close out 2024. It started to get really redundant to show that NVIDIA has CFM specs for its HGX H200 assemblies and that manufacturers meet and exceed those specs to keep the GPUs cool and performing at their peak. We have also looked at the liquid-cooled version of this platform and covered how its performance was the same as the air-cooled platform's. It is just getting too redundant at this point.

Instead, let me offer why this system is different at 8U. If we look at the system from the rear, the top fans and the power supply/fan rows spanning the top 4U are dedicated to cooling the NVIDIA HGX H200 8-GPU platform. That is it. Many other systems also try to cool NICs or other components in the same airflow stream as the ~6kW (or more) NVIDIA HGX H200 8-GPU tray. The top half of the bottom 4U is fed by the front fans, which ingest air over part of the CPU heatsinks and also cool the PCIe switches and NICs.

Finally, the bottom PSU fans mostly cool the front SSDs, the M.2 SSDs, the RAM, and a bit of the CPU heatsinks' output.
By far, that is the most impactful part of this system. While it does not necessarily increase the performance of the CPUs or GPUs, it makes the cooling layout very clean. Usually, this leads to a low single-digit percentage improvement in power consumption over denser systems. Many data centers cannot handle 80-100kW+ racks, so five of these systems fit in around 60kW of capacity and there is little need for additional density. Instead, slightly more efficient cooling plus the space to make this perhaps the easiest-to-service HGX H200 platform are what you get for that 8U chassis.
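As a quick sanity check on that rack math (a sketch using the approximate figures above, not measured numbers), five of these 8U systems fill 40U and work out to roughly 12kW per system against a ~60kW rack budget, leaving headroom above the ~6kW GPU tray for the CPUs, NICs, and storage:

```python
# Rack budgeting sketch using the approximate figures discussed above (assumptions).
RACK_BUDGET_KW = 60      # usable rack power assumed in the text
SYSTEMS_PER_RACK = 5     # five 8U systems
RACK_UNITS_PER_SYSTEM = 8

per_system_kw = RACK_BUDGET_KW / SYSTEMS_PER_RACK
rack_units_used = SYSTEMS_PER_RACK * RACK_UNITS_PER_SYSTEM

print(f"Power budget per system: {per_system_kw:.0f} kW")
print(f"Rack units consumed    : {rack_units_used}U")
```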
Given that performance is often equal, power savings and serviceability are the two key areas where this platform differentiates itself.
Next, let us discuss power.