Touring the Center of the Internet and an AI Data Center at Equinix Silicon Valley

Equinix SV11 in Action: Housing the NVIDIA DGX B200 SuperPOD

Diving deeper into SV11, this data center is primarily designed to house the newest AI systems and hyperscale clusters. That means accommodating the massive power and cooling needs of racks of GPUs. NVIDIA has a cluster here that we are calling “the” NVIDIA DGX B200 SuperPOD because it is the cluster that runs many of the global GTC demos you see when something is running remotely while Jensen is on stage presenting. Although you rarely get to see the physical system, we have certainly seen its outputs on stage.

SV11's DGX B200 SuperPOD
Equinix SV11 NVIDIA DGX B200 SuperPOD with Charlie Boyle and Patrick Kennedy

Composed of eight NVIDIA GB200 NVL72 racks and associated hardware, the SuperPOD is the first level of scale-out clustering in NVIDIA’s ecosystem.
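
To put a rough number on that first level of scale-out, here is a back-of-envelope GPU count. The eight-rack figure is from this article; 72 GPUs per NVL72 rack and one NVLink domain per rack are standard-configuration assumptions used for illustration.

```python
# Back-of-envelope scale of an eight-rack NVL72 deployment.
# Assumptions: 72 GPUs per rack (the standard NVL72 configuration) and
# one NVLink scale-up domain per rack; the rack count is from the article.
racks = 8
gpus_per_rack = 72

total_gpus = racks * gpus_per_rack
print(f"Scale-up (NVLink) domain: {gpus_per_rack} GPUs per rack")
print(f"Scale-out cluster:        {total_gpus} GPUs across {racks} racks")
```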

DGX B200 SuperPOD Networking
NVIDIA DGX B200 SuperPOD Networking

NVIDIA has enormous GPU clusters that it uses to develop and support its products. This cluster is a bit different because it is specifically located at the Equinix Silicon Valley campus. With the extensive connectivity to many carriers and the presence of many large organizations on campus, this cluster is also used for proof-of-concept work with partners and customers. Since so many organizations have resources on the campus, the fiber lines we showed you earlier help provide connectivity to this SuperPOD.

NVIDIA DGX SuperPOD NetApp 2 Equinix SV11

Equinix is also NVIDIA’s global deployment partner for SuperPODs, so this Silicon Valley installation allows NVIDIA to validate the SuperPOD before replicating it at other customer sites and Equinix data centers elsewhere.

DGX B200 SuperPOD Cooling
NVIDIA DGX B200 SuperPOD Cooling

As the NVIDIA DGX B200 SuperPOD requires liquid cooling, this was also a critical factor in NVIDIA's hosting needs. As part of the SuperPOD design, NVIDIA opted for a high-density layout to enable short copper runs between servers, thereby avoiding optical runs and the power penalty associated with additional optical transceivers. As a result, liquid cooling is a necessity when a handful of racks can draw up to 1MW of power.
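
To see why avoiding optics matters at this scale, here is a hedged back-of-envelope sketch. The 18 NVLink ports per GPU, two transceivers per link, and roughly 15W per high-speed optical module are illustrative assumptions rather than NVIDIA figures, but they show how quickly transceiver power stacks up against a roughly 1MW cluster budget.

```python
# Rough copper-vs-optics power tradeoff for the NVLink fabric.
# All per-link figures below are illustrative assumptions, not NVIDIA specs.
gpus           = 8 * 72  # eight NVL72 racks, 72 GPUs each (from the article)
links_per_gpu  = 18      # assumed NVLink ports per GPU
xcvrs_per_link = 2       # one optical transceiver at each end of a link
watts_per_xcvr = 15      # assumed power per high-speed optical module

optics_penalty_w = gpus * links_per_gpu * xcvrs_per_link * watts_per_xcvr
print(f"Extra power if the NVLink fabric were optical: ~{optics_penalty_w / 1000:.0f} kW")
# Hundreds of kilowatts against a ~1MW cluster budget is why short copper
# runs, and the rack density they demand, win here.
```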

NVIDIA DGX SuperPOD NVLink Spine Rear Equinix SV11

Liquid cooling is a more efficient cooling option overall. With the far higher heat capacity of water (and other liquids) versus air, NVIDIA (and Equinix) can spend less power on cooling.
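
A quick textbook comparison shows why. Assuming a 10°C coolant temperature rise and standard values for the density and specific heat of air and water (not figures from Equinix or NVIDIA), carrying away 1MW of heat takes a few thousand times less volume of water than of air.

```python
# Why liquids carry heat more efficiently than air: Q = flow * (rho * cp) * dT.
# Standard textbook properties; the 1MW load and 10C rise are assumptions.
heat_w  = 1_000_000           # 1 MW of heat to remove
delta_t = 10                  # assumed coolant temperature rise in kelvin

cap_air   = 1.2 * 1005        # J/(m^3*K): density * specific heat of air
cap_water = 1000 * 4186       # J/(m^3*K): density * specific heat of water

flow_air   = heat_w / (cap_air * delta_t)     # m^3/s of air needed
flow_water = heat_w / (cap_water * delta_t)   # m^3/s of water needed
print(f"Air:   {flow_air:.1f} m^3/s")
print(f"Water: {flow_water:.3f} m^3/s (~{flow_air / flow_water:,.0f}x less volume to move)")
```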

More DGX B200 SuperPOD Liquid Cooling
Equinix SV11 NVIDIA DGX B200 SuperPOD Power and Networking Overhead

In the video, we had Charlie Boyle, VP of DGX Systems at NVIDIA. Charlie is one of my favorite folks to talk to since he explains not just what is in a system, but the why behind it.

NVIDIA DGX SuperPOD Aisle Equinix SV11

For example, in the photo above, Charlie explains why the liquid-cooling system for the cluster uses hoses rather than welded pipes.

NVIDIA DGX SuperPOD Liquid Cooling Hot And Cold Equinix SV11

He told me that getting the correct bend in a welded pipe done once by a skilled tradesman is not an issue, but getting thousands of them perfect for large-scale AI clusters would be unlikely.

DGX B200 SuperPOD Optical Networking
Equinix SV11 NVIDIA DGX B200 SuperPOD Optical Networking

Also, because this is STH, it was interesting to see the networking and other components. Folks often talk about the GPUs, but it is really the interconnects that turn single GPU packages into the large AI clusters driving today’s AI boom.
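
To give a feel for why the interconnects matter so much, here is a hedged sketch of the per-GPU bandwidth hierarchy in a cluster of this generation. The numbers are commonly cited ballpark figures used as assumptions, not measurements from this installation.

```python
# Approximate per-GPU bandwidth at each tier of the cluster (assumed ballparks).
tiers = {
    "On-package HBM memory":             8000,  # GB/s, local to one GPU
    "NVLink within the rack (copper)":   1800,  # GB/s, the scale-up domain
    "Network between racks (optical)":     50,  # GB/s, ~400Gb/s per NIC port
}

for tier, gb_per_s in tiers.items():
    print(f"{tier:34s} ~{gb_per_s:>5} GB/s")
# Each step down is a large drop, so the interconnect topology, not just the
# GPU count, determines how well a cluster scales.
```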

Since I know everyone likes to see fiber, here is the rear of the SuperPOD.

NVIDIA DGX SuperPOD Fiber Cables Equinix SV11

NVIDIA has multiple storage vendors colocated with this SuperPOD, as it is used for POC work.

NVIDIA DGX SuperPOD NetApp 1 Equinix SV11

We spotted both NetApp and DDN storage gear here.

NVIDIA DGX SuperPOD DDN Equinix SV11

Folks often overlook that, even though the main focus is on the GPU compute blades and interconnects, AI clusters also have traditional compute nodes next to them, often with lots of networking.

NVIDIA DGX SuperPOD General Purpose Compute Front Equinix SV11

Something that is obvious when walking through the SuperPOD versus the floor of SV1 is how this entire cluster was co-designed to work together with very specific requirements and at much higher density. It is a sharp contrast to what is running in SV1.
