The 2025 PCIe GPU in Server Guide


Standard Servers with PCIe GPUs

While 8-GPU platforms are designed primarily for GPU compute, the prospects for AI extend beyond those types of servers. If you believe that AI will end up in virtually every workflow, then the question becomes how to address that. Deploying servers without GPUs today means the only option is to move work off of a server and onto a dedicated AI server. An alternative model is to add GPUs to traditional servers so that they can accelerate the parts of a workload that benefit most, as in the sketch below.
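
To make that hybrid model concrete, here is a minimal sketch, assuming a PyTorch-based workload, of how an application can route just its accelerated phase to a local GPU when one is present and fall back to the CPU otherwise. The model and batch names are illustrative, not from any specific software mentioned here:

```python
# Minimal sketch: run the AI phase of a mixed workload on a local GPU if one
# is installed, otherwise stay on the CPU. Assumes PyTorch; the function and
# variable names are illustrative.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def run_inference(model: torch.nn.Module, batch: torch.Tensor) -> torch.Tensor:
    """Run only the inference step on the accelerator, returning CPU tensors."""
    model = model.to(device).eval()
    with torch.no_grad():
        # Only this phase touches the GPU; the rest of the workflow stays on CPU.
        return model(batch.to(device)).cpu()
```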

Supermicro SYS-212GB-NR Front

Like the 8-GPU systems, these servers commonly use GPUs such as the NVIDIA H100 NVL, H200 NVL, RTX PRO 6000 Blackwell, and L40S. The big difference is that a 2U chassis usually only has room for two of these GPUs side-by-side.

Supermicro SYS-212GB-NR Rear

As a result, 4-way NVLink is less common in traditional servers; finding one or two GPUs in each NVLink domain is more typical. Some organizations are also deploying lower-power GPUs like the NVIDIA L4 to add a smaller amount of GPU compute and GPU memory, but at lower power consumption and cost.
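
One quick way to see how the GPUs in a server are grouped is to check peer-to-peer access between each pair. Here is a minimal sketch, assuming PyTorch with CUDA is available; `nvidia-smi topo -m` remains the authoritative view of the NVLink/PCIe topology:

```python
# Minimal sketch: print which GPU pairs report peer-to-peer access, a rough
# proxy for whether they share an NVLink or PCIe P2P domain. Assumes PyTorch
# with CUDA.
import torch

n = torch.cuda.device_count()
for i in range(n):
    for j in range(i + 1, n):
        p2p = torch.cuda.can_device_access_peer(i, j)
        print(f"GPU{i} <-> GPU{j}: peer access {'yes' if p2p else 'no'}")
```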

Supermicro SYS-212GB-NR Server With Airflow Routing To NVIDIA H100 NVL

As an example, we showed the Supermicro SYS-212GB-NR. This is part of Supermicro’s Hyper line of high-end servers, where one can add many different types of GPUs. The idea is that if AI becomes part of your workflow because the software you run is increasingly implementing AI, then adding a GPU to the server can make sense to keep AI inference local.

Supermicro MGX Inspired 2U GPU Server Front

Supermicro also has 2U GPU servers that are inspired by the NVIDIA MGX architecture. We have looked at a number of these previously, but in the demo room we saw a new Xeon-based design built to house multiple GPUs.

Supermicro MGX Inspired 2U GPU Server CPU Memory Fans And GPUs

Next, let us get to the high-density servers.

High-Density Servers with PCIe GPUs

In the video, we also showed high-density servers that can utilize PCIe GPUs. An example we showed was the Supermicro SuperBlade with NVIDIA L4 GPUs. The L4 is versatile because it is a low-profile GPU with minimal cooling needs.
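
For parts like the L4, the power and memory trade-off is the whole point. Here is a minimal sketch, assuming the nvidia-ml-py (pynvml) bindings are installed, that reports each GPU's power draw and memory so you can compare low-power cards against larger PCIe GPUs:

```python
# Minimal sketch: report per-GPU power draw and total memory via NVML.
# Assumes the nvidia-ml-py (pynvml) package and an NVIDIA driver.
import pynvml

pynvml.nvmlInit()
try:
    for idx in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(idx)
        name = pynvml.nvmlDeviceGetName(handle)
        watts = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0  # NVML reports milliwatts
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        print(f"GPU{idx}: {name} {watts:.0f} W, {mem.total / 2**30:.0f} GiB total")
finally:
    pynvml.nvmlShutdown()
```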

Supermicro SuperBlade with NVIDIA L4

Over the years, we have seen the SuperBlade and other high-density Supermicro platforms take a range of GPUs, from single-width low-profile cards to double-width GPUs. The reasoning is usually the same as with the standard servers above, just in a higher-density design.

Next, let us get to the edge.

3 COMMENTS

  1. So many GPUs. I’m more questioning whether I need to buy GPU servers today for them to be running the next software releases in 8 quarters. If I don’t, will they be obsolete? I know that time’s comin’ but I don’t know when.

  2. First mentioned in Patrick’s article: “This is the NVIDIA MGX PCIe Switch Board with ConnectX-8 for 8x PCIe GPU Servers”.

    This article might have mentioned the GH200 NVL2 and even the GB200 (with MGX); for example, SuperMicro has 1U systems with one or two of these APUs: the ARS-111GL-NHR, ARS-111GL-NHR-LCC, or ARS-111GL-DNHR-LCC, etc. That gives you newer GPUs with more performance than the 6000 but far less cost than the 8x GPU systems.

  3. In addition to the “AS-531AW-TC and SYS-532AW-C” mentioned on the last page, SuperMicro has many Intel options (and far fewer AMD; for “workstations” only Threadripper and no new EPYC Turin), such as the new SYS-551A-T, whose chassis has room set aside to add a radiator (in addition to older chassis like the AS-3014TS-I).

    What’s really new is their SYS-751GE-TNRT, with dual Intel processors and up to four GPUs in a custom pre-built system. What makes it different from previous tower workstations is that the motherboard, the X13DEG-QT, splits the PCIe lanes in two, with half of them on one side of the CPUs and the rest on the other side (instead of having all the PCIe lanes together on one side only). I presume that’s to shorten the copper traces on the motherboard and make retimers unnecessary, even with seven PCIe slots.
