Edge Servers with PCIe GPUs
Edge servers present a different opportunity. Applications such as computer vision are becoming more prevalent at the edge. A great example is in many retail locations, where self-checkout is powered by edge servers with GPUs. Other typical retail applications include inventory analytics, shopper analytics, and more.
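To make the workload concrete, here is a minimal sketch of the kind of computer vision inference a retail edge GPU might run. It is only an illustration: it assumes PyTorch and torchvision are installed, and the stock pretrained detector and randomly generated frame stand in for a real item-recognition model and camera feed.

```python
# Minimal sketch of a retail-style vision workload on an edge GPU.
# Assumes PyTorch + torchvision; the model and the frame are placeholders.
import torch
import torchvision

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A stock pretrained object detector stands in for a self-checkout model.
weights = torchvision.models.detection.FasterRCNN_ResNet50_FPN_Weights.DEFAULT
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights=weights)
model.eval().to(device)

# Placeholder for a decoded camera frame: 3 x H x W, float values in [0, 1].
frame = torch.rand(3, 720, 1280, device=device)

with torch.inference_mode():
    detections = model([frame])[0]

# Keep only confident detections, e.g. items passing over the checkout scanner.
keep = detections["scores"] > 0.8
print(detections["labels"][keep], detections["boxes"][keep])
```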

In the video, we showed an example of this class of system, the Supermicro SYS-E403-14B-FRN2T with two NVIDIA L4 GPUs.

These servers are often highly power- and space-constrained, so a single-width, low-profile GPU with a 75W or lower TDP is the go-to option.
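As a rough sanity check that a given box fits that kind of envelope, the sketch below polls each GPU's enforced power limit and current draw through NVIDIA's NVML Python bindings (pynvml). The 75W budget is simply the figure from the discussion above used as a comparison value, not anything enforced in software.

```python
# Hedged sketch: compare each GPU's power limit and draw against an edge budget.
# Requires the pynvml bindings (pip install nvidia-ml-py) and an NVIDIA driver.
import pynvml

EDGE_POWER_BUDGET_W = 75  # single-width, low-profile target discussed above

pynvml.nvmlInit()
try:
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        name = pynvml.nvmlDeviceGetName(handle)
        if isinstance(name, bytes):  # older bindings return bytes
            name = name.decode()
        limit_w = pynvml.nvmlDeviceGetEnforcedPowerLimit(handle) / 1000.0  # mW -> W
        draw_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0           # mW -> W
        status = "OK" if limit_w <= EDGE_POWER_BUDGET_W else "over budget"
        print(f"GPU {i}: {name} limit={limit_w:.0f}W draw={draw_w:.0f}W ({status})")
finally:
    pynvml.nvmlShutdown()
```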

Beyond the L4 GPUs, there are other edge use cases from network infrastructure to smart cities where larger GPUs can be utilized, often with higher-end networking.
Workstations with PCIe GPUs
Workstations have become a hot topic in the era of AI. Folks want to develop AI tools locally. Perhaps the bigger shift is that as AI becomes a larger part of everyday workflows, having bigger GPUs can yield outsized productivity gains.

When NVIDIA launched the RTX PRO 6000 Blackwell, there were three versions. One was a 600W version designed to provide maximum performance from a single PCIe card. The other two are double-width cards: a 300W actively cooled version, and a passively cooled version that we often see in 8-GPU systems.

Recently we reviewed the Supermicro AS-2115HV-TNRT, a 2U server that can handle up to four double-width GPUs. The innovation here is that most other workstations on the market, even if they can be converted to 4U or 5U rackmount form factors, only handle up to three GPUs. With this system, we can get up to four GPUs, along with IPMI for management, and then pack the machines into data center racks.
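Because these systems end up in data center racks rather than under desks, the out-of-band management piece matters. Below is a small, hedged sketch of pulling a BMC's sensor readings over IPMI from Python by shelling out to ipmitool; the BMC address and credentials are placeholders for whatever a given deployment uses.

```python
# Hedged sketch: poll a rackmount workstation's BMC sensors over IPMI.
# Assumes ipmitool is installed and the BMC is reachable; values are placeholders.
import subprocess

BMC_HOST = "10.0.0.42"   # placeholder BMC address
BMC_USER = "ADMIN"       # placeholder credentials
BMC_PASS = "change_me"

def ipmi_sensor_list(host: str, user: str, password: str) -> str:
    """Return the raw sensor data record listing (temps, fans, power) from the BMC."""
    result = subprocess.run(
        ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-P", password,
         "sdr", "list"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout

print(ipmi_sensor_list(BMC_HOST, BMC_USER, BMC_PASS))
```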

Supermicro also has other options, such as the AS-531AW-TC and SYS-532AW-C, which are designed to handle either a single 600W NVIDIA RTX PRO 6000 or multiple 300W Max-Q edition cards.
Final Words
Ultimately, if you believe in AI and are using new tools daily, then the idea that AI is going to be part of most workflows going forward will seem very familiar. We showed a number of GPUs, their use cases, and how to deploy them. AI is not just going to happen in enormous AI factories. Latency needs, workflow needs, data security requirements, and even deployment preferences are going to push GPUs into most servers going forward.

We spend a lot of time focusing on large AI cluster deployments, but the writing is on the wall. As we transition to an era of AI, GPUs are coming to many other locations and server form factors. It felt like it was time to show folks a sample of the different options in different categories. Over time, there will be new GPUs, new networking, and new architectures, but hopefully this helps frame some of the common use cases and deployments for STH readers today.
So many GPUs. I’m more questioning whether I need to buy GPU servers today for them to be running the next software releases in 8 quarters. If I don’t, will they be obsolete? I know that time’s comin’, but I don’t know when.
First mentioned in Patrick’s article: “This is the NVIDIA MGX PCIe Switch Board with ConnectX-8 for 8x PCIe GPU Servers”.
This article might have mentioned the GH200 NVL2 and even the GB200 (with MGX); for example, Supermicro has 1U systems with one or two of these APUs: the ARS-111GL-NHR, ARS-111GL-NHR-LCC, ARS-111GL-DNHR-LCC, etc. That gives you newer GPUs with more performance than the 6000, but at far less cost than the 8x GPU systems.
In addition to the “AS-531AW-TC and SYS-532AW-C” mentioned on the last page, Supermicro has many Intel options (and far fewer AMD ones; for “workstations” only Threadripper and no new EPYC Turin), such as the new SYS-551A-T, whose chassis has room set aside to add a radiator (in addition to older chassis like the AS-3014TS-I).
What’s really new is their SYS-751GE-TNRT, with dual Intel processors and up to four GPUs, in a custom pre-built system. What makes it different from previous tower workstations is that the X13DEG-QT motherboard splits the PCIe lanes in two, with half of the lanes on one side of the CPUs and the rest on the other side (instead of having all the PCIe lanes together on one side only). I presume that is to shorten the copper traces on the motherboard and make retimers unnecessary, even with seven PCIe slots.