Building Our Office Storage for the NVIDIA GB10 Agent AI Cluster

Leveling Up the AI Storage Cluster Network

At this point, we realized that we wanted all-flash. The performance penalty of going out to hard drives was simply too great. Replicating models across 10+ local AI machines did not make much sense given the cost. Constantly shifting models to local SSDs would mean spending a lot more on storage in each machine, and managing models on a lower-cost 1TB local SSD standard was a pain since the models we were working with each took up >5% of the local SSD capacity. The real challenge was that we needed to go faster, a lot faster. At the same time, we did not want to pull out the NVIDIA SN5610, a 51.2Tbps switch that uses 800Gbps ports (getting down to 10Gbps/25Gbps is a nightmare) and draws 900W-2kW in typical operation. Those high-end switches have all of the features we could want for high-end AI clusters, but they are not what you want sitting next to your desk for NVIDIA GB10, NVIDIA RTX Pro 6000 Blackwell/6000 Ada, AMD Ryzen AI Max+ 395, or Apple Mac Studio AI clusters.
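To put rough numbers on that capacity squeeze, here is a quick back-of-the-envelope sketch. The model sizes are illustrative assumptions, not our exact library, but they show how fast a 1TB local SSD standard plus per-node replication falls apart:

```python
# Back-of-the-envelope math on per-node model storage.
# Model sizes are illustrative assumptions, not our exact library.
GB = 1000**3
TB = 1000**4

local_ssd_bytes = 1 * TB
model_sizes_gb = [70, 140, 235, 400]  # e.g. large quantized checkpoints
nodes = 10

for size_gb in model_sizes_gb:
    size_bytes = size_gb * GB
    pct_of_local = 100 * size_bytes / local_ssd_bytes
    replicated_tb = size_bytes * nodes / TB
    print(f"{size_gb:>3}GB model: {pct_of_local:4.1f}% of a 1TB SSD, "
          f"{replicated_tb:4.1f}TB across {nodes} replicated nodes")
```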

MikroTik CRS812 8DS 2DQ 2DDQ 400G QSFP56-DD Port

This all brought us to the MikroTik CRS812-8DS-2DQ-2DDQ-RM. It is a reasonably priced switch, not much more than the QNAP, but it has the advantage of using higher-end switch silicon in the 1.6Tbps class. That means we can break out the 400GbE ports (faster than the PCIe Gen4 x16 slots in our NAS) into 200Gbps ports for our NVIDIA GB10s.
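A quick sanity check on that claim: a PCIe Gen4 x16 slot tops out below a single 400GbE port. This sketch just runs the standard link-rate arithmetic; the 128b/130b factor is the normal PCIe Gen4 encoding overhead.

```python
# Standard link-rate arithmetic: PCIe Gen4 x16 vs. a 400GbE switch port.
pcie_gen4_gtps_per_lane = 16
pcie_lanes = 16
pcie_raw_gbps = pcie_gen4_gtps_per_lane * pcie_lanes  # 256 Gbps raw
pcie_eff_gbps = pcie_raw_gbps * 128 / 130             # ~252 Gbps after 128b/130b

port_gbps = 400
breakout_ports = 2  # one 400GbE port -> 2x 200GbE links for the GB10s

print(f"PCIe Gen4 x16: ~{pcie_eff_gbps:.0f}Gbps effective")
print(f"400GbE port:   {port_gbps}Gbps, or {breakout_ports}x "
      f"{port_gbps // breakout_ports}GbE after breakout")
```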

Dell Pro Max With GB10 QSFP Port

We could then upgrade to the NVIDIA ConnectX-7 Quad Port 50GbE SFP56 adapters in our high-end GPU workstations. Those are only PCIe Gen4 x8 cards, so we are limited to roughly 100Gbps total, but importantly, we can do that over two SFP56 ports. While it is possible to do 100Gbps using a QSFP28 port or four SFP28 NRZ channels, it is a more efficient use of ports to run fewer lanes of PAM4 signaling on the network side.
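Here is the port and lane math behind that, as a minimal sketch. The per-lane figures are the standard Ethernet signaling rates: 25Gbps NRZ for SFP28/QSFP28 lanes and 50Gbps PAM4 for SFP56 lanes.

```python
# Three ways to land ~100Gbps on a workstation; per-lane rates are the
# standard Ethernet signaling rates (25G NRZ, 50G PAM4).
options = [
    # (description,      switch ports, serdes lanes, Gbps per lane)
    ("4x SFP28 (NRZ)",   4,            4,            25),
    ("1x QSFP28 (NRZ)",  1,            4,            25),
    ("2x SFP56 (PAM4)",  2,            2,            50),
]

for desc, ports, lanes, per_lane in options:
    print(f"{desc:17} -> {lanes * per_lane}Gbps using "
          f"{ports} port(s) and {lanes} serdes lane(s)")
```

Two SFP56 ports deliver the same 100Gbps over half the serdes lanes, which is why PAM4 is the more port-efficient choice on the switch side.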

NVIDIA ConnectX-7 Quad Port 50GbE SFP56 Adapter

The advantage is that we would then have single-cable connections to the switch and could dedicate the NVIDIA GB10 10Gbase-T ports to application and management traffic. To be clear, we can already use the CRS812 with slower signaling, but this is an awesome capability: it gives us a low-cost option for today and an onramp to higher speeds for our edge AI agent cluster in the future.

Perhaps the most exciting part is that we could get high-speed, high-capacity storage for an edge AI cluster in a box that is relatively low-power and quiet. In the data center, companies like VAST Data and others are using high-capacity SSDs to build massive storage tiers for high-end AI clusters. The challenge has been scaling that down to a smaller cluster where the total cost may be closer to that of one or two data center GPUs.

QNAP TS-h1290FX SSD Tray

Having the Solidigm D5-P5336 SSDs and QLC NAND was the real game-changer here. We originally built the AI storage on a ZFS hard drive NAS, and while we could get 5-10Gbps of performance out of it and the form factor worked, the overall workflow lost a lot of time whenever we did something like running a prompt in one model, unloading that model from memory, and trying the prompt with a different model. Folks do this all the time, especially when results are not coming back optimally.
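That swap-and-retry loop is easy to reproduce. Below is a minimal sketch assuming a local Ollama instance on its default port; the model names are placeholders, and keep_alive=0 tells Ollama to unload the model after responding, so every switch hits storage again.

```python
import time
import requests  # assumes a local Ollama instance on its default port

OLLAMA_URL = "http://localhost:11434/api/generate"
MODELS = ["llama3.1:70b", "qwen2.5:72b"]  # placeholder model names
PROMPT = "Summarize the tradeoffs of QLC NAND for AI model storage."

for model in MODELS:
    start = time.time()
    resp = requests.post(OLLAMA_URL, json={
        "model": model,
        "prompt": PROMPT,
        "stream": False,
        "keep_alive": 0,  # unload after responding so the next load hits storage
    }, timeout=600)
    resp.raise_for_status()
    body = resp.json()
    # Ollama reports load_duration in nanoseconds
    print(f"{model}: load {body.get('load_duration', 0) / 1e9:.1f}s, "
          f"wall {time.time() - start:.1f}s")
```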

QNAP TS-h1290FX 30.72TB Solidigm Drives

Just to give some sense of the impact, we ran workflows on some of the systems we had on hand, and the performance delta was palpable. Models loaded anywhere from 30 seconds to 4-5 minutes faster than on our disk-based QNAP NAS, all due to storage throughput. A few minutes here and there adds up, especially when you are waiting for a model to load.
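Most of that delta is simply sequential read throughput, which you can sanity-check without any AI stack at all. This sketch streams a model file from a mount and reports effective throughput; the path is a placeholder, and for a fair number the file should be larger than RAM (or drop caches first).

```python
import time

# Stream a model file and report effective read throughput.
# MODEL_PATH is a placeholder; use a file larger than RAM (or drop
# caches first) so the OS page cache does not inflate the number.
MODEL_PATH = "/mnt/nas/models/example-model.gguf"
CHUNK = 16 * 1024 * 1024  # 16MiB reads

read_bytes = 0
start = time.time()
with open(MODEL_PATH, "rb") as f:
    while chunk := f.read(CHUNK):
        read_bytes += len(chunk)
elapsed = time.time() - start

gb = read_bytes / 1e9
print(f"Read {gb:.1f}GB in {elapsed:.1f}s -> {8 * gb / elapsed:.1f}Gbps")
```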

Even with the fast networking, however, we still needed to address another part of this edge AI cluster: the low-speed path.
