Storage.AI Project by SNIA Looks to Re-frame AI Storage Discussion

0
SNIA Storage.AI Graphic
SNIA Storage.AI Graphic

Today, the industry association, SNIA, is releasing a Storage.AI project that looks to reframe the AI storage discussion. Storage is becoming a bigger topic in AI clusters since keeping GPUs fed and working can have a multiple-billion-dollar ROI.

Storage.AI Project by SNIA Looks to Re-frame AI Storage Discussion

The current state is that most AI cluster shared storage is connected through the front-end north-south network. There are then faster networks with higher bandwidth to the nodes going to the xPUs for both scale-up and scale-out segments. In NVIDIA parlance, this would be scale-up NVLink, and scale-out would be Ethernet or InfiniBand.

SNIA Storage.AI Current State Storage
SNIA Storage.AI Current State Storage

Part of the idea behind this is to put storage onto the scale out network so that it can take advantage of higher performance pathways to the GPUs. Today, pre-processing often happens on the locally attached CPUs, but there will likely be a point where that pre-processing happens elsewhere.

SNIA Storage.AI Graphic
SNIA Storage.AI Graphic

Here is a quick look at what SNIA says Storage.AI enables. Since it is a list format, we will let folks read through it:

SNIA Storage.AI Enablement
SNIA Storage.AI Enablement

One notable aspect our Chief Analyst Patrick Kennedy pointed out was that when he heard of this is that there will be a tension with network topologies. If more switch ports on the scale-out network are required for high-performance storage, then that will increase the tiers of switching required for large clusters. At some point, that trade-off may make sense, but it is something the industry is very sensitive to.

Final Words

So far, AMD, Cisco, DDN, Dell, IBM, Intel, KIOXIA, Microchip, Micron, NetApp, Pure Storage, Samsung, Seagate, Solidigm, and WEKA have signed on. Notably absent is NVIDIA, and perhaps Cerebras. NVIDIA is the big AI company. Cerebras has an interesting AI storage and movement requirement given its WSE design. Aside from hardware companies, we hope in future versions to see big AI customers sign on to help guide the project.

LEAVE A REPLY

Please enter your comment!
Please enter your name here

This site uses Akismet to reduce spam. Learn how your comment data is processed.