The Magic – This is Actually a Multi-DPU ZFS and iSCSI Demo
Now for the more exciting part. This may be the mental model you have of the setup, but that would be incorrect.
You will notice that the drives show up in the system as /dev/sda, /dev/sdb, and /dev/sdc. Normally, an NVMe SSD like the ones we are using would appear as “nvme0n1” or similar, not as “sda”. That is because something even more relevant to the DPU model is going on here: there are actually two DPUs being used.
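As a quick sanity check, you can see where those “sda” devices are actually coming from. This is a minimal sketch assuming standard Linux tooling on the initiator-side DPU; device names and output are illustrative:

```
# Show the transport behind each block device.
# iSCSI-backed disks report "iscsi"; a locally attached NVMe namespace would report "nvme".
lsblk -o NAME,SIZE,TRAN

# The by-path symlinks also encode the portal IP and target IQN for iSCSI LUNs.
ls -l /dev/disk/by-path/ | grep iscsi
```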
The first DPU is exporting the three 960GB NVMe SSDs as iSCSI targets from the AIC JBOX. The second DPU, the one we have been showing, sits in the host AIC server and serves as the iSCSI initiator while also handling ZFS duties. This is what the setup we are using actually looks like:
What we essentially have is two nodes doing iSCSI and ZFS with no x86 server involvement beyond the second DPU drawing power and cooling from its PCIe slot. Performance is pretty darn poor in this example because everything runs over the out-of-band 1GbE management ports and none of the accelerators onboard the DPU are being used.
Still, it works, and that is exactly the point. Without any x86 server in the path, we attach the SSDs to DPU1 in the AIC JBOX. We then use DPU1 to expose that storage to the network as block devices. DPU2 picks up those block devices and builds a ZFS RAID array from them, and that array can in turn serve storage out again. Again, this is network storage without a traditional server on either end.
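For those who want to picture the moving parts, here is a minimal sketch of the command flow. The IQNs, addresses, pool name, and RAID-Z layout are hypothetical, and we are assuming stock Linux tooling (targetcli/LIO on the target side, open-iscsi and ZFS on the initiator side) rather than any DPU-specific software:

```
# --- DPU1 (in the AIC JBOX): export a local NVMe SSD as an iSCSI target ---
targetcli /backstores/block create name=ssd0 dev=/dev/nvme0n1
targetcli /iscsi create iqn.2022-05.com.example:jbox-ssd0
targetcli /iscsi/iqn.2022-05.com.example:jbox-ssd0/tpg1/luns create /backstores/block/ssd0
targetcli /iscsi/iqn.2022-05.com.example:jbox-ssd0/tpg1/acls create iqn.2022-05.com.example:dpu2
# ...repeat for the second and third SSDs

# --- DPU2 (in the host server): log in to the targets and build the ZFS pool ---
iscsiadm -m discovery -t sendtargets -p 192.0.2.10   # DPU1's management IP (placeholder)
iscsiadm -m node --login                             # LUNs appear as /dev/sda, /dev/sdb, /dev/sdc
zpool create tank raidz /dev/sda /dev/sdb /dev/sdc   # three-drive RAID-Z pool (one example layout)
zfs create tank/demo                                 # filesystem that can then be shared back out
```

Every command here runs on one of the two Arm-based DPUs; the x86 host never touches the data path.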
The real purpose of this piece is to show the concepts behind next-generation infrastructure with DPUs. We purposefully stayed off the accelerators and the 100GbE networking on the NVIDIA BlueField-2 to show the stark contrast: DPU infrastructure built from concepts like iSCSI and ZFS that have been around for a long time and that people generally understand well. With erasure coding offloads and more built into next-generation DPUs, many of the older concepts, such as having local RAID controllers, are going to go away as the network storage model absorbs that functionality. Still, the goal was to show folks DPUs in a way that is highly accessible.
The next step for this series is showing DPUs doing NVMeoF storage using the accelerators and offloads as well as the 100GbE ConnectX-6 network ports. The reason we used this ZFS and iSCSI setup is that it shows many of the same concepts without going through the higher-performance software and hardware. We needed something to bridge the model folks are accustomed to and the next-generation architecture, and this was the idea. For the next version, we will have this:
Not only are we looking at NVIDIA’s solution, but we are going to take a look at solutions across the industry.
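As a rough preview of how the next installment maps onto the same model, the initiator side of an NVMeoF setup looks broadly similar from the command line. This is a sketch assuming the standard nvme-cli tooling over RDMA; the address, port, and NQN are placeholders rather than values from our lab:

```
# Discover NVMe-oF subsystems exported by the remote DPU (placeholder address/port).
nvme discover -t rdma -a 192.0.2.10 -s 4420

# Connect to one subsystem; the remote namespace then shows up as a local /dev/nvmeXnY device.
nvme connect -t rdma -a 192.0.2.10 -s 4420 -n nqn.2022-05.com.example:jbox-ssd0

nvme list
```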