Using DPUs Hands-on Lab with the NVIDIA BlueField-2 DPU and VMware vSphere Demo
Many VMware administrators are familiar with a basic networking trade-off in a virtualized environment. Traditionally, one can use the vmxnet3 driver for networking and get all of the benefits of having the hypervisor manage the network stack. That provides the ability to do things like live migrations, but at a cost in performance. For performance-oriented networking, many use pass-through NICs, which deliver performance but make migration difficult. With UPT (Uniform Passthrough) we get the ability to do migrations while keeping performance close to pass-through. The challenge with UPT is that the hardware and software stack both need to support it, and in this BlueField-2 DPU environment, we finally have that.
The demo environment we are using today is part of NVIDIA LaunchPad. If you are looking at deploying a DPU-based vSphere solution, you can request access to the environment and go through this demo. There are limited spaces due to how much hardware is tied up in each environment. Still, the hope is that by STH doing this, folks can get a sense of what the solution looks like.
As you can see, the nodes provided are Dell PowerEdge R750s with Intel Xeon Gold 6354 CPUs, and most importantly, under the DPU category of hardware, we have the NVIDIA BlueField-2. As a quick note, the PowerEdge R760 review on STH will be live fairly soon.
One of the first things we need to do is to create a Distributed Switch. Ours is a bit different since we are configuring the switch with the BlueField DPUs. There are fairly standard steps like adding our three hosts.
We then need to select the adapters. The Dell PowerEdge R750s have other NICs. Since we are making a DPU-enabled virtual switch, the other NICs are listed as incompatible.
The BlueField-2 DPUs are listed as compatible.
As we can see, the switch now has BlueField capabilities.
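The steps above are all in the vSphere Client, but you can also confirm from an ESXi host's shell that the BlueField uplinks and the DPU-backed switch are visible. This is a rough sketch: `has_bluefield` is our own little helper, and the `command -v` guard makes the snippet a no-op on machines without `esxcli`:

```shell
# Our own helper: succeeds if the NIC listing on stdin mentions a BlueField adapter.
has_bluefield() {
  grep -qi 'bluefield'
}

# esxcli only exists on ESXi hosts, so guard before calling it.
if command -v esxcli >/dev/null 2>&1; then
  esxcli network nic list | has_bluefield && echo "BlueField uplink present"
  esxcli network vswitch dvs vmware list   # the DPU-backed vDS should be listed
fi
```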
Something that NVIDIA does not have in its demo, but is a big differentiation point, is that VMware is building a BlueField-specific integration into its environment to make this all work. While VMware initially announced Intel DPU support, and there are many other DPU vendors out there, VMware only has integrations for NVIDIA and AMD DPUs at this point.
The next step is going to VMware NSX. DPUs in many ways make the NSX vision of managing networking off of the network switch more of a reality.
Here we can see that we have an NSX overlay that is being powered by the BlueField-2 DPUs.
One important aspect is that each node needs a BlueField-2 DPU for this to work. Without one, a host lacks the hardware capabilities to make the overlay work with the underlying hardware. This is an important point. If you are planning to use DPU features later in 2023 or 2024, but you want nodes that you are deploying in early 2023 to participate, then those nodes need to have BlueField-2 cards today, even if they are only being used as ConnectX-6 networking until the new capabilities are enabled.
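Assuming SSH is enabled on the ESXi hosts, a quick loop like the one below can confirm that every node you want in the overlay actually has a BlueField-2 installed. The host names are placeholders for your own inventory:

```shell
# Placeholder host names; replace with your own ESXi hosts.
for host in esxi-01 esxi-02 esxi-03; do
  if ssh -o BatchMode=yes -o ConnectTimeout=3 "root@${host}" \
       esxcli hardware pci list 2>/dev/null | grep -qi 'bluefield'; then
    echo "${host}: BlueField-2 present"
  else
    echo "${host}: no DPU found, cannot join the DPU-backed overlay"
  fi
done
```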
With the NSX overlay complete, we can go back to our vSphere environment.
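Before heading back, a common way to sanity-check the overlay transport between hosts is `vmkping` from an ESXi shell. The vmk interface name and peer TEP address below are assumptions for this environment, and the guard makes the snippet a no-op off ESXi:

```shell
# vmkping only exists on ESXi; skip cleanly elsewhere.
if command -v vmkping >/dev/null 2>&1; then
  # Ping a peer host's TEP over the overlay (vxlan) netstack with a large,
  # unfragmented payload to also validate the jumbo-frame MTU path.
  # vmk10 and 192.168.100.12 are placeholders for this environment.
  vmkping ++netstack=vxlan -I vmk10 -d -s 1572 192.168.100.12
fi
```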
Now we can create a VM that we are going to call “STH Test”. Here we can see that the first network adapter does not have the option, but Network adapter 2 offers the option to Use UPT Support. We will also connect this adapter to the NSX overlay.
Looking at the VM we just created, we will quickly notice that UPT is not activated.
Going into the Ubuntu instance, we have an older vmxnet3 driver version. That version does not support UPT for the Ubuntu VM, so we need a newer version.
Upgrading to a newer vmxnet3 driver version gives us UPT capabilities.
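Inside the guest, `modinfo vmxnet3` shows what is loaded. A small sketch like the one below can gate UPT enablement on a minimum driver version. `version_ge` is our own helper, and the "1.7.0.0" minimum is a placeholder rather than VMware's documented requirement:

```shell
# Our own helper: dotted-version compare via GNU sort -V.
# Succeeds when $1 >= $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Read the loaded vmxnet3 driver version inside the Ubuntu guest.
installed="$(modinfo vmxnet3 2>/dev/null | awk '/^version:/ {print $2}')"
required="1.7.0.0"   # placeholder minimum; check VMware's docs for the real value

if version_ge "${installed:-0}" "$required"; then
  echo "vmxnet3 driver new enough for UPT"
else
  echo "vmxnet3 driver too old for UPT, upgrade it first"
fi
```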
Looking at the VM again, our UPT warning is now gone because the VM’s driver now supports UPT.
With UPT enabled, we can still migrate using the networks we set up earlier.
The setup is not automatic, and it takes some time to work through, but it is fairly straightforward, and if you do this LaunchPad demo, NVIDIA has documentation to guide you.
Next, let us take a look at the performance and impacts of this.