The Xilinx Alveo U50 is a PCIe Gen4 (and CCIX) capable FPGA accelerator card that the company hopes will find its way into a variety of applications. With a single slot, low profile design that does not require extra PCIe power, the newest Alveo will fit into many servers that the company could not previously reach. This is Xilinx’s direct competitor to the NVIDIA Tesla T4 inferencing GPU.
Xilinx Alveo U50 Overview
The Xilinx Alveo U50 is still an UltraScale+ FPGA product, not the upcoming Versal ACAP. It is a low profile accelerator card with the FPGA, 8GB of HBM2 memory, and a QSFP28 connection for 100GbE applications. With a 75W maximum TDP and a 50W TDP for typical use, it is designed to be a competitive accelerator versus an NVIDIA Tesla T4. Xilinx did not provide pricing, but we expect it to compete with the T4 on price.
Xilinx has a number of partners and a number of domains that it is seeing take advantage of the Alveo line. The company is seeing ecosystem pickup with the Alveo line for a straightforward reason: PCIe cards are easy to consume. One can simply plug a Xilinx Alveo U50 into a server like the Dell EMC PowerEdge R740xd and have a device with a Linux driver ready to work. The traditional FPGA model was that one needed to get an FPGA, do some design for the PCB and basic I/O (e.g. decide which FPGA package pins would go to PCIe), get memory controller IP and network IP, and only then start to build the application. That meant a long time to value. With these pre-baked solutions, more companies can focus on domain acceleration rather than on just getting accelerator cards running.
Xilinx has several proof points, but we wanted to highlight the NVMeoF example. Here, using a Xilinx Alveo U50, an NVMeoF request can come in over the 100GbE interface, the FPGA can offload NVMeoF functions, it can do peer to peer NVMe access, and it can perform features such as compression and encryption in-line at low latency.
The advantage here is that one can stay largely out of the host system’s memory structures and provide much lower latency as a result. This is all while keeping CPU cores working on other applications.
Xilinx Alveo U50 Designed for Microservices
Xilinx is also quick to point out another Alveo U50 trend. Customers who deploy Alveo U50 PCIe cards on-prem or at the edge can bring Xilinx-based FPGA workloads to the cloud as well. Xilinx has been working to expand this footprint, mostly with AWS, and it gives a customer the ability to get FPGA acceleration either on its own metal or in someone else’s cloud.
In the microservices world, device driver support for the Xilinx Alveo U50 is extremely important. A Xilinx FPGA can be advertised on a host, and a container can be placed on that host to take advantage of the FPGA. For a flexible logic accelerator, the ability to easily move workloads onto and off of systems with accelerators is important.
To give a sense of how this can be used, a microservice may come under demand that requires scaling. Kubernetes can then look for a node with a Xilinx Alveo U50. It can then place the container onto that node to use the FPGA while also delivering the FPGA configuration payload. By doing this, one is leveraging the flexibility of the FPGA with the power of Kubernetes to orchestrate across potentially heterogeneous nodes.
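The placement flow described above can be sketched in a few lines of Python. This is a toy model, not Xilinx's or Kubernetes' actual implementation: the resource name `example.com/fpga-alveo-u50`, the `Node` class, and the `schedule` function are all hypothetical names chosen for illustration. The only real Kubernetes convention it leans on is that specialized devices are typically requested through an extended resource under a container's `resources.limits`.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical extended-resource name; a real device plugin defines its own.
FPGA_RESOURCE = "example.com/fpga-alveo-u50"


@dataclass
class Node:
    """Toy stand-in for a cluster node and the devices it advertises."""
    name: str
    # Resources the node advertises, e.g. {FPGA_RESOURCE: 1}.
    allocatable: dict = field(default_factory=dict)
    # Resources already claimed by containers placed on the node.
    allocated: dict = field(default_factory=dict)

    def free(self, resource: str) -> int:
        return self.allocatable.get(resource, 0) - self.allocated.get(resource, 0)


def build_pod_manifest(name: str, image: str, fpga_count: int = 1) -> dict:
    """Build a pod manifest that asks for FPGAs via an extended resource limit."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": name,
                "image": image,
                "resources": {"limits": {FPGA_RESOURCE: fpga_count}},
            }]
        },
    }


def schedule(pod: dict, nodes: list) -> Optional[str]:
    """Toy scheduler: place the pod on the first node with enough free FPGAs."""
    limits = pod["spec"]["containers"][0]["resources"]["limits"]
    wanted = limits.get(FPGA_RESOURCE, 0)
    for node in nodes:
        if node.free(FPGA_RESOURCE) >= wanted:
            # Record the claim so the same FPGA is not handed out twice.
            node.allocated[FPGA_RESOURCE] = node.allocated.get(FPGA_RESOURCE, 0) + wanted
            return node.name
    return None  # No node can satisfy the request; the pod stays pending.


if __name__ == "__main__":
    cluster = [Node("cpu-only"), Node("fpga-node", {FPGA_RESOURCE: 1})]
    pod = build_pod_manifest("accel-svc", "registry.example.com/accel-svc:latest")
    print(schedule(pod, cluster))
```

In this sketch the CPU-only node is skipped because it advertises no FPGA, and a second replica of the same pod would find no free device and remain unscheduled. Delivering the FPGA configuration payload is left out; how that happens depends on the vendor tooling.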
Xilinx Alveo Lineup August 2019
Just to give a sense of where this fits, this is the first Xilinx Alveo product that is a single slot, low profile solution. That means it physically fits in more servers. Further, with only a 75W TDP, it can be powered by the slot and does not require a separate PCIe power input.
The Xilinx Alveo U50 is also a PCIe Gen4 and CCIX enabled accelerator like the Alveo U280. We see it as no coincidence that Xilinx is releasing a PCIe Gen4 accelerator the day before the AMD EPYC 7002 “Rome” launch, the first x86 platform with PCIe Gen4.
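To put PCIe Gen4 in context next to the card's 100GbE port, here is a rough back-of-the-envelope calculation. It uses the published per-lane signalling rates (8 GT/s for Gen3, 16 GT/s for Gen4, both with 128b/130b encoding) and ignores packet and protocol overhead, so real throughput is lower. The lane widths shown are assumptions for illustration; the text above does not state the card's link width.

```python
# Per-lane signalling rates in GT/s; both generations use 128b/130b encoding.
GT_PER_S = {"gen3": 8.0, "gen4": 16.0}
ENCODING_EFFICIENCY = 128 / 130

# 100 Gb/s Ethernet line rate expressed in GB/s.
ETH_100G_GBYTES = 100 / 8


def pcie_gbytes_per_s(gen: str, lanes: int) -> float:
    """Raw one-direction link bandwidth in GB/s, before protocol overhead."""
    return GT_PER_S[gen] * ENCODING_EFFICIENCY * lanes / 8


if __name__ == "__main__":
    # Lane widths are illustrative assumptions, not the card's specification.
    for gen, lanes in [("gen3", 8), ("gen3", 16), ("gen4", 8), ("gen4", 16)]:
        bw = pcie_gbytes_per_s(gen, lanes)
        verdict = "above" if bw > ETH_100G_GBYTES else "below"
        print(f"{gen} x{lanes}: {bw:.2f} GB/s ({verdict} 100GbE line rate)")
```

Under these assumptions, a Gen4 x8 link offers the same raw bandwidth as Gen3 x16, and either one exceeds the 12.5 GB/s that a single 100GbE port can carry, while Gen3 x8 does not.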
By using HBM2 memory instead of DDR4 like the other Alveo cards, Xilinx sacrifices capacity but gains a very flexible form factor.
We are still not at the point where one can order servers with Xilinx Alveo U50 cards and seamlessly run 100GbE networking on the device while immediately integrating acceleration across a broad portfolio of workloads. This is coming, and we see Xilinx's planned acquisition of Solarflare for SmartNIC capabilities as perhaps the next step in that evolution. Still, providing standardized form factors can reduce time to value by several quarters and makes the solution deployable in places the Xilinx Alveo line could not previously reach.