AMD Xilinx VCK5000 AI Accelerator Launched

AMD Xilinx VCK5000 Cover

The new AMD Xilinx VCK5000 is an accelerator card designed specifically for AI development. For those who have not seen it yet, we are going to be using the AMD-Xilinx name for some time post-acquisition. The card is designed to offer AI acceleration via Xilinx’s AI accelerators and programmable logic.

AMD Xilinx VCK5000 AI Accelerator Launched

In terms of the overview, the card is built around a 7nm Versal ACAP. With the ACAP, we get not only traditional programmable logic but also AI accelerators. Xilinx says its new solution is designed to hit up to 125 INT8 TOPS. INT8 is very popular and is something we specifically looked at in our “AWS EC2 m6 Instances Why Acceleration Matters” piece.

AMD Xilinx VCK5000 Product Overview

Beyond just the AI accelerators and programmable logic, Xilinx’s value proposition goes a step further. Xilinx can use the local FPGA memory and programmable logic to do multiple actions. For video, this may mean ingesting and decoding a 4K video stream, doing coarse object detection (determining which objects warrant further classification), then doing other data transformation and inference tasks.
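To make the staged flow concrete, here is a minimal sketch in plain Python of a decode, coarse detection, then classification pipeline. It is not Xilinx code and does not use any Xilinx API: the `Region` type, the stage functions, and the 0.5 threshold are all hypothetical stand-ins chosen for illustration. The point it shows is that the coarse stage filters candidates so the more expensive step only runs on the ones that warrant it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Region:
    """A candidate object in a frame (hypothetical structure for illustration)."""
    label_hint: str
    score: float  # coarse detector confidence, between 0 and 1


def decode(stream: List[List[Region]]) -> List[List[Region]]:
    """Stand-in for video decode: each 'frame' is already a list of regions."""
    return stream


def coarse_detect(frame: List[Region], threshold: float = 0.5) -> List[Region]:
    """Keep only the regions that warrant further classification."""
    return [r for r in frame if r.score >= threshold]


def classify(region: Region) -> str:
    """Stand-in for the more expensive fine-grained inference step."""
    return f"classified:{region.label_hint}"


def run_pipeline(stream: List[List[Region]], threshold: float = 0.5) -> List[List[str]]:
    """Run decode -> coarse detection -> classification on every frame."""
    results = []
    for frame in decode(stream):
        survivors = coarse_detect(frame, threshold)
        results.append([classify(r) for r in survivors])
    return results


# Made-up input: two frames with three candidate regions in total.
frames = [
    [Region("car", 0.9), Region("shadow", 0.2)],
    [Region("person", 0.7)],
]
print(run_pipeline(frames))
# [['classified:car'], ['classified:person']]
```

In this toy input, the low-scoring "shadow" region never reaches the classifier, which is the kind of saving the multi-step approach is after.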

AMD Xilinx VCK5000 Real AI Deployment In Multiple Steps

Since Xilinx can do all of that on its FPGA, it is able to claim higher utilization. Xilinx uses the term “Dark Silicon,” but another way to think about it is claimed TOPS versus achieved TOPS. Xilinx took NVIDIA’s published numbers and performance claims and says that while theoretical TOPS may sound great, the cards are actually achieving less than half of that theoretical limit.

AMD Xilinx VCK5000 Estimated Dark TOPS

AMD Xilinx claims that the new VCK5000 can hit up to 90% of its theoretical TOPS. Further, it thinks that there is room to get that percentage even higher.
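As a quick worked example of claimed versus achieved TOPS, the sketch below multiplies a peak figure by a utilization fraction. It uses only the two numbers quoted in this article (up to 125 INT8 TOPS and up to 90% utilization). The function names are ours, and the model is a deliberately simple one, assuming utilization applies uniformly to the peak figure.

```python
def achieved_tops(peak_tops: float, utilization: float) -> float:
    """Return delivered throughput given peak TOPS and a utilization fraction."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return peak_tops * utilization


def dark_tops(peak_tops: float, utilization: float) -> float:
    """Return the part of peak TOPS that goes unused ("dark")."""
    return peak_tops - achieved_tops(peak_tops, utilization)


# Figures quoted in the article: up to 125 INT8 TOPS, up to 90% utilization claimed.
vck5000_peak = 125.0
vck5000_util = 0.90

print(achieved_tops(vck5000_peak, vck5000_util))  # 112.5
print(dark_tops(vck5000_peak, vck5000_util))      # 12.5
```

By the same arithmetic, a card achieving less than half of its theoretical limit would leave more than half of its headline TOPS dark, which is the contrast Xilinx is drawing.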

AMD Xilinx VCK5000 Estimated Dark TOPS Versus NVIDIA

This is a comparison worth viewing critically. For example, many would use the A100 PCIe for inference at lower power, and SRP versus street pricing is always different. The VCK5000 is shown in a lower power and performance mode here, since it can scale up to 225W.

AMD Xilinx VCK5000 Performance Per Watt

Still, this is a key development in showing Xilinx’s story in the AI space, and a big value that we expect AMD to drive further.

Final Words

Xilinx also has its Vitis frameworks and other development tools so that the card can be plugged into TensorFlow, PyTorch, and other frameworks. There are also companies like Aupera building video analytics frameworks to help with tasks such as ingesting and processing the video and getting it ready for the inferencing pipeline.

AMD Xilinx VCK5000 Availability

These should be available on the Xilinx website for $2,745. However, we suspect Xilinx will have some sort of developer program around them as well.

2 COMMENTS

  1. One of the amazing things about AMD’s current GPU offerings (other than the hardware itself) is the fully open-source ROCm stack. My suspicion is that being able to customize the open-source stack was an important feature for the recent national exascale supercomputer contracts.

    To what extent is the Xilinx software open source? Are there plans to follow the same open-source model as with the AMD GPUs?

  2. Is ROCm currently focused only on GPUs? I recall that Xilinx developed a triSYCL project, and someone was attempting to make it work as a dpc++ backend. Would AMD duplicate all that effort to make it work with ROCm, or just move to oneAPI/dpc++?
