NVIDIA A10 A16 A4000 and A5000 Launched

NVIDIA A16 Front View

NVIDIA is launching four new Ampere-based GPU models at GTC 2021. We are focusing on the PCIe versions, not the notebook versions. Let us get into the details of the quartet of new GPUs. Please note, this is being written during a keynote, so we may update this as more information is released.

NVIDIA A10 and A16 GPUs for Data Centers

The NVIDIA A10 and A16 lack display outputs, so these are aimed at the data center. In previous generations, these would have had names such as “NVIDIA Tesla” or “NVIDIA GRID”, but those names have been retired as NVIDIA blends its data center and workstation lines.


The NVIDIA A10 is a single-slot GPU designed to offer an uplift over the current NVIDIA T4, but with a larger physical footprint and higher power consumption. NVIDIA had a full-height 150W version of the T4 ready that was not released publicly, and this seems to follow in that line.

FP32 31.2 TF
TF32 Tensor Core 62.5 TF | 125 TF*
BFLOAT16 Tensor Core 125 TF | 250 TF*
FP16 Tensor Core 125 TF | 250 TF*
INT8 Tensor Core 250 TOPS | 500 TOPS*
INT4 Tensor Core 500 TOPS | 1000 TOPS*
RT Cores 72
Encode / Decode 1 encoder, 1 decoder (+AV1 decode)

GPU Memory 24 GB GDDR6
GPU Memory Bandwidth 600 GB/s
Interconnect PCIe Gen4: 64 GB/s
Form Factor 1-slot FHFL
Max TDP Power 150W
vGPU Software Support NVIDIA vPC/vApps, NVIDIA RTX vWS, NVIDIA Virtual Compute Server (vCS)

Secure and Measured Boot with Hardware Root of Trust Yes
NEBS Ready Level 3
Power Connector PEX 8-pin

The * denotes figures with sparsity, so NVIDIA is getting aggressive with performance claims here.
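To put the starred figures in context, here is a minimal sketch that compares each dense figure in the A10 table with its starred counterpart. It uses only the numbers listed above. The dictionary and function names are our own, chosen for illustration.

```python
# Dense and starred (sparsity) throughput figures copied from the A10 table.
# Units are TFLOPS for floating-point formats and TOPS for integer formats.
A10_TENSOR = {
    "TF32": (62.5, 125),
    "BFLOAT16": (125, 250),
    "FP16": (125, 250),
    "INT8": (250, 500),
    "INT4": (500, 1000),
}


def sparsity_ratio(dense, sparse):
    """Return how many times larger the starred figure is than the dense one."""
    return sparse / dense


ratios = {fmt: sparsity_ratio(d, s) for fmt, (d, s) in A10_TENSOR.items()}

for fmt, ratio in ratios.items():
    print(f"{fmt}: starred figure is {ratio:.1f}x the dense figure")
```

Every starred figure in the table is exactly double its dense counterpart, so workloads that cannot use sparsity should be compared against the first number in each row.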

Overall, this is not going to be the fastest GPU, but a single-slot GPU is simply needed in some systems, or is desirable compared to dual-slot GPUs like the A100s we saw in our recent ASUS RS720A-E11-RS24U Review.


The second GPU being announced is the NVIDIA A16. This puts four Ampere GPUs, each with 16GB of memory, on a single PCIe card. If you saw our NVIDIA GRID M40 with 4x Maxwell GPUs and 16GB RAM cards piece, you will see the lineage back to Maxwell.

Feature Type A16
GPUs/board Architecture 4 GPUs on one board – NVIDIA Ampere
Memory Size 64 GB GDDR6 (16 GB per GPU)
vGPU Software Support NVIDIA Virtual PC (vPC), NVIDIA Virtual Applications (vApps), NVIDIA RTX Workstation (vWS), NVIDIA Virtual Compute Server (vCS)

vGPU Profiles (GB) 1, 2, 4, 8, 16
Media Acceleration 4x NVENC, 8x NVDEC
Video Codec (Encode) H.264/H.265 (+4:4:4)
Form Factor FHFL Dual Slot
Max Power Consumption 250W
Graphics Bus PCIe Gen 4
NEBS ready Yes
Power Connector 8-pin CPU

The primary market for this type of GPU is VDI. With four smaller GPUs on one card, one can give VMs a physical GPU each, or parcel out each of these smaller GPUs to multiple users.
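As a rough illustration of what that means for density, the sketch below estimates how many users one A16 board could host at each vGPU profile size from the table above. It assumes frame buffer is the only limit and that each physical GPU is split evenly into one profile size. Those are our simplifying assumptions, and real limits depend on the vGPU software release and licensing. The names are hypothetical.

```python
# Figures copied from the A16 table above.
GPUS_PER_BOARD = 4
MEMORY_PER_GPU_GB = 16
VGPU_PROFILES_GB = [1, 2, 4, 8, 16]


def users_per_board(profile_gb):
    """Estimate users per A16 board for one profile size.

    Assumption: each physical GPU is divided evenly into vGPUs of the
    same size, and frame buffer is the only constraint.
    """
    users_per_gpu = MEMORY_PER_GPU_GB // profile_gb
    return users_per_gpu * GPUS_PER_BOARD


density = {profile: users_per_board(profile) for profile in VGPU_PROFILES_GB}

for profile, users in density.items():
    print(f"{profile} GB profile: up to {users} users per board")
```

Under these assumptions, the 16 GB profile corresponds to giving each of four VMs a whole physical GPU, while the 1 GB profile spreads the board across 64 users.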

NVIDIA A4000 and A5000 GPUs

One of the big differentiators between the A10 and A16 and these A4000 and A5000 GPUs is that the A10 and A16 do not have display outputs, while the A4000 and A5000 do. We can think of the A4000 and A5000 GPUs as coming from the line formerly called “NVIDIA Quadro”.


The NVIDIA A4000 is the lower-end GPU of the two. It draws only 140W and has less memory than the A10, but it adds four DisplayPort outputs.

Architecture NVIDIA Ampere Architecture
Foundry Samsung
Process Size 8nm
Transistors 17.4 billion
Die Size 392.5 mm²
CUDA Parallel Processing cores 6,144
NVIDIA Tensor Cores 192
NVIDIA RT Cores 48
Single-Precision Performance 19.2 TFLOPS
RT Core Performance 37.4 TFLOPS
Tensor Performance 153.4 TFLOPS
GPU Memory 16 GB GDDR6 with ECC
Memory Interface 256-bit
Memory Bandwidth 448 GB/s
Max Power Consumption 140W
Graphics Bus PCI Express 4.0 x16
Display Connectors DP 1.4 (4)
Form Factor 4.4” H x 9.5” L Single Slot
Product Weight 500 g
Thermal Solution Active
NVIDIA® 3D Vision® and 3D Vision Pro Support via 3 pin mini DIN
Frame lock Compatible (with Quadro Sync II)
Power Connector 1x 6-pin PCIe
NVENC | NVDEC 1x | 1x (+AV1 decode)

Perhaps the big one here is that this GPU has an active cooler, so it is more aligned to the workstation market than the A10 and A16, which are more for data centers.


The NVIDIA A5000 is the bigger of the two GPUs being launched for the workstation market. This has more compute elements, more memory, and higher power consumption than the A4000.

Architecture NVIDIA Ampere Architecture
Foundry Samsung
Process Size 8nm
Transistors 28.3 billion
Die Size 628.4 mm²
CUDA Parallel Processing cores 8,192
NVIDIA Tensor Cores 256
NVIDIA RT Cores 64
Single-Precision Performance 27.8 TFLOPS
RT Core Performance 54.2 TFLOPS
Tensor Performance 222.2 TFLOPS
GPU Memory 24 GB GDDR6 with ECC
Memory Interface 384-bit
Memory Bandwidth 768 GB/s
Max Power Consumption 230W
Graphics Bus PCI Express 4.0 x16
Display Connectors DP 1.4 (4)
Form Factor 4.4” H x 10.5” L Dual Slot
Product Weight 1.025 kg
Thermal Solution Active
vGPU Software Support NVIDIA® Virtual PC/Virtual Applications (vPC/vApps), NVIDIA RTX® Virtual Workstation (vWS), NVIDIA Virtual Compute Server (vCS)

vGPU Profiles Supported See vGPU Pricing & Licensing Guide
NVIDIA® 3D Vision® and 3D Vision Pro Support via 3 pin mini DIN
Frame lock Compatible (with Quadro Sync II)
NVLink 2-way low profile (2-slot and 3-slot bridges), connect 2x RTX A5000

NVLink Interconnect 112.5 GB/s (bidirectional)
Power Connector 1x 8-pin PCIe
NVENC | NVDEC 1x | 2x (+AV1 decode)

We will note here that NVIDIA says virtualization support for the RTX A5000 will be available in an upcoming NVIDIA virtual GPU (vGPU) release.

A quick note here is that the A5000 also has display outputs and an active cooler like the A4000.
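The two workstation spec tables above also allow a few quick sanity checks. The sketch below derives the per-pin memory data rate, an implied GPU clock, and FP32 throughput per watt from the listed figures. The clock estimate assumes each CUDA core retires one fused multiply-add (two FLOPs) per clock, which is our assumption rather than something stated in the tables. The function and dictionary names are hypothetical.

```python
# Figures copied from the A4000 and A5000 tables above.
SPECS = {
    "A4000": {"cuda_cores": 6144, "fp32_tflops": 19.2,
              "bandwidth_gbs": 448, "bus_bits": 256, "power_w": 140},
    "A5000": {"cuda_cores": 8192, "fp32_tflops": 27.8,
              "bandwidth_gbs": 768, "bus_bits": 384, "power_w": 230},
}

# Assumption: one fused multiply-add (2 FLOPs) per CUDA core per clock.
FLOPS_PER_CORE_PER_CLOCK = 2


def per_pin_gbps(bandwidth_gbs, bus_bits):
    """Memory data rate per pin (Gbps) implied by bandwidth and bus width."""
    return bandwidth_gbs * 8 / bus_bits


def implied_clock_ghz(fp32_tflops, cuda_cores):
    """GPU clock (GHz) implied by the FP32 figure under the FMA assumption."""
    flops = fp32_tflops * 1e12
    return flops / (cuda_cores * FLOPS_PER_CORE_PER_CLOCK) / 1e9


def gflops_per_watt(fp32_tflops, power_w):
    """FP32 GFLOPS per watt at the listed maximum power consumption."""
    return fp32_tflops * 1000 / power_w


for name, s in SPECS.items():
    pin = per_pin_gbps(s["bandwidth_gbs"], s["bus_bits"])
    clock = implied_clock_ghz(s["fp32_tflops"], s["cuda_cores"])
    eff = gflops_per_watt(s["fp32_tflops"], s["power_w"])
    print(f"{name}: {pin:.0f} Gbps/pin, ~{clock:.2f} GHz, {eff:.0f} GFLOPS/W")
```

By this arithmetic the listed bandwidths correspond to 14 Gbps per pin on the A4000 and 16 Gbps per pin on the A5000, and the implied clocks come out at roughly 1.56 GHz and 1.70 GHz under the stated assumption.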

Honorable Mentions

We also wanted to point out that NVIDIA has notebook versions of the A4000 and A5000 GPUs, along with the A3000 and T300/T1200 GPUs, launching today. We do not normally cover these, but we will simply mention them.

Final Words

This is a smart move by NVIDIA. It can release differentiated GPUs into the data center and professional markets thereby selling GPUs at higher ASPs than on the consumer side. Given the global GPU shortage, we expect all of these models to sell well.

Update 2021-04-14: We have a post-GTC 2021 keynote video if you want to hear about some of the other announcements:



  1. I can see it now. A headless Threadripper Pro box using GB or Asus board with seven A10 single slot GPU compute cards. Who says good things don’t come in slim packages?

  2. Why didn’t Nvidia announce those GPUs during GTC 2021 keynote? Also missing is 3080Ti/3070Ti news.

    Looking at the A4000, a 6pin connector is a disappointment, it means it is pulling 65W from PCI-e bus, which makes multiple single-slot A4000 a no-no. Even RTX4000 had an 8pin connector.

    Meanwhile the A16 is basically 4x low power A4000s on a PCI-e switch. So Nvidia is forcing multiGPU GA104 users to use A16 instead of multiple A4000s. The A16 is a monster gpu for low power inferencing.

  3. Now that I think about it. The A16 could be 4x 128bit unannounced GA107 or it could be 4x 256bit GA104s. Hard to tell at this time. It would be cool if it were 4x GA104s, but 250W TDP tells me it is more likely 4x GA107 with 2x 2gb samsung chips per vram channel

  4. emerth,

    or for computer aided tomography 3D in a VM when your main display is attached to a custom board that keeps independent LUTs and gamma for each application as offered by Eizo, Barxo, Totoku / JVC and LG, just to list the 4800 by 3200 px 12.9MP 3:2 screens that I need to fall off the back of a truck and into my desk…

  5. Hi Sales,

    Good day.
    This is Ivan from Contacthings solution base in Penang.
    Our customer is looking for Nvidia A10 GPU, can you quoted me with reseller price, ex-work and share lead time.
    NVIDIA A10 GPU Computing Accelerator – 24GB GDDR6 – PCIe 4.0 x16 – Passive Cooler (thinkmate.com)


