NVIDIA A40 Performance
In terms of performance, the NVIDIA A40 has been out for quite some time at this point, but we wanted to show a different view than what is publicly out there by using multi-GPU systems. We noticed slight variances between individual GPUs, and between GPUs in the larger 8x and 10x GPU systems that we reviewed, such as the ASUS ESC8000A-E11.
Also, the Tyan Thunder HX FT83A-B7129 saw variances across our quick deep learning benchmarks.
In terms of performance, this is a rough guide: a PCIe NVIDIA A100 doing training will be around twice as fast, and the top-end 500W SXM4 80GB A100s, as we tested in Liquid Cooling Next-Gen Servers Getting Hands-on with 3 Options, are roughly 2.4-2.5x as fast on something smaller like ResNet-50 training. That delta can grow as one uses more memory and NVLINK with larger models.
Still, the real reason one uses NVIDIA A40s is not necessarily training performance. Instead, they tend to sell for much less than an NVIDIA A100 SXM4 solution, while at the same time providing vGPU features for solutions such as VDI/virtual workstations.
While we did not have NVLINK bridges, here is what eight of these GPUs look like in an AMD EPYC system without PCIe switches. As we can see, we simply have an 8x PCIe link topology.
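For readers replicating this on their own systems, the driver's view of the link topology can be checked with a standard `nvidia-smi` invocation. A minimal sketch; the guard simply lets the snippet exit cleanly on machines without the NVIDIA driver loaded:

```shell
# Print the GPU interconnect topology matrix (PCIe links, NUMA/CPU affinity).
if command -v nvidia-smi >/dev/null 2>&1; then
  TOPO="$(nvidia-smi topo -m)"
else
  TOPO="nvidia-smi not found; run on a system with the NVIDIA driver loaded"
fi
printf '%s\n' "$TOPO"
```

On the 8x A40 EPYC system above, each GPU shows up with a direct PCIe link rather than hops through a PCIe switch.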
This is certainly different from the NVIDIA M10s from a few generations prior that had four GPUs per card. Here are eight M10 GPUs, using two M10 PCIe cards (four GPUs per card):
One of the great things is that there is no longer a need for a more complex PCIe architecture for a VDI card like this.
Next, let us get to power consumption before getting to our final words.
NVIDIA A40 Power Consumption
The power consumption of NVIDIA's data center GPUs tends to behave very differently from Intel CPUs, and sometimes AMD CPUs. Specifically, the cards have a power cap, and if you run them at maximum utilization, they will effectively try to hit their caps (you can also use nvidia-smi to set lower caps for lower-power/lower-performance operation). So saying that these consume 296-300W at 100% utilization is very safe. Instead, we wanted to show the idle power consumption of the 16 units we had in two different servers. As passively cooled cards, these do not have fans to spin up, so the idle figure is just for the compute resources. Here is one set:
Here is another.
Based on this, we saw roughly 25-31W at idle across the sixteen different GPUs. That is only part of the equation, though. Between power supply losses and the chassis fans running to cool the GPUs, a system can use 20%+ more power. We had a piece where we investigated several factors that impact server power consumption.
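As noted above, nvidia-smi can both read per-GPU power draw and set lower power caps. A minimal sketch of both; the 250W cap shown in the comments is an illustrative value below the A40's 300W default, and setting limits requires root:

```shell
# Read per-GPU power draw in watts, as we did for the sixteen cards above.
if command -v nvidia-smi >/dev/null 2>&1; then
  IDLE="$(nvidia-smi --query-gpu=index,power.draw --format=csv,noheader,nounits)"
  # Optionally cap the cards below the 300W default (requires root):
  #   nvidia-smi -pm 1     # persistence mode so the limit sticks
  #   nvidia-smi -pl 250   # apply an illustrative 250W limit to all GPUs
else
  IDLE="unavailable (no NVIDIA driver on this machine)"
fi
printf '%s\n' "$IDLE"
```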
An important aspect of adding an NVIDIA A40 to a system is that while the GPU itself uses 300W, in the context of an overall system it can easily add 360-400W of power draw at the PDU, depending on the server's power supply efficiency and cooling. Given how much power is being consumed by cooling, many next-gen systems, and even A100 systems, are turning to liquid cooling, which is why we have been focusing on that recently.
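The 360-400W figure can be sanity-checked with some rough arithmetic. The PSU efficiency and incremental fan wattage below are assumptions for illustration, not measured values:

```shell
# Rough wall-power estimate for one 300W A40 in a server.
GPU_W=300        # A40 power cap
PSU_EFF=0.94     # assumed PSU efficiency at load (80 PLUS Platinum-class)
FAN_W=50         # assumed incremental chassis fan power to cool the card
WALL_W=$(awk -v g="$GPU_W" -v e="$PSU_EFF" -v f="$FAN_W" \
  'BEGIN { printf "%.0f", g / e + f }')
echo "Estimated draw at the PDU: ${WALL_W}W"
```

With these assumed numbers, the estimate lands in the middle of the 360-400W range; less efficient PSUs or harder-working fans push it toward the top.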
If you are looking for an NVIDIA GPU for VDI workloads, this is really the top option for data centers that can handle passively cooled cards. 48GB is a lot of memory to split among VMs. The NVIDIA A40 is an Ampere part, but realistically, higher-end training will have folks looking to the NVIDIA A100s with NVLINK. The other side is that these capable GPUs can handle VDI workloads during the day and then be used for GPU compute in the evening.
For our regular readers wondering, yes, this was done just before GTC 2022 as we had a few backlog items to get through before then.