NVIDIA L4 24GB Review: The Versatile AI Inference Card

NVIDIA L4 Front

Today we get a fun review, taking a look at the NVIDIA L4. While not NVIDIA’s fastest GPU by any means, the L4 is going to be a popular card. We know this because it is effectively an update to the NVIDIA Tesla T4 (these cards were still “Tesla” branded back then) that we reviewed in 2019. The NVIDIA T4 has been wildly popular, and since the L4 builds on that proven format, we expect it to be equally so.

Before we get to the review, we just wanted to give a quick thanks to PNY for actually getting us a card for review. NVIDIA GPUs are very popular and PNY is NVIDIA’s major partner for professional cards. With that, let us get to the hardware.

NVIDIA L4 24GB Overview

The NVIDIA L4 is a low-profile, half-height GPU. That is perhaps the most important spec of the card since it allows the card to be installed in all kinds of servers.

NVIDIA L4 Front Connector Up

The backplate is another small but important feature. These cards often get installed in risers that are removed for service. Having the backplate helps ensure the card stays safe during service.

NVIDIA L4 Back Connector Up

With NVIDIA’s Ampere and later design elements, the gold card is easily recognizable. Passive cooling on the lower-power card also helps ensure it can be used in a wide array of systems since it can just use chassis airflow.

NVIDIA L4 Angle

Since this is a GPU designed primarily for AI inference (although there are some other data center GPU use cases it services) we do not have display outputs. As these cards get densely packed in servers due to their small size, having more display outputs can be an issue in some OSes as we found many years ago doing 8x GPU systems.

NVIDIA L4 PCIe Bracket Side

The new design has retention bracket mounting holes on the front of the card. One item that is not present is a GPU power connector. Being solely PCIe slot powered is another feature that helps this class of GPU be easily integrated into many types of servers.

NVIDIA L4 Rear

Next, given the popularity of the NVIDIA (Tesla) T4, we wanted to do a quick side-by-side so you can see the differences, since ease of integration into servers is a major feature of both cards.

NVIDIA L4 and T4 Side-by-Side

Here are the two cards. We did not have a T4 with a low-profile bracket on hand. The four we found all had full-height brackets. The ones with low-profile brackets were all installed in data centers.

NVIDIA L4 And NVIDIA T4 Front 2

Here is the back side. We can see the NVIDIA L4 has much less NVIDIA branding than the old “NVIDIA Tesla T4”.

NVIDIA L4 And NVIDIA T4 Back

Both are a similar size; however, the ribbing on the L4 seems to make the front of the card slightly wider.

NVIDIA L4 And NVIDIA T4 Front 1

Here is a look at the airflow view if that helps. One can see that the new card has an improved thermal solution.

NVIDIA L4 And NVIDIA T4 Rear

Next, let us get to key specs and then performance and power.

12 COMMENTS

  1. I believe this card is sized just about perfectly to fit into a TinyMiniMicro node. I just can’t justify spending thousands of dollars extra on a GPU like this for home use when I could use a consumer-oriented 3090 or 4090 for a fraction of the cost in a different system. That datacenter markup annoys me, but Nvidia certainly loves it. Obviously, the main problem would be cooling the card outside of a server chassis… so it would probably be thermally throttled, even if you pointed a fan at it, but for hobbyist use, it would probably still be better than anything else you can fit into a TMM node.

  2. Josh – there were 3D printed 40mm brackets for the T4. Those do not fit the L4 due to the size of the inlet side. My sense is there will be a new round for the L4 as they become more popular.

  3. On the GF/W metric it’s excellent. Also the GF/slot metric.

    Josh – why would you want to do that kind of thing in a tiny mini micro form factor? You could not find a form factor with worse thermal management challenges.

  4. Patrick Kennedy – Do you think you’ll ever see an Intel Data Center GPU Flex 140 to review? They supposedly don’t need a license to unlock SR-IOV capabilities. This will be interesting to compare against an L4 for VDI stuff.

  5. @Josh “That datacenter markup annoys me”
    Nvidia’s license agreement forbids deploying GeForce cards in data centers. IMO the only raison d’être of those products is fabricated from this licensing restriction. There is no reason to buy these for hobby usage.

  6. @emerth @yamamoto The reason to buy these in hobby usage and such a form factor is the same reason this site focuses on TinyMiniMicro. For home use, 1L PCs are an excellent size for a server. Using something massive is possible but unappealing, and it is extremely hard to find HHHL GPUs with any significant amount of VRAM for experimenting with ML at home. Each person’s use cases are different.

  7. I’m interested in a card like this precisely so that I can avoid needing a 3090 (or multiple) to get the needed VRAM for ML inference on large models. Where is the workstation equivalent that has video output and a fan? The RTX 4000 Ada SFF comes close but still has less VRAM.

  8. @Josh – home use, home lab, all great. I do it too. Just, you will find it a lot easier to cool them in a case where you have the room to build ducts and fan mounts. If you just point a fan at them you will fail, they require real forced air.

  9. A card like this would be a slam dunk for the Thinkstation P360 Ultra. It’s bigger than TinyMiniMicro, technically, but it’s in the same class as the Z2 Mini G9 and Precision 3260. Plus, the P360 Ultra has a dedicated fan for MXM/CPU2; this means there’s a Lenovo-designed blower that can be adapted to this GPU!
