Today, NVIDIA did something that borders on crazy, beyond dropping its dev kit price by half to $249. As of the newest JetPack software release, the NVIDIA Jetson Orin Nano can have up to 50% more memory bandwidth and 57-70% more INT8 TOPS. Typically, adding these features would be a solid upgrade reserved for the next-gen Jetson Thor generation, which had been slated for Q1 2025. The eyebrow-raising part is that the upgrade comes to existing platforms as well.
NVIDIA Jetson Orin Nano Gets a HUGE Upgrade to Super
Putting the first interesting stat out there, the NVIDIA Jetson Orin line is getting a 57-70% performance increase in INT8 TOPS across the range, from the Jetson Orin Nano 4GB to the Jetson Orin NX 16GB.
The “new” NVIDIA Jetson Orin Nano Super Developer Kit offers a 50% increase in memory bandwidth, to 102GB/s, compared with its predecessor. We asked, and were told there are no hardware changes. So somehow, in a 25W envelope, using the same hardware, we get 50% more memory bandwidth and significantly more raw compute performance.
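As a sanity check on those headline figures, the arithmetic works out using NVIDIA's announced sparse INT8 TOPS numbers for the updated modules. A quick sketch (the per-module before/after values below are from the Super announcement as we understand it; treat them as illustrative rather than an exhaustive spec table):

```python
# Sanity-check the claimed 57-70% INT8 TOPS uplift and the 50%
# memory bandwidth uplift from NVIDIA's announced figures.
modules = {
    "Jetson Orin Nano 4GB": (20, 34),    # sparse INT8 TOPS: before, after
    "Jetson Orin Nano 8GB": (40, 67),
    "Jetson Orin NX 16GB": (100, 157),
}

for name, (before, after) in modules.items():
    uplift = (after - before) / before * 100
    print(f"{name}: {before} -> {after} INT8 TOPS (+{uplift:.0f}%)")

# Memory bandwidth on the Orin Nano Developer Kit: 68GB/s -> 102GB/s
bw_uplift = (102 - 68) / 68 * 100
print(f"Memory bandwidth: 68 -> 102 GB/s (+{bw_uplift:.0f}%)")
```

The low end of the range (+57%) comes from the Orin NX 16GB and the high end (+70%) from the Orin Nano 4GB, which matches the 57-70% spread quoted above.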
You can see our video if you want to learn more about the NVIDIA Jetson Nano development platform and the AGX platform.
This is perhaps one of the segments in which NVIDIA is underappreciated. The Jetson line, pairing an Arm CPU with an NVIDIA GPU, was the precursor to the NVIDIA GB200 that is so hot right now.
Final Words
On the one hand, this is awesome. Doubling memory bandwidth and bumping performance by a huge amount even on existing platforms, just with a JetPack SDK update, feels, well, super. At the same time, reasonable folks will ask if the hardware was capable of this performance in the first place, why not unlock it from the NVIDIA Jetson Orin Nano Launch in 2022 or the NVIDIA Jetson Orin Nano Developer Kit Launch in 2023? Have these devices been out in the field with the hardware waiting for a firmware switch to unlock massive performance gains? It just feels strange. Still, it is awesome that we do not have to re-buy modules to get the new performance.
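Mechanically, the unlock is exposed as a higher-power mode shipped with the JetPack update, selectable through the stock `nvpmodel` tool on the board. A rough sketch of what that looks like on a Jetson (the mode index for the new top mode varies by board and JetPack release, so the `-m 0` below is only a placeholder; check the query output or `/etc/nvpmodel.conf` on your own device):

```shell
# Query the currently active power mode.
sudo nvpmodel -q

# Switch to a different power mode. The index of the new top-power
# mode depends on the board and JetPack release -- verify it in the
# nvpmodel -q output or /etc/nvpmodel.conf before using it.
sudo nvpmodel -m 0

# Optionally pin clocks to the maximum for the selected mode.
sudo jetson_clocks
```

These are hardware configuration commands for a Jetson board, so run them on the device itself, not a development host.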
With LLMs getting so much better, the next step is to bring that technology to more robotics applications. NVIDIA’s next generation in this space is set to drop the NVIDIA Deep Learning Accelerator (DLA) fixed-function acceleration and adopt more of the tensor core compute we see on NVIDIA’s data center GPUs. The hope is that the NVIDIA Jetson Developer Kits get folks building robotics on NVIDIA platforms, as that space feels like it is getting closer to exploding in popularity. Increasing the performance and halving the price on the current generation feels like a good step in that direction.
They may have been worried about the power or heat envelope before unlocking.
It’s a bit interesting how comparatively understated Nvidia has been about this hardware in the past.
Not so much because they necessarily think it will make them a zillion bucks or be a competitive use of their allocation of bleeding-edge TSMC vs. datacenter parts; but as a defensive move.
For ‘edge’ stuff that is small enough that you don’t just mean ‘a short-depth server that doesn’t require air conditioning and has a locking case’, the seemingly obvious, though likely inferior for ‘AI’ specifically, option is either basically any generic ARM SoC, or a smartphone-oriented SoC (if you need the cell modem anyway).
The latter, in particular, have had some pretty punchy DSP/ISP blocks to support cameras for some years now, and they’ve been rapidly re-tooling those into ‘NPU’/’AI’ functions. They aren’t necessarily better than what Nvidia has; but unless they are abjectly unsuitable for purpose they are likely to spawn a bunch of ‘edge AI’ applications that don’t rely on CUDA or other Nvidia tools; and the history of computing is full of small, cheap, mediocre solutions that gradually moved upmarket and nibbled away at the far classier and more expensive competition.
I assume that Nvidia is comfortably ahead of, say, MediaTek or Qualcomm; but they can’t be entirely happy about ‘edge’ stuff getting developed for Helio or Dimensity SoCs just because those are about the cheapest way to get some ARM cores and a MIPI CSI camera onto a cell network, with the potential to migrate upmarket if MediaTek feels like copy/pasting their NPU into a larger PCIe card, or if Qualcomm feels like emphasizing that targeting their NPU gives you a shared target from relatively cheap cellular SoCs up through their entries into the PC space.