Dual NVIDIA GeForce RTX 3090 NVLink Performance Review


Testing the NVIDIA GeForce RTX 3090

Here is our test configuration:

Let us move on and start our testing with computing-related benchmarks.

NVIDIA RTX 3090 SLI Compute Benchmarks

We are going to show a number of our GPU compute workloads that take advantage of NVLink and multiple GPUs. In the interest of brevity, if you do not see one of our usual sections, it is because we found that benchmark is not multi-GPU aware. Specifically, we did not get multi-GPU scaling in Geekbench, SPECviewperf 13, 3DMark, and Unigine, so we are going to save our readers a page of flipping through results.
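
For readers who want to check scaling on their own hardware, here is a minimal sketch of how multi-GPU scaling can be expressed as a single number. The function name and the example scores are hypothetical and are not measurements from this review; a value near 1.0 means near-linear scaling, while 0.5 on two GPUs means the second card added nothing.

```python
def scaling_efficiency(single_score: float, multi_score: float, gpu_count: int = 2) -> float:
    """Return multi-GPU scaling efficiency; 1.0 means perfect linear scaling."""
    if single_score <= 0 or gpu_count < 1:
        raise ValueError("single_score must be positive and gpu_count at least 1")
    # Perfect scaling would give single_score * gpu_count.
    return multi_score / (single_score * gpu_count)


# Hypothetical benchmark scores, for illustration only.
print(scaling_efficiency(100.0, 190.0))  # 0.95: the workload uses both GPUs well
print(scaling_efficiency(100.0, 100.0))  # 0.5: the workload is not multi-GPU aware
```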

LuxMark

LuxMark is an OpenCL benchmark tool based on LuxRender.

NVIDIA RTX 3090 NVLink LuxMark

In LuxMark, a single NVIDIA RTX 3090 was very close to equaling our previous NVLink configurations of past generation GPUs. Here, our dual NVIDIA GeForce RTX 3090 setup achieved amazing results.

AIDA64 GPGPU

These benchmarks are designed to measure GPGPU computing performance via different OpenCL workloads.

  • Single-Precision FLOPS: Measures the classic MAD (Multiply-Addition) performance of the GPU, otherwise known as FLOPS (Floating-Point Operations Per Second), with single-precision (32-bit, “float”) floating-point data.
  • Double-Precision FLOPS: Measures the classic MAD (Multiply-Addition) performance of the GPU, otherwise known as FLOPS (Floating-Point Operations Per Second), with double-precision (64-bit, “double”) floating-point data.
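
As a rough illustration of why MAD throughput maps to FLOPS, the sketch below estimates a theoretical single-precision peak by counting each multiply-add as two floating-point operations per core per clock. The core count and boost clock are NVIDIA's published reference specifications for the RTX 3090, not values measured in this review, and the function name is our own. AIDA64 reports achieved throughput, which will differ from this paper figure.

```python
def peak_flops(shader_cores: int, clock_hz: float, flops_per_mad: int = 2) -> float:
    """Theoretical peak FLOPS: each core retires one MAD (2 FLOPs) per clock."""
    return shader_cores * clock_hz * flops_per_mad


# Assumed reference specifications (not measured here).
RTX_3090_CUDA_CORES = 10_496
RTX_3090_BOOST_HZ = 1.695e9

single = peak_flops(RTX_3090_CUDA_CORES, RTX_3090_BOOST_HZ)
dual = 2 * single  # Upper bound for two cards, assuming perfect scaling.

print(f"Single RTX 3090: {single / 1e12:.1f} TFLOPS")  # 35.6 TFLOPS
print(f"Dual RTX 3090:   {dual / 1e12:.1f} TFLOPS")    # 71.2 TFLOPS
```
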
NVIDIA RTX 3090 NVLink AIDA64 GPGPU Part 1

The next set of benchmarks from AIDA64 are:

  • 24-bit Integer IOPS: Measures the classic MAD (Multiply-Addition) performance of the GPU, otherwise known as IOPS (Integer Operations Per Second), with 24-bit integer (“int24”) data. This particular data type is defined in OpenCL on the basis that many GPUs are capable of executing int24 operations via their floating-point units.
  • 32-bit Integer IOPS: Measures the classic MAD (Multiply-Addition) performance of the GPU, otherwise known as IOPS (Integer Operations Per Second), with 32-bit integer (“int”) data.
  • 64-bit Integer IOPS: Measures the classic MAD (Multiply-Addition) performance of the GPU, otherwise known as IOPS (Integer Operations Per Second), with 64-bit integer (“long”) data. Most GPUs do not have dedicated execution resources for 64-bit integer operations, so instead, they emulate the 64-bit integer operations via existing 32-bit integer execution units.
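
To make the 64-bit emulation point concrete, here is a minimal sketch of a 64-bit addition built from 32-bit halves with a carry. It only illustrates the general idea; it is not how any particular GPU or driver implements it, and the function name is hypothetical. Note that one 64-bit add becomes several 32-bit operations, which is one reason 64-bit integer rates tend to be much lower.

```python
MASK32 = 0xFFFFFFFF


def add64_via_32(a: int, b: int) -> int:
    """Add two unsigned 64-bit values using only 32-bit-wide pieces."""
    # Split each operand into low and high 32-bit halves.
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    # Add the low halves; bit 32 of the sum is the carry out.
    # (Hardware would take this from a carry flag or a compare.)
    lo_sum = a_lo + b_lo
    carry = lo_sum >> 32
    lo = lo_sum & MASK32

    # Add the high halves plus the carry, wrapping at 32 bits.
    hi = (a_hi + b_hi + carry) & MASK32

    return (hi << 32) | lo


print(hex(add64_via_32(0xFFFFFFFF, 1)))  # 0x100000000: carry moves into the high half
```
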
NVIDIA RTX 3090 NVLink AIDA64 GPGPU Part 2

With AIDA64 GPGPU, the GeForce RTX 3090 NVLink powers through this benchmark and generates dominating performance numbers.

hashcat64

hashcat64 is a password-cracking benchmark that can run an impressive number of different algorithms. We used the Windows version and a simple command of hashcat64 -b. We chart five of those results in the graph. Users who are interested in hashcat can find the download here.

NVIDIA RTX 3090 NVLink Hashcat

Hashcat can put a heavy load on GPUs, and here we see the dual-fan graphics cards have the edge in our results. However, with the cooling system used on our NVIDIA RTX 3090s, Hashcat heat loads are easily handled by Tri-Cooling fan setups.

Let us move on and start our new tests with rendering-related benchmarks.

13 COMMENTS

  1. DirectX 12 supports multi GPU but has to be enabled by the developers

    NVlink was only available on the 2080 Turing cards – so only the high end SKU having it – nothing new. AMD’s solution is what again? Nothing.

    in DX11 games – dual 2080Ti were a viable 4K 120fps setup – which I ran until I replaced them with a single 3090. 4K 144Hz all day in DX11.

    I would imagine someone will put out a hack that fools the system into enabling 2 cards – even if not expressly enabled by the devs

    2 different cards is about as ghetto as it gets and shows the (sub)standards of this site – Patrick’s AMD fanboyism is the hindrance to this site – used to check every day – but now check once a week – and still little new… even the jankiest of yootoob talking heads gets hardware to review.

  2. As an aside, I hope ya’ll get a 3060 or 3080 TI to review.

    The possibility of the crypto throttler affecting other compute workloads has me very worried… and STH’s testing is very compute focused.

  3. Good review Will, ignore the fanboy whimpers; any regular knows how false his claims are.
    Next up A6000?
    Curious how close the 3090 is.

  4. Thanks for the review. It would be awesome to see how much the NVLink matters. I’m particularly interested for ML training – does the extra bandwidth help significantly, v.s. going through PCIe?

  5. One huge issue is the pricing.

    Many see the potential ML / DL Applications of the 3080 and their first idea is to stick them in Servers for professional use. The issue with that is that, in theory, this is a datacenter use of the GPU and thus violates the Nvidia Terms of Use…

    AFAIK only Supermicro sells Servers equipped with the RTX 3080… why they are allowed to do that ? IDK… considering it is supermicro, they might just not care.

    Here comes the pricing issue though. If you are offering your customers the bigger brands such as HPE and Dell EMC you are stuck with equipping your Servers with the high end datacenter GPUs such as the V100S or A100 which cost 6-8 times as much as a RTX 3080 with similar ML performance … on paper.

    Nvidia seems to be shooting themselves in the foot with this, in addition to making my job annoying trying to convince customers that putting a RTX 3080 into their towers should be considered a bad idea.

  6. I’ve got exactly the same 2 cards!
    What specific riser did you use? I’d like to hear your recommendation before I purchase something random ;).

  7. I have two 3090, same brand and connected with the original NVLink.
    We acquired these for a heavy weight VR application done with Unreal Engine 4.26
    We tested all the possible combinations but we couldn’t make them work together in VR. Only one GPU is taking the app. We checked with the Epic guys and they don’t have a clue. We contacted Nvidia technical support and the guys of the call center literally don’t have any page to use for this extreme configuration. We want to use one eye per GPU but it is not working. Does anyone have an idea or know something? Any help is more than welcome !!!!!

  8. One of the problems I have run into with multiple Cards is that they do not seem to increase the overall GPU memory available. I have configurations where there are 2-4 cards in the computers and when I run applications, they only seem to think that I have 12 GB of GPU memory only. Even when 2 are NVLinked. I see the processes spread out amongst the cards, but for large data files, I see that my GPU footprint increase to around 11.5 – 11.7 GB and things slow down when this happens. Thus, GPU memory seems to be the bottle neck that I have been running into (12 GB on the 3080ti and the 2080ti).

  9. While getting cards has been a little difficult, it isn’t that hard to source a pair of the same cards. I currently have 3 x rtx3090 ftw3 ultra cards and 1 3090 from an Alienware.
    I learned long ago while running a pair of gtx1080ti’s, very few dev’s utilized the necessary products to benefit from SLI. One card just sat silently while the other worked. Perhaps they’ve improved. Only time will tell.

  10. I have Asus Strix 3090s (x2) and with NVLink Bridge (4Slot) can’t get Nvidia control panel to see that they are connected, no option to enable SLI/NVlink. Using latest driver 512.59
