AMD Ryzen Threadripper 3960X Review 24 Cores of Impressive

AMD Ryzen Threadripper 3960X Cover

The AMD Ryzen Threadripper 3960X is nothing short of impressive. This is a 24 core, 48 thread processor with a 3.8GHz base clock and 128MB of L3 cache. That is similar to, if not better than, many of the dual Intel Xeon E5 V3 and V4 workstations currently in use. With the new 3rd generation Threadripper, AMD has introduced a number of features that Intel still cannot match. In our review, we are going to take a look at its performance relative to other workstation options on the market today.

Key stats for the AMD Ryzen Threadripper 3960X: 24 cores / 48 threads with a 3.8GHz base clock and 4.5GHz turbo boost. There is 128MB of L3 cache. The CPU features a 280W TDP. These are $1399 list price parts.

Here is what the lscpu output looks like for an AMD Ryzen Threadripper 3960X:

AMD Threadripper 3960X Lscpu Output

AMD is claiming 140MB of cache but it is important to remember this is really L2 + L3 cache. Still, if you compare the 128MB of L3 cache here in 8x 16MB segments, you get vastly more cache than top-end Intel SKUs like the Intel Xeon W-3275 28-core halo product which has only 38.5MB of L3 cache.
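As a quick sanity check on that 140MB figure, here is a minimal Python sketch of the arithmetic. The 8x 16MB L3 layout and the 38.5MB Xeon W-3275 figure come from the text above. The 512KB of L2 per core is our assumption based on the Zen 2 core design and is not stated in this review. The function and variable names are purely illustrative.

```python
# Cache arithmetic for the Threadripper 3960X (a sketch, not vendor data).
CORES = 24
L2_PER_CORE_KB = 512      # ASSUMPTION: Zen 2 per-core L2, not stated in the review
L3_SEGMENTS = 8           # from the review: 8x 16MB segments
L3_PER_SEGMENT_MB = 16
XEON_W3275_L3_MB = 38.5   # from the review


def total_cache_mb(cores, l2_per_core_kb, l3_segments, l3_per_segment_mb):
    """Return (L2 total, L3 total, combined total) in MB."""
    l2_mb = cores * l2_per_core_kb / 1024
    l3_mb = l3_segments * l3_per_segment_mb
    return l2_mb, l3_mb, l2_mb + l3_mb


l2, l3, total = total_cache_mb(CORES, L2_PER_CORE_KB, L3_SEGMENTS, L3_PER_SEGMENT_MB)
print(f"L2: {l2:.0f}MB, L3: {l3}MB, L2 + L3: {total:.0f}MB")
print(f"L3 versus Xeon W-3275: {l3 / XEON_W3275_L3_MB:.1f}x")
```

Under that L2 assumption, the L2 and L3 totals add up to the 140MB AMD quotes, while the L3 alone is still more than three times what the Xeon W-3275 offers.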

AMD Ryzen Threadripper 3960X In Partial Package

Since the 3rd generation Ryzen Threadripper uses the AMD EPYC 7002 series “Rome” package as a base, it has features such as PCIe Gen4 and DDR4-3200 support. PCIe Gen4 in particular is something Intel simply cannot match, even if the Xeon W-3275 can match its core count.

AMD TRX40 Platform

With the 3rd generation AMD Ryzen Threadripper family we get a new TRX40 platform. The TRX40 brings with it PCIe Gen4. That is a feature Intel lacks in this generation. The CPU to TRX40 interface has gone from a Gen3 x4 link to a Gen4 x8 link effectively quadrupling bandwidth to the chipset.
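To put a rough number on that quadrupling, here is a small Python sketch of the link arithmetic. The per-lane signaling rates (8 GT/s for Gen3, 16 GT/s for Gen4) and the 128b/130b encoding are taken from the PCIe specification, not from this review, and the result ignores packet and protocol overhead, so treat these as theoretical ceilings rather than measured throughput. The helper name is illustrative.

```python
# Approximate one-direction PCIe link bandwidth (theoretical, pre-protocol overhead).
GT_PER_S = {3: 8.0, 4: 16.0}   # ASSUMPTION: per-lane raw rates from the PCIe spec
ENCODING = 128 / 130           # Gen3 and Gen4 both use 128b/130b line encoding


def link_gb_per_s(gen, lanes):
    """Return approximate one-direction bandwidth in GB/s for a PCIe link."""
    return GT_PER_S[gen] * ENCODING * lanes / 8  # divide by 8: bits -> bytes


old = link_gb_per_s(3, 4)   # previous CPU-to-chipset link: Gen3 x4
new = link_gb_per_s(4, 8)   # TRX40 CPU-to-chipset link: Gen4 x8
print(f"Gen3 x4: {old:.2f} GB/s")
print(f"Gen4 x8: {new:.2f} GB/s")
print(f"Ratio:   {new / old:.1f}x")
```

The 4x figure falls out of two independent doublings: twice the lanes and twice the per-lane rate.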

AMD TRX40 Platform

Realistically, while the platform’s quad-channel memory is more similar to Intel’s X299 platform, the I/O capabilities are more like an upgraded version of the Xeon W-3200 series platforms like we saw in our Supermicro X11SPA-T motherboard review. PCIe Gen4 gives AMD a higher I/O bandwidth platform while the LGA3647 Intel platform has additional memory channels and capacity.
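To illustrate the memory channel trade-off, here is a rough Python sketch of theoretical peak memory bandwidth. It assumes a standard 64-bit (8 byte) DDR4 channel. The quad-channel DDR4-3200 figures for Threadripper come from this review. The six channels of DDR4-2933 for the Xeon W-3200 series are our assumption and are not stated here. Real-world bandwidth will be lower than these ceilings.

```python
# Theoretical peak DRAM bandwidth: channels x transfer rate x bytes per transfer.
BYTES_PER_TRANSFER = 8  # ASSUMPTION: standard 64-bit DDR4 channel


def peak_gb_per_s(channels, mega_transfers_per_s):
    """Return theoretical peak memory bandwidth in GB/s."""
    return channels * mega_transfers_per_s * BYTES_PER_TRANSFER / 1000


threadripper = peak_gb_per_s(4, 3200)  # quad-channel DDR4-3200 (from the review)
xeon_w = peak_gb_per_s(6, 2933)        # ASSUMPTION: six-channel DDR4-2933
print(f"Threadripper 3960X: {threadripper:.1f} GB/s")
print(f"Xeon W-3200 series: {xeon_w:.1f} GB/s")
```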

Many commented on our previous articles, in our forums, and on the Internet, lamenting that the 3rd Generation Threadripper family needed new motherboards. Two points address this concern. First, PCIe Gen4 requires higher-quality PCB materials, and that makes the transition a logical point to upgrade platforms. Second, the volume buyers in this market purchase a PC for office work, then replace it on an IT refresh cadence. They are not swapping CPUs into old systems. Given the choice between backward compatibility and game-changing new features, we will take the new features and moving the market forward.

Major Topology Overhaul

First and second-generation Threadripper chips were known for having multiple NUMA nodes, much like dual processor Intel Xeon systems. Some chips, such as the AMD Ryzen Threadripper 2990WX, used a four-die, four-NUMA-node design like the AMD EPYC 7001 “Naples” generation. As you can see, the 2990WX has four NUMA nodes, but only two have direct access to memory while the other two have to hop over Infinity Fabric to memory attached to a different die. This was less than ideal.

AMD Ryzen Threadripper 2990WX Topology

This topology worked. However, it probably would have been better if each die had access to a single memory channel in a 1+1+1+1 rather than a 2+0+2+0 quad-channel configuration. Some things were less than straightforward with this topology.

With the new AMD Ryzen Threadripper 3960X, we see a more AMD EPYC 7002 “Rome” series-like topology. You can compare the below to our AMD EPYC 7402P Review as an example.

AMD Ryzen Threadripper 3960X Topology

With the new I/O die configuration, more or less taken from the EPYC side, one gets four DDR4 channels that connect to the I/O die. The I/O die also has PCIe lanes and the x86 core dies attached to it. As a result, we get something that most OSes see as a single NUMA node.
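If you want to check how many NUMA nodes your own Linux system exposes, here is a minimal Python sketch. It assumes a Linux system that publishes NUMA information under /sys/devices/system/node, and the function name is ours. On other operating systems, or where that directory is absent, it simply reports no nodes rather than guessing.

```python
import re
from pathlib import Path


def numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Return the sorted NUMA node IDs found under the Linux sysfs node directory."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return []  # not Linux, or the kernel is not exposing NUMA info here
    ids = []
    for entry in root.iterdir():
        # Node directories are named node0, node1, ...; skip everything else.
        match = re.fullmatch(r"node(\d+)", entry.name)
        if match and entry.is_dir():
            ids.append(int(match.group(1)))
    return sorted(ids)


nodes = numa_nodes()
print(f"NUMA nodes visible to the OS: {len(nodes)} {nodes}")
```

Based on the topology described above, we would expect a single node on the 3960X in most configurations, versus four on a 2990WX.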

For those with 12-16 core per CPU Intel Xeon workstations, AMD is essentially halving the number of NUMA nodes you need for a similar system.

Test Configuration

Here is the test configuration we used for the Ryzen Threadripper 3960X:

As a quick note here, the retail packaging comes with a case badge, which is nice, but there are two more important bits. First, one gets a torque driver that helps one secure the chip into the socket. Second, one gets a water-cooling adapter ring.

AMD Ryzen Threadripper 3970X And 3960X Additional Box Contents

The new 3rd Generation AMD Ryzen Threadripper family shares a lot with AMD EPYC, so the Threadripper torque tool will work on EPYC sockets as well. While the sockets are different, the physical latching mechanism is very similar.

AMD Ryzen Threadripper 3970X Top And 3960X Bottom
AMD Ryzen Threadripper 3960X Underside And 3970X Top

For our CPU we will be using an AMD Ryzen Threadripper 3960X (24 core/48 thread) that you can see in the CPU-Z shot here:

AMD 3960X CPUz

The AMD Ryzen Threadripper 3960X is a very capable CPU, with turbo speeds that can reach up to 4.5GHz.

Let us continue with Windows performance testing.

10 COMMENTS

  1. I’m hoping they’ll make RDIMMs available for the higher core count Threadrippers. I believe that the 32GB UDIMMs max out at 2666 MHz, so if you want 3200 you’re limited to 128GB. That’s not a lot to feed 128 vcores.

  2. Ryan, have you encountered SCALEMP?

    They are the OEM of Intel Optane RAM impersonation drivers.

    Grab the most consistently low latency drive you can and go.

  3. Would be great if you can test some of W-22xx family too. I’m most curious if their 165W TDP is a real thing as is in case of TR (which is on 280W these days) or this is just Intel and things go way up over that limit too like i9.

  4. Fully vetted/certified ECC(All Kinds) support is maybe not going to be provided for TR/Consumer parts by AMD/Motherboard makers and most of the Pro Software packages that really need either Epyc/Xeon branded parts.

    If AMD creates a Pro Threadripper/MB True Workstation Branding it’s going to have to cost similar to the 7H02 series parts and AMD’s EEE division that’s currently over Epyc/Professional systems will not want any product segment cannibalization. That said if the TR 3000 series 48 and 64 core parts were limited to say 6 memory channels max and some more limited PCIe 4.0 lane counts compared to Epyc/SP3 then maybe that could be offered, but that’s without the full ECC memory types support where AMD/MB vendors spend extra on the proper CPU/MB certification/vetting process that does not come cheaply.

    In not very many more business quarters AMD’s Epyc CPU/Pro GPU Accelerator sales will begin the process of dwarfing AMD’s consumer divisions in the revenue category and AMD’s golden cow will most certainly come from Professional Compute/AI and Server/HPC market revenues where the margins are there and most will pay and write that expense down on their taxes. AMD will have to be more like Intel in that regard in order for AMD to get the revenue stream going to compete with Intel longer term. So AMD has to push its gross margins ever higher or Intel will eat them on the R&D investment side.

    It’s just too unfeasible for AMD/Pro MB partners to pack the full Professional feature sets into any consumer branded/priced parts and lose that needed gross margin and revenue growth that’s necessary to compete with the giant Intel empire. AMD maybe has another 2 years at most to get its market cap and revenues high enough to fend off Intel after that time frame expires and Intel’s proper reply has been fielded. AMD really needs to remain as active as a garden shrew on the R&D side of things and that’s going to need higher margins to fund.

  5. @DiscoShrewzRevenuez
    With its current product lineup, AMD is leaving a couple of gaps open, for Intel to have SKUs that have no direct competitor.
    The first is the strange decision to not have a 16 core TR 3000 part. AMD claims that TR demand is top heavy, but still. This leaves open a gap for the 14 and 18 core X299 parts to fit in, for those who need more memory capacity, more memory bandwidth or more PCI-E lanes than what AM4 can provide.

    The second, IMO more significant oversight is not competing with the Xeon-W lineup, which brings both high clockspeeds and tons of ECC RAM to the table, *at the same time*.
    EPYC cannot compete with Xeon-W, because there are no frequency-optimized Rome parts.
    TR cannot compete because it does not support ECC (L)RDIMMs.

    Maybe the rumored TRX80 platform will be the Xeon-W competitor? With 8 channels of (L)RDIMMs?

  6. @Anon: I somewhat agree about your first point – I got a TR1920X as a cheap entry into a terrific platform (little did I know that it would be taken behind the barn and shot at the first chance). But it’s much less important than your second point, which I totally agree with.

    Even if they put out a higher tier TR with 8 channel RDIMM support as is rumored, they are in a tight spot. It’s really disquieting to see how quickly AMD adopts Intel’s artificial segmentation behavior once they are in the lead. In my opinion, the current TR3 platform doesn’t offer substantial advantages over the first TR platform but the price is huge (at least in terms of mindshare) – breaking compatibility. It should have had 96 PCIe lanes and 4 channel RDIMM support. Then it would have covered a much wider set of use cases and would have provided real justification for the compatibility break. Now it’s just “meh” (with a lousy fan on top – I’m talking about the chipset obviously) while leaving a big market gap open.

    BTW, I guess it’s not too late for AMD to offer frequency optimized Epyc. That would seal the gap from above.

  7. Would be great if you could run the STH Linux benchmarks, especially the Kernel Builds per Hour.

    Here we’re running lots of cross compilings to build embedded Kernels: Buildroot, Xilinx toolchain, Android toolchain, among others, make intensive use of X86 processors. Android build can last 3 hours on the Skylake family processors, even with a SSD. What can we expect from Threadripper ?

  8. I really was hoping to hear more about the WRX80 Chipset for the TR3 by now. I would love to see a more Workstation oriented Chipset/Motherboard for the TR3. A Motherboard with express support for ECC Ram and also a few more PCIe slots and supports all TR3 chips.
