Ever since the AMD EPYC 7001 series and Ryzen Threadripper launched in 2017, the big question has been when AMD would finally get into the professional workstation market. While many consumer systems aspire to be workstations as core counts have risen, the workstation market is largely dominated by Lenovo, HP, and Dell. These three vendors have workstations that are often rackmount convertible and are much closer to servers with GPUs than they are to traditional desktops. For over a decade, this market has belonged to Intel. Today, we are taking a look at the top-end processor powering the first non-Xeon professional workstation system from one of those vendors in a long time. This is our AMD Ryzen Threadripper PRO 3995WX review.
AMD Ryzen Threadripper PRO 3995WX Overview
As you may expect, since we internally use the term “WEPYC” (short for Workstation EPYC) for the Threadripper PRO line, Patrick did a video on this one as well. These pieces were a team effort with William working on the upcoming Lenovo ThinkStation P620 review and Patrick doing the video.
As always, we suggest opening the video in a new browser for a better viewing experience.
Key stats for the Threadripper PRO 3995WX are a 64-core/ 128-thread processor with 256MB of L3 cache. We get a 2.7GHz base and a 4.2GHz boost clock. This is a 280W TDP part. What is perhaps most exciting is that we also get 8-channel DDR4 memory support as well as RDIMM/ 3DS/ LRDIMM support (if enabled by the platform). Here is the lscpu output for the part:
As a quick note, some documentation calls this the “AMD Threadripper Pro” series without the Ryzen label. We are using the AMD Ryzen Threadripper PRO 3995WX since that is printed on the CPU and is in the CPU ID model that the chip reports.
Something that we will quickly note here is that although the official specs say 4.2GHz, we did catch a number of screenshots like the below.
Aside from the 8-channel memory, we also get a full PCIe lane configuration, although some lanes are used for the chipset that provides workstation I/O connectivity via a PCIe Gen4 link to the CPU. As an example, if we want high-speed USB, we need an external chipset to enable it, which is why we do not see a lot of USB 3.2 Gen2 ports on AMD EPYC servers.
This is the highest-end processor in the segment. AMD also has 12, 16, and 32 core parts. These are more comparable to single-socket Intel Xeon offerings today. However, we only have the higher-end part to test.
Overall, we basically know this configuration. It is effectively a 64 core AMD EPYC “P” series (single-socket) part with clock speed similar to the AMD Ryzen Threadripper 3990X along with a 280W TDP as we also saw with the AMD EPYC 7H12. Hence why we nicknamed it the WEPYC.
Test Configuration: Lenovo ThinkStation P620
Here is the configuration we are using for the system:
- System: Lenovo ThinkStation P620
- CPU: AMD Ryzen Threadripper PRO 3995WX
- RAM: 8x 32GB DDR4-3200 (256GB Total) RDIMMs (2x 16GB as configured by Lenovo)
- GPU: NVIDIA Quadro RTX 6000
- NIC: Onboard 10GbE + NVIDIA ConnectX-6 200GbE PCIe Gen4 single-port add-in
- Storage: Western Digital SN720
- OS SSD: Intel DC P3710 400GB
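As a rough back-of-the-envelope check on what eight channels of DDR4-3200 can offer, here is a minimal sketch of the theoretical peak memory bandwidth. It assumes a standard 64-bit (8-byte) data path per DDR4 channel and ignores real-world efficiency, so measured figures will be lower. The function name is ours, purely for illustration.

```python
def peak_memory_bandwidth_gbs(channels: int,
                              mega_transfers_per_s: int,
                              bytes_per_transfer: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s (decimal gigabytes).

    Assumes each channel moves `bytes_per_transfer` bytes per transfer
    (8 bytes for a 64-bit DDR4 channel) with no protocol overhead.
    """
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000


# Eight channels of DDR4-3200, as in the test configuration above.
eight_channel = peak_memory_bandwidth_gbs(channels=8, mega_transfers_per_s=3200)

# A quad-channel DDR4-3200 setup for comparison.
four_channel = peak_memory_bandwidth_gbs(channels=4, mega_transfers_per_s=3200)

# Installed capacity for 8x 32GB DIMMs.
total_capacity_gb = 8 * 32

print(f"8-channel DDR4-3200 peak: {eight_channel:.1f} GB/s")
print(f"4-channel DDR4-3200 peak: {four_channel:.1f} GB/s")
print(f"Installed capacity: {total_capacity_gb} GB")
```

Under these assumptions, the eight-channel configuration has twice the theoretical bandwidth of a quad-channel one at the same memory speed.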
The Lenovo ThinkStation P620 borrows its design from the P520 designed for chips like the Intel Xeon W-2295. The big difference is that we get AMD EPYC support which means we get PCIe Gen4 in a workstation before Intel is offering the feature.
There are a few nice touches in the system. For example, flanking the CPU on either side are two sets of 4x DIMM slots. Lenovo has actively cooled covers on these DIMM slots to ensure that the DDR4-3200 memory stays cool. At the same time, we do not normally see this attention to cooling in EPYC servers so it looks a bit different versus what we were expecting.
Lenovo also has a massive heatsink for the CPU with two fans and a number of heat pipes. Cooling a 280W CPU can be done on air, but keeping the system relatively quiet while also dealing with the practical limitation of chassis height is impressive.
In our test configuration we had a single NVIDIA Quadro RTX 6000, but one can add a second dual-slot GPU as well. We did not have a matching Quadro RTX 6000, so we used a 200GbE NIC. William has our full review of the Lenovo ThinkStation P620 coming, but you can also see more views in the video linked above.
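To illustrate why a 200GbE NIC is a reasonable way to exercise a PCIe Gen4 slot, here is a rough sketch comparing theoretical slot bandwidth against the Ethernet line rate. It assumes the NIC sits in an x16 slot (the lane width is our assumption, not something stated above) and uses the standard 8 GT/s (Gen3) and 16 GT/s (Gen4) per-lane rates with 128b/130b encoding. Packet and protocol overheads are ignored, and the helper names are hypothetical.

```python
# Per-lane raw signaling rates in gigatransfers per second.
GT_PER_S = {3: 8.0, 4: 16.0}

# PCIe Gen3 and Gen4 both use 128b/130b line encoding.
ENCODING_EFFICIENCY = 128 / 130


def pcie_bandwidth_gbs(gen: int, lanes: int) -> float:
    """Theoretical one-direction PCIe bandwidth in GB/s, before packet overhead."""
    return GT_PER_S[gen] * ENCODING_EFFICIENCY * lanes / 8


def ethernet_line_rate_gbs(gigabits_per_s: float) -> float:
    """Ethernet line rate converted from Gbit/s to GB/s."""
    return gigabits_per_s / 8


nic = ethernet_line_rate_gbs(200)
for gen in (3, 4):
    slot = pcie_bandwidth_gbs(gen, lanes=16)  # x16 is an assumption
    verdict = "enough" if slot >= nic else "not enough"
    print(f"PCIe Gen{gen} x16: {slot:.1f} GB/s vs 200GbE at {nic:.1f} GB/s -> {verdict}")
```

Under these assumptions, a Gen3 x16 slot falls well short of a single 200GbE port's line rate, while a Gen4 x16 slot has headroom.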
Next, we are going to look at performance and power consumption before getting to our final words.
Got mine since Oct end and been using it with 512gb of micron ram. Easily beating a 2P E5 2670v3 by a factor of 5 to 10 on my compute task.
Small form factor and hardly any noise or heat that I can keep under my desk. Have no complaints about it
Any comments on the recently sighted Asus and Gigabyte WRX80 motherboards?
The Threadripper Pro 3995WX is arguably AMD’s fastest multicore chip by being clearly higher clocked than the Epyc 7H12 while its eight memory channels can give it the edge over the vanilla Threadripper 3990X. We’re probably nearing the Threadripper 5000 series with CES looming, which would leave the next Threadripper Pro probably arriving a few months later, after Epyc Milan is formally unveiled (Feb?). I would imagine that OEMs are waiting on the next generation before widely adopting this platform.
I do wish the Threadripper Pro offered some form of overclocking support. While stability is important, many of the tasks that fall into the workstation segment (CPU based video editing, rendering) can weigh that trade off. Even tuning where the base clock is increased but the turbo is lowered would be a viable trade off. This keeps clocks within specification but power draw certainly would not be. Similarly, system cooling has to keep up with the added power draw. Though for many use-cases that is a viable trade off. Another area where overclocking would pay off for some users is with memory and being able to climb to DDR4-3600. This not only improves memory bandwidth but also several of the on-die buses run off of this clock for improved performance. Moving to DDR4-4000 invokes a bus ratio change so while raw memory bandwidth increases, overall performance can actually decrease due to lowered clocks elsewhere. I’d be curious if Asus or Gigabyte adopt Threadripper Pro if they’ll support higher memory clocks but it does make sense that an OEM like Lenovo would strictly stick to official specs.
Lastly, one feature I was hoping AMD would enable would be raw Infinity Fabric support over PCIe slots to various Radeon graphics cards. AMD has started to leverage Infinity Fabric links on their high-end GPUs (Vega 20, MI100). This would not only provide more bandwidth between the CPU and GPU for compute focused workloads but fully coherent memory addressability between the two. That is a huge latency benefit which has traditionally been a bottleneck.
With regards to the P620, I do wish more of the PCIe lanes were put to use in the system. The Threadripper Pro platform with the WRX80 chipset has 136 open lanes available (120 from the CPU, 16 from the chipset). All the slots should be 16x as there are lanes to spare in addition to those used by storage. Various peripherals like audio can hang off of the chipset at lower PCIe lane widths and data rates without compromising their functionality. A single Ethernet connector is surprising even if it supports 10 Gbit. Several workstation use-cases I’ve dealt with have them sit on two separated networks.
Kevin G: the chip overclocks itself. It’s called turbo mode. And since this is the highest configuration, I also think AMD put there the highest capable chip. No need ever, ever to tweak it. AMD did that for you already.
Curious to see if ASRock pump out a ‘server’ board for this too.