AMD EPYC 7742 Benchmarks
For this exercise, we are using our legacy Linux-Bench scripts, which give us the cross-platform “least common denominator” results we have been using for years, as well as several results from our updated Linux-Bench2 scripts. Starting with our 2nd Generation Intel Xeon Scalable benchmarks, we are adding a number of our workload testing features to the mix as the next evolution of our platform.
At this point, our benchmarking sessions take days to run and we are generating well over a thousand data points. We are also running workloads for software companies that want to see how their software works on the latest hardware. As a result, this is a small sample of the data we are collecting and can share publicly. Our position is always that we are happy to provide some free data but we also have services to let companies run their own workloads in our lab, such as with our DemoEval service. What we do provide is an extremely controlled environment where we know every step is exactly the same and each run is done in a real-world data center, not a test bench.
We are going to show off a few results, and highlight a number of interesting data points in this article.
Python Linux 4.4.2 Kernel Compile Benchmark
This is one of the most requested benchmarks for STH over the past few years. The task is simple: we take the Linux 4.4.2 kernel from kernel.org and a standard configuration file, then build the standard auto-generated configuration utilizing every thread in the system. We are expressing results in terms of compiles per hour to make the results easier to read:
Here we see one of the more eye-catching charts. The top-of-the-line dual-socket Intel Xeon Platinum 8280 system is fourth on this chart. That is partially due to having two sockets instead of one, as well as the larger caches. One will note that those are two themes we highlighted in our introduction.
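For readers who want to reproduce the metric, here is a minimal sketch of how a compiles-per-hour figure can be derived from a timed build. It is not our Linux-Bench script. The function names and the `source_dir` parameter are hypothetical, and it assumes the kernel tree is already extracted and configured.

```python
import os
import subprocess
import time


def compiles_per_hour(elapsed_seconds: float) -> float:
    """Convert the wall-clock time of one full build into compiles per hour."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return 3600.0 / elapsed_seconds


def time_kernel_build(source_dir: str) -> float:
    """Time a single `make` run in source_dir using every hardware thread.

    Assumption: source_dir holds an extracted, already-configured kernel tree.
    Returns the elapsed wall-clock time in seconds.
    """
    jobs = os.cpu_count() or 1  # fall back to 1 if the count is unavailable
    start = time.perf_counter()
    subprocess.run(["make", f"-j{jobs}"], cwd=source_dir, check=True)
    return time.perf_counter() - start
```

As a purely illustrative conversion, a build that finishes in 90 seconds works out to 40 compiles per hour. Expressing the result as a rate means higher is better, which is easier to read on a chart than raw seconds.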
c-ray 1.1 Performance
We have been using c-ray for our performance testing for years now. It is a ray tracing benchmark that is extremely popular to show differences in processors under multi-threaded workloads. We are going to use our 8K results which work well at this end of the performance spectrum.
Typically, AMD does well here. However, we see the dual Intel Xeon Platinum 8280 setup get relatively close to the single AMD EPYC 7742. If you saw our Crushing Cinebench V5 AMD EPYC 7742 World Record Edition, this is a similar result.
7-zip Compression Performance
7-zip is a widely used compression/decompression program that works cross-platform. We started using the program during our early days with Windows testing. It is now part of Linux-Bench.
Moving to compression and decompression, we see the AMD EPYC 7642 48-core part fall significantly behind. As an aside, that is another 225W TDP part from AMD. Here, the extra cores of the EPYC 7742 more than compensate for the clock speed boost that the additional TDP per core gives the 48-core part.
NAMD is a molecular modeling benchmark developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. With GROMACS, we have been working hard to support AVX-512 as well as AVX2 on the AMD Zen architecture. Here are the comparison results for the legacy data set:
This is essentially code that is scaling to multiple cores but is not optimized for new instructions. A major question we get is what happens if you are using AMD EPYC with existing workloads. We generally advise that if you have a legacy application running in a VM you can move it to an EPYC host, and you will not notice a performance hit. Instead, you are likely to have a bigger pool of resources to work with.
OpenSSL is widely used to secure communications between servers. This is an important library in many server stacks. We first look at our sign tests:
Here are the verify results:
Here we added quad Intel Xeon Gold 6242 CPUs. Together, 64 high-frequency Intel Xeon cores with 24 DDR4 channels and 600W of combined TDP are able to beat the AMD EPYC 7742 here. One can see the relative margin that the quad CPU solution wins by, and we will let our readers judge if that is worth the trade-off. This is a good example of what we mean when we say that Intel needs four socketed CPUs to equal AMD's 64 cores per socket. Frankly, this is one of Intel’s strongest plays versus the single-socket AMD EPYC. It has the same number of cores and the same total memory capacity. The PCIe lane count is higher, but AMD still has the PCIe bandwidth advantage even with the single CPU.
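The arithmetic behind that comparison can be sketched in a few lines. The per-socket figures below come from public spec sheets rather than from our testing, so treat them as assumptions: 16 cores, six DDR4 channels, 150W TDP, and 48 PCIe Gen3 lanes per Xeon Gold 6242, versus 64 cores, eight DDR4 channels, 225W TDP, and 128 PCIe Gen4 lanes for a single EPYC 7742. The `Platform` class is purely illustrative, and the bandwidth figure is the raw signalling rate summed over CPU lanes, not measured throughput.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    name: str
    sockets: int
    cores_per_socket: int
    memory_channels_per_socket: int
    tdp_watts_per_socket: int
    pcie_lanes_per_socket: int
    pcie_gt_per_lane: int  # raw signalling rate per lane in GT/s (Gen3 = 8, Gen4 = 16)

    @property
    def total_cores(self) -> int:
        return self.sockets * self.cores_per_socket

    @property
    def total_memory_channels(self) -> int:
        return self.sockets * self.memory_channels_per_socket

    @property
    def total_tdp_watts(self) -> int:
        return self.sockets * self.tdp_watts_per_socket

    @property
    def total_pcie_lanes(self) -> int:
        return self.sockets * self.pcie_lanes_per_socket

    @property
    def raw_pcie_gt(self) -> int:
        """Aggregate raw signalling rate across all CPU PCIe lanes, in GT/s."""
        return self.total_pcie_lanes * self.pcie_gt_per_lane


# Assumed spec-sheet values, not measurements from this review.
quad_gold_6242 = Platform("4x Xeon Gold 6242", 4, 16, 6, 150, 48, 8)
single_epyc_7742 = Platform("1x EPYC 7742", 1, 64, 8, 225, 128, 16)

for p in (quad_gold_6242, single_epyc_7742):
    print(p.name, p.total_cores, p.total_memory_channels,
          p.total_tdp_watts, p.total_pcie_lanes, p.raw_pcie_gt)
```

Under these assumptions, the quad-socket system has more lanes (192 versus 128), while the single EPYC has the higher aggregate raw rate because each Gen4 lane signals at twice the Gen3 rate.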
UnixBench Dhrystone 2 and Whetstone Benchmarks
Some of the longest-running tests at STH are the venerable UnixBench 5.1.3 Dhrystone 2 and Whetstone results. They are certainly aging. However, we constantly get requests for them, and many angry notes when we leave them out. UnixBench is widely used, so we are including it in this data set. Here are the Dhrystone 2 results:
Here are the Whetstone results:
Here the dual Intel Xeon Platinum 8280 registers a split with the AMD EPYC 7742. That is still an impressive result for the single EPYC.
Chess is an interesting use case since it has almost unlimited complexity. Over the years, we have received a number of requests to bring back chess benchmarking. We have been profiling systems and are ready to start sharing results:
Here we have a very close EPYC 7742 and dual Platinum 8280 result. One of the reasons that the Platinum 8280 seems further behind is that it is a relatively muted upgrade over the previous-generation Platinum 8180. The mainstream Xeon Gold segment saw a big bump in core counts with this generation, but the maximum core count stayed the same at 28 cores.
STH STFB KVM Virtualization Testing
One of the other workloads we wanted to share is from one of our DemoEval customers. We have permission to publish the results, but the application itself being tested is closed source. This is a KVM virtualization-based workload where our client is testing how many VMs it can have online at a given time while completing work under the target SLA. Each VM is a self-contained worker.
This is a bit hard to read with this many results, but we wanted to add the dual-socket results here. One can see that the single NUMA node AMD EPYC 7742 with more cores is doing better throughout the range than the dual Platinum 8280 setup. One can also see the dual EPYC 7742 setup is in another league versus dual Xeon Platinum 8280s.
AMD simply has an awesome platform for virtualization as one can more effectively utilize bigger pools of RAM and cores per NUMA node. Many of the HCI vendors such as Dell EMC with VMware vSAN have been focusing efforts on the AMD EPYC 7002 series.
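The metric in the KVM test above, how many VMs can stay online while work still completes under the target SLA, can be illustrated with a short sketch. The customer application is closed source, so this is not their harness. The function name, the `(vm_count, completion_seconds)` input format, and the `sla_seconds` parameter are all hypothetical.

```python
from typing import Iterable, Optional, Tuple


def max_vms_within_sla(
    runs: Iterable[Tuple[int, float]], sla_seconds: float
) -> Optional[int]:
    """Return the largest VM count whose completion time met the SLA.

    runs: (vm_count, completion_seconds) pairs, one per test run
          (a hypothetical format, not the customer's actual output).
    Returns None if no run met the SLA.
    """
    passing = [vms for vms, seconds in runs if seconds <= sla_seconds]
    return max(passing) if passing else None
```

A host with more cores and memory per NUMA node would be expected to push this number higher, which is what the chart above reflects.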
Next, we are going to look at the AMD EPYC 7742 market positioning before moving to our final words.