AMD EPYC 7501 Benchmarks and Review 32 Cores Per Socket

AMD EPYC 7501 Benchmarks

For this exercise, we are using our legacy Linux-Bench scripts, which give us the cross-platform “least common denominator” results we have been using for years, as well as several results from our updated Linux-Bench2 scripts. At this point, our benchmarking sessions take days to run and generate well over a thousand data points. We are also running workloads for software companies that want to see how their software works on the latest hardware. As a result, this is a small sample of the data we are collecting and can share publicly. Our position is always that we are happy to provide some free data, but we also have services that let companies run their own workloads in our lab, such as our DemoEval service. What we do provide is an extremely controlled environment where we know every step is exactly the same and each run is done in a real-world data center, not on a test bench.

Python Linux 4.4.2 Kernel Compile Benchmark

This is one of the most requested benchmarks for STH over the past few years. The task is simple: we have a standard configuration file and the Linux 4.4.2 kernel from kernel.org, and we make the standard auto-generated configuration utilizing every thread in the system. We express results in compiles per hour to make them easier to read.
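As a rough illustration of the compiles-per-hour metric, the conversion from a single build's wall-clock time can be sketched as follows. This is a minimal sketch, not the Linux-Bench script itself: the function names are hypothetical, the 120-second timing is an invented example rather than a measured result, and it assumes the build is driven by `make -jN` with one job per hardware thread.

```python
import os


def compiles_per_hour(elapsed_seconds: float) -> float:
    """Convert one build's wall-clock time into a compiles-per-hour rate."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return 3600.0 / elapsed_seconds


def make_command(threads=None):
    """Build a make invocation that uses every thread (hypothetical helper).

    Assumption: one make job per hardware thread reported by the OS.
    """
    jobs = threads or os.cpu_count() or 1
    return ["make", f"-j{jobs}"]


# Hypothetical timing, not a measured result: a build that takes 120 seconds.
print(compiles_per_hour(120.0))  # 30.0
print(make_command(64))          # ['make', '-j64']
```

Expressing the result as a rate means higher is better, which keeps the chart consistent with the other throughput-style results.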

AMD EPYC 7501 Linux Kernel Compile Benchmark

Having a massive number of cores and 8-channel DDR4-2666 memory bandwidth yields impressive performance. The Intel Xeon Gold 6152 is in the same price band but has only 22 cores. Even with higher clocks, it cannot keep up with the 32-core AMD EPYC 7501.

c-ray 1.1 Performance

We have been using c-ray for our performance testing for years now. It is a ray tracing benchmark that is extremely popular for showing differences between processors under multi-threaded workloads. We are going to use our new Linux-Bench2 8K render since it teases out more differences in this CPU segment than our older 4K results.

AMD EPYC 7501 C Ray 8K Benchmark

AMD EPYC has great cache performance and plenty of cores. As a result, the entire line performs well on our 8K render test. Here there is an appreciable gain by moving from the AMD EPYC 7451 to the AMD EPYC 7501 with 8 more cores.

7-zip Compression Performance

7-zip is a widely used compression/decompression program that works cross-platform. We started using the program during our early days with Windows testing. It is now part of Linux-Bench.

Midrange Single Socket 7zip Benchmark Results Xeon Gold And AMD EPYC

Again, we see generally great performance. For those comparing a dual AMD EPYC 7281 setup to a single 32-core AMD EPYC 7501, having fewer NUMA nodes (4 vs. 8) has an appreciable impact on performance. Part of AMD EPYC’s allure is having lots of cores per socket.

NAMD Performance

NAMD is a molecular modeling benchmark developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. More information on the benchmark can be found here. We are going to augment this with GROMACS in the next-generation Linux-Bench in the near future. With GROMACS, we have been working hard to support both Intel’s Skylake AVX-512 and the AVX2-supporting AMD Zen architecture. Here are the comparison results for the legacy data set:

AMD EPYC 7501 NAMD Benchmark

AMD EPYC floating point performance is generally very good when AVX-512 is not involved. Here we see the AMD EPYC 7501 perform well simply due to its 32 cores.

Sysbench CPU test

Sysbench is another one of those widely used Linux benchmarks. We specifically are using the CPU test, not the OLTP test that we use for some storage testing.
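For readers who want to try a comparable run, a Sysbench CPU invocation that uses every hardware thread can be sketched as below. This is only a sketch under stated assumptions: it uses the sysbench 1.0+ command-line syntax, the helper name is hypothetical, and the `--cpu-max-prime` value of 20000 is an illustrative choice, not the parameter used for the charts here.

```python
import os
from typing import List, Optional


def sysbench_cpu_command(threads: Optional[int] = None,
                         max_prime: int = 20000) -> List[str]:
    """Build a sysbench CPU-test command line (hypothetical helper).

    Assumptions: sysbench 1.0+ syntax; max_prime is illustrative only.
    """
    if threads is None:
        # Default to every hardware thread the OS reports.
        threads = os.cpu_count() or 1
    return [
        "sysbench", "cpu",
        f"--threads={threads}",
        f"--cpu-max-prime={max_prime}",
        "run",
    ]


# Example: a 32-core, 64-thread part such as the EPYC 7501.
print(" ".join(sysbench_cpu_command(threads=64)))
```

The resulting list can be passed to `subprocess.run` on a machine with sysbench installed. Building the command separately keeps the thread count explicit when comparing systems.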

AMD EPYC 7501 Sysbench CPU Benchmark

On our Sysbench CPU test, the Intel Xeon Gold 6152 pulls out a victory, but only by a slim margin.

OpenSSL Performance

OpenSSL is widely used to secure communications between servers. This is an important component in many server stacks. We first look at our sign tests:

AMD EPYC 7501 OpenSSL Sign Benchmark

And the verify results:

AMD EPYC 7501 OpenSSL Verify Benchmark

Here we see solid results from the AMD EPYC 7501. OpenSSL is a foundational element in today’s infrastructure so it is great to see a solid performance here.

UnixBench Dhrystone 2 and Whetstone Benchmarks

Some of the longest-running tests at STH are the venerable UnixBench 5.1.3 Dhrystone 2 and Whetstone results. They are certainly aging. However, we constantly get requests for them, and many angry notes when we leave them out. UnixBench is widely used so we are including it in this data set. Here are the Dhrystone 2 results:

Midrange Single Socket Unixbench Dhrystone 2 Benchmark Xeon Gold And AMD EPYC

And the Whetstone results:

Midrange Single Socket Unixbench Whetstone Benchmark Xeon Gold And AMD EPYC

EPYC generally does well here and the EPYC 7501 is no exception. Some of the Intel models score big on single-threaded performance in these charts, although it is hard to see given the scale. AMD EPYC does well when it can use all of its cores.

GROMACS STH Small AVX2/AVX-512 Enabled

We have a small GROMACS molecule simulation we previewed in the first AMD EPYC 7601 Linux benchmarks piece. In Linux-Bench2 we are using a “small” test for single and dual socket capable machines. Our medium test is more appropriate for higher-end dual and quad socket machines. Our GROMACS test will use the AVX-512 and AVX2 extensions if available.

AMD EPYC 7501 GROMACS STH Small Benchmark

AVX-512 helps dual port FMA Intel CPUs (Gold 6100 and Platinum 8100) perform extremely well here. On the other hand, Gold 5100 and Silver 4116 CPUs, even with AVX-512, cannot match the performance of the AMD EPYC CPUs. Here AMD EPYC is performing well but this is one where the dual port FMA Xeon AVX-512 is a big advantage.

Chess Benchmarking

Chess is an interesting use case since it has almost unlimited complexity. Over the years, we have received a number of requests to bring back chess benchmarking. We have been profiling systems and are ready to start sharing results:

AMD EPYC 7501 Chess Benchmark

Again, simply having more cores is helpful here. The AMD EPYC 7501 is not quite able to match the other two 32-core AMD EPYC CPUs, but it is also the least expensive. One can see that AMD is doing its product positioning well here.

Next, we are going to look at power consumption of the AMD EPYC 7501, market positioning, and some of our final thoughts.

13 COMMENTS

  1. “Here AMD EPYC is performing well but this is one where the dual port FMA Xeon AVX-512 is a big advantage.”

    NAMD Performance on Xeon-Scalable 8180 and 8 GTX 1080Ti GPUs on Pugetsystems.

    I don’t know how much Servethehome gets from parties to mention the pro’s of AVX-512, compared with a GPU it is totally useless.

  2. Is there any chance of an ffmpeg encode benchmark to be added in the future? It would be very interesting to see how AMD compares against Intel given that video encoding/streaming is such a key workload that demands large scale CPU capacity. I’m more than happy to work together in setting the parameters for such a benchmark and supply quality media for the best comparable results.

  3. Are the benchmarks for all of the processors available somewhere? I’m particularly interested in the NAMD and GROMACS results.

    Thanks!

  4. “Even the ARM vendors we work with acknowledge AVX-512 is getting a lot of attention.”

    I would do the same when I couldn’t get my hands on GPU IP.
    ($20k)1x 8180 + 1x Tesla V100 has at least the same speed with these kind of calculations (fp32 and fp64) as ($50k)5x 8180. With optimized software the difference will be even bigger.

  5. @patrick @david I use this as a media encoding benchmark for a site I write for. https://nwgat.ninja/thefireescape/ It’s built on FFMPEG and supports both an x264 and x265 export, the included scripts run it for 5 runs and I have seen variances thanks to the presence of AVX on newer cpu’s

  6. Daniel, thank you for the input. We have run that. It is not something we can use due to it being Windows-based and the fact it does not scale well. We have a set of criteria a benchmark must meet and that was hitting scaling limits by only 12 cores. For consumer CPUs, it may be useful. It can potentially be useful for multiple streams if logic is built to do QoS in transcoding spinning up multiple containers. Sadly, it is far from making the cut on what we can use at this point. I do want a x265 benchmark but am still looking for a good one.

  7. Ahh I didn’t know how hard it would be to move to linux, shotcut itself is foss and is built for linux as well although that particular benchmark was built for windows I thought the python logic would be portable. That said it seems to scale well past 12 cores(I use a 12 core workstation at home and have seen it exhibit scaling on some of the 16 core systems at work) Although that could be something related to the varying versions of ffmpeg(or I just didn’t notice the changes at 16 cores you’re the expert there not me XD)

  8. I’d love to start seeing some Computational Fluid Dynamics benchmarks for EPYC. Its memory bandwidth should make the chips competitive…

  9. Could you clarify the differences between the 7501 and 7551? The specs don’t seem to make sense, the 7501 is listed at lower TDP, a higher all core turbo, and a lower price than the 7551. I must be missing something, clearly, but what?
