If you have been following along with STH, you know we find AMD EPYC fascinating. At the low end of the market, AMD EPYC offers more cores, more cache, more memory capacity, and more PCIe lanes than Intel can offer. At the higher end of the market, starting with the AMD EPYC 7501, Intel in some ways has no competitive answer. The AMD EPYC 7501, at under $3700, can provide 32 cores / 64 threads and 64MB L3 cache in a single socket. The closest Intel can come is 28 cores. If your application is directly tied to the number of cores/threads (e.g. mobile gaming where some services assign one core per user), AMD has something even Intel’s top tier $10,000 Platinum 8180 cannot match. Of course, to keep 32 cores in the power and price bands of the EPYC 7501, AMD had to craft a part with unique characteristics. Today we are going to explore the CPU and see what it gets you.
Key stats for the AMD EPYC 7501: 32 cores / 64 threads, 2.0GHz base, 2.6GHz all core turbo, and 3.0GHz max turbo boost, with a whopping 64MB L3 cache. The CPU features a 170W TDP. Here is the AMD product page with the feature set. This CPU can run in dual socket mode. Here is the lscpu output for the processor:
The key here is that the single socket AMD EPYC 7501 uses a four NUMA node implementation. You can read more about why in our AMD EPYC 7000 Series Architecture Overview for Non-CE or EE Majors article or learn about it in this video:
For those still unfamiliar with the AMD EPYC 7000 series, here are AMD’s key value proposition bullets:
- More cores for less money. This chip retails for under $3700. Intel does not have a competing 32 core part at this time.
- More DRAM capacity. Each AMD EPYC has 8 channel memory (Xeon Scalable has 6 channel). With 16 DIMMs per socket, the AMD EPYC 7000 series can support up to 2TB of RAM per socket. The “M” series Xeon Scalable SKUs can only hit 1.5TB per socket, up from 768GB on a standard CPU, and carry a $3000 price tag for the privilege.
- 128 high speed I/O (PCIe/SATA III) lanes in either single or dual socket mode. In single socket mode, AMD has up to 128 I/O lanes while Intel essentially has 48. You can learn more in our Single Socket AMD EPYC 7000 FAQ Answers to Common Questions.
- EPYC 7000 series is x86. You get an alternative to Intel without needing major code ports or special support. Enterprise software will just work.
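The memory capacity figures in the bullets above follow from simple arithmetic. Here is a minimal sketch, assuming two DIMMs per channel and 128GB DIMMs for the maximum-capacity case. The DIMM size and the function name are our assumptions for illustration and are not taken from AMD or Intel documentation:

```python
def max_capacity_gb(channels, dimms_per_channel, dimm_size_gb):
    """Per-socket DRAM capacity in GB when every slot is populated."""
    return channels * dimms_per_channel * dimm_size_gb

# Assumption for illustration: 128GB DIMMs, two DIMMs per channel.
DIMM_SIZE_GB = 128
DIMMS_PER_CHANNEL = 2

epyc_gb = max_capacity_gb(8, DIMMS_PER_CHANNEL, DIMM_SIZE_GB)    # 8 channels -> 16 DIMMs
xeon_m_gb = max_capacity_gb(6, DIMMS_PER_CHANNEL, DIMM_SIZE_GB)  # 6 channels -> 12 DIMMs

print(f"EPYC 7000 per socket:   {epyc_gb} GB ({epyc_gb / 1024:.1f} TB)")
print(f"Xeon Scalable M-series: {xeon_m_gb} GB ({xeon_m_gb / 1024:.1f} TB)")
```

Under those assumptions, the two extra memory channels account for the entire 2TB versus 1.5TB gap.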
AMD EPYC 7000 series also has a few single socket only SKUs that have extremely low pricing. In the 32-core class, the single socket value leader is the AMD EPYC 7551P. The AMD EPYC 7501 is a dual socket capable chip, but we are providing single socket benchmark figures simply to give a relative performance ranking.
Next, we are going to look at our test setup and configuration before we move on to benchmarks, power consumption, and our market positioning analysis.
AMD EPYC 7501 Test Configuration
By the end of September, we had every AMD EPYC SKU tested on a common Tyan EPYC platform, and work had started on another platform. Here is the base hardware configuration we are using:
- CPU: AMD EPYC 7501
- Server Barebones: Tyan Transport SX TN70A-B8026 (B8026T70AE24HR)
- RAM: 8x 16GB (128GB total) DDR4-2666 RDIMMs (Samsung)
- SSD: 1x Intel DC S3710 400GB SATA SSD
- NIC: 1x Mellanox ConnectX-3 Pro
Key to this system is that it supports 24x U.2 NVMe SSDs without using Broadcom PLX PCIe expanders. That is 96 lanes of PCIe 3.0 directly from a single CPU. One of the key advantages AMD EPYC has is that a single EPYC CPU can use 128x PCIe lanes, the same number as the dual socket configuration. Tyan has responded to this opportunity by offering a single-socket system that can handle 24x NVMe drives and still have I/O available for 10/25/40/50/100GbE.
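The lane math behind the 96-lane figure is straightforward. Here is a rough sketch, assuming each U.2 NVMe drive uses a PCIe 3.0 x4 link and that all 128 lanes are available as PCIe. Both are simplifying assumptions, since a real platform reserves some lanes for onboard devices, so treat the remainder as an upper bound:

```python
TOTAL_LANES = 128      # high speed I/O lanes from a single EPYC 7000 socket
NVME_DRIVES = 24       # front U.2 bays in this chassis
LANES_PER_DRIVE = 4    # assumption: each U.2 NVMe SSD uses a PCIe 3.0 x4 link

nvme_lanes = NVME_DRIVES * LANES_PER_DRIVE
remaining_lanes = TOTAL_LANES - nvme_lanes

print(f"Lanes used by NVMe drives: {nvme_lanes}")
print(f"Lanes left for NICs and other I/O (upper bound): {remaining_lanes}")
```

On these assumptions, up to 32 lanes remain for networking and other I/O, which is why the system does not need PCIe switch chips to feed all 24 drives.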
AMD and Tyan originally suggested that we use a Samsung SSD (as pictured), however, to aid in consistency, we are using our lab standard Intel DC S3710 400GB SSDs.
This is a great system that has worked well over the past several quarters. If you have an existing Intel Xeon E5 V1-V4 installation, it is likely that a single socket AMD EPYC 7000 series machine using NVMe drives can replace one or more dual socket servers from the previous generation. We have seen companies consolidate as many as four dual socket Intel Xeon E5-2620 V1 servers into a single AMD EPYC 7000 series server, which is a great consolidation saving.
Next up we have our AMD EPYC 7501 benchmarks followed by power consumption, market positioning, and our final thoughts.
“Here AMD EPYC is performing well but this is one where the dual port FMA Xeon AVX-512 is a big advantage.”
NAMD Performance on Xeon-Scalable 8180 and 8 GTX 1080Ti GPUs on Pugetsystems.
I don’t know how much ServeTheHome gets from parties to mention the pros of AVX-512; compared with a GPU it is totally useless.
There are a lot of new HPC applications being tuned for AVX-512. Just using GROMACS, the example you are focusing on, here is what one of the lead devs of GROMACS affirmed to you the last time you posted a similar comment about a month ago: https://www.servethehome.com/intel-xeon-gold-6132-benchmarks-and-review/
Even the ARM vendors we work with acknowledge AVX-512 is getting a lot of attention.
Is this review done with Spectre and Meltdown mitigations in place?
Is there any chance of an ffmpeg encode benchmark to be added in the future? It would be very interesting to see how AMD compares against Intel given that video encoding/streaming is such a key workload that demands large scale CPU capacity. I’m more than happy to work together in setting the parameters for such a benchmark and supply quality media for the best comparable results.
David, happy to take a look at what you have. Docker container perhaps?
Are there benchmarks for all of the processors available somewhere? I’m particularly interested in the NAMD and GROMACS results.
“Even the ARM vendors we work with acknowledge AVX-512 is getting a lot of attention.”
I would do the same when I couldn’t get my hands on GPU IP.
($20k) 1x 8180 + 1x Tesla V100 has at least the same speed with these kinds of calculations (fp32 and fp64) as ($50k) 5x 8180. With optimized software the difference will be even bigger.
@patrick @david I use this as a media encoding benchmark for a site I write for: https://nwgat.ninja/thefireescape/ It’s built on FFmpeg and supports both an x264 and an x265 export. The included scripts run it for 5 runs, and I have seen variances thanks to the presence of AVX on newer CPUs.
Daniel, thank you for the input. We have run that. It is not something we can use due to it being Windows-based and the fact it does not scale well. We have a set of criteria a benchmark must meet and that was hitting scaling limits by only 12 cores. For consumer CPUs, it may be useful. It can potentially be useful for multiple streams if logic is built to do QoS in transcoding spinning up multiple containers. Sadly, it is far from making the cut on what we can use at this point. I do want a x265 benchmark but am still looking for a good one.
Ahh, I didn’t know how hard it would be to move to Linux. Shotcut itself is FOSS and is built for Linux as well, and although that particular benchmark was built for Windows, I thought the Python logic would be portable. That said, it seems to scale well past 12 cores (I use a 12 core workstation at home and have seen it exhibit scaling on some of the 16 core systems at work), although that could be something related to the varying versions of FFmpeg (or I just didn’t notice the changes at 16 cores; you’re the expert there, not me XD).
I’d love to start seeing some Computational Fluid Dynamics benchmarks for EPYC. Its memory bandwidth should make the chips competitive…
Could you clarify the differences between the 7501 and 7551? The specs don’t seem to make sense, the 7501 is listed at lower TDP, a higher all core turbo, and a lower price than the 7551. I must be missing something, clearly, but what?
+1 for encoding benchmarks!