AMD EPYC 7002 CPU Performance
For this exercise, we are using a mix of our legacy Linux-Bench scripts, which show the cross-platform “least common denominator” results we have been using for years, and several results from our updated Linux-Bench2 scripts. Starting with our 2nd Generation Intel Xeon Scalable benchmarks, we are adding a number of our workload testing features to the mix as the next evolution of our platform.
At this point, our benchmarking sessions take days to run and we are generating well over a thousand data points. We are also running workloads for software companies that want to see how their software works on the latest hardware. As a result, this is a small sample of the data we are collecting and can share publicly. Our position is always that we are happy to provide some free data but we also have services to let companies run their own workloads in our lab, such as with our DemoEval service. What we do provide is an extremely controlled environment where we know every step is exactly the same and each run is done in a real-world data center, not a test bench.
We are going to show off a few results, and highlight a number of interesting data points in this article.
Python Linux 4.4.2 Kernel Compile Benchmark
If we take a look at single-socket only benchmarks, here is what the picture looks like:
The AMD EPYC 7402P with 24 cores is priced as an Intel Xeon Gold 5218 competitor. With more cores, it pushes ahead and is over twice as fast. That is a big deal. The Intel Xeon Gold 6209U is a $1350 single-socket-only version of the Intel Xeon Gold 6230 CPU. Here, the AMD EPYC 7402P is about two thirds faster yet 10% less expensive.
As we are going to continue to see, 64 cores of AMD EPYC 7002 are just on a different level than Intel is playing at in socketed CPU designs.
7-zip Compression Performance
7-zip is a widely used compression/decompression program that works cross-platform. We started using the program during our early days with Windows testing. It is now part of Linux-Bench.
Many cores, bigger caches, more memory bandwidth, and a newer microarchitecture all help AMD quite a bit here. Intel asked us to compare a similar number of cores to be fair to their chips. In that spirit, the dual AMD EPYC 7502 configuration has 32 cores per CPU while the dual Intel Xeon Platinum 8280 configuration has only 28 cores per CPU. AMD does not have 28-core SKUs.
On the other hand, the dual AMD EPYC 7502 configuration is faster and uses SKUs with a $2600 list price each, while the Intel Xeon Platinum 8280 is a $10,007 list price SKU. AMD provides four more cores per socket, which helps performance, and it does so at an initial list price discount of around 74%. That is a big deal if you assume Intel Xeon Platinum list pricing is designed so server vendors can utilize large 60% discounts. AMD still has the better platform, but we think here the Xeon Platinum 8280 can be very competitive with an 80% discount off of list price.
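To make the pricing argument concrete, here is a minimal Python sketch of the arithmetic using only the list prices and discount levels quoted above. The function and constant names are our own for illustration, and actual street pricing varies by vendor and deal.

```python
# Illustrative arithmetic only: the list prices and discount levels come from
# the text; the helper names are hypothetical.

EPYC_7502_LIST = 2600.0    # USD list price per CPU, as quoted in the text
XEON_8280_LIST = 10007.0   # USD list price per CPU, as quoted in the text


def list_price_discount(cheaper: float, pricier: float) -> float:
    """Fraction by which `cheaper` undercuts `pricier` (0.74 means 74% less)."""
    return 1.0 - cheaper / pricier


def price_after_discount(list_price: float, discount: float) -> float:
    """Price paid after taking `discount` (a fraction, e.g. 0.60) off list."""
    return list_price * (1.0 - discount)


gap = list_price_discount(EPYC_7502_LIST, XEON_8280_LIST)
print(f"EPYC 7502 list price is {gap:.0%} below the Xeon Platinum 8280 list price")

for discount in (0.60, 0.80):
    street = price_after_discount(XEON_8280_LIST, discount)
    verdict = "below" if street < EPYC_7502_LIST else "above"
    print(f"Xeon 8280 at {discount:.0%} off list: ${street:,.0f} "
          f"({verdict} the EPYC 7502 list price)")
```

Under these assumptions, a 60% discount still leaves the Xeon Platinum 8280 well above the EPYC 7502 list price, while an 80% discount brings it below, which is why the larger discount is the level where the comparison changes.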
NAMD Performance
NAMD is a molecular modeling benchmark developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. More information on the benchmark can be found here. With GROMACS we have been working hard to support AVX-512 and AVX2 supporting AMD Zen architecture. Here are the comparison results for the legacy data set:
We wanted to show more SKUs here. First off, on these CPUs this is highly unoptimized. We are not using AVX2 and AVX-512 and we are using gcc. At the same time, it is important. Software runs in data centers for years. Not everything uses the latest instructions, or even can. There are countless legacy applications out there. With unoptimized code, there is an enormous uplift for the newer generation AMD EPYC 7002 series. The AMD EPYC 7402P and EPYC 7401P have the same core count, yet the new chip is about 25% faster.
Intel delivered huge performance gains with this generation as we saw in our Intel Xeon Gold 5218 Benchmarks and Review earlier this week. AMD is delivering enormous gains at the same core count as well. Here, when the software is not well optimized for either modern architecture, having more brute power to handle the workload helps. That is why the Platinum 8280 with its 28 high-speed cores is faster than the AMD EPYC 7402P, and why we call the AMD EPYC 7702P a 2-socket replacement part. It is simply that much faster.
Next, we are going to continue with our benchmarks starting with OpenSSL.
OpenSSL Performance
OpenSSL is widely used to secure communications between servers. It is an important component of many server stacks. We first look at our sign tests:
Here are the verify results:
OpenSSL is one of the most widely used pieces of software in modern server stacks. If you are reading this, or virtually anything else online about the AMD EPYC 7002 series, you are doing so over an HTTPS connection, very likely one secured using OpenSSL.
The top of the chart is fairly self-explanatory. AMD has more cores, and that leads to better performance. Dual Intel Xeon Platinum 8280s perform slightly better than the AMD EPYC 7702P. The AMD EPYC 7402P is slightly faster than the Intel Xeon Platinum 8260, but they are fairly close. The Intel Xeon Platinum 8260 has a single-socket counterpart, the Intel Xeon Gold 6210U with a $1500 list price. Intel has higher clock speeds while AMD has more cores, cache, and memory bandwidth. The cost of the two chips is very similar.
One other quick note here on power consumption. The dual Intel Xeon Platinum 8280 configuration was using over 230W more than the AMD EPYC 7702P server, and about 200W more than the single AMD EPYC 7742 server here. That can be partially attributed to using four additional DIMMs. We have heard cloud vendors use $6 per watt as their TCO figure for each 1W of gear. Also, we expect the single-socket AMD EPYC server itself to sell for several hundred dollars less than the dual-socket Intel server. We think Intel is still very competitive here with Xeon Platinum 8280 street pricing in the $1000-1200 range.
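As a rough illustration of what the power figures above mean in dollars, the sketch below multiplies each power delta by the $6 per watt TCO figure. It assumes that figure applies as a flat cost per additional watt, which is our reading rather than a vendor formula, and the names are hypothetical.

```python
# Illustrative arithmetic only: the wattage deltas and the $6/W figure come
# from the text; treating $6/W as a flat per-watt cost is an assumption.

TCO_DOLLARS_PER_WATT = 6.0


def power_tco_premium(extra_watts: float,
                      dollars_per_watt: float = TCO_DOLLARS_PER_WATT) -> float:
    """TCO cost attributed to drawing `extra_watts` more than the alternative."""
    return extra_watts * dollars_per_watt


comparisons = (
    ("dual Xeon Platinum 8280 vs. EPYC 7702P server", 230.0),
    ("dual Xeon Platinum 8280 vs. single EPYC 7742 server", 200.0),
)

for label, extra_watts in comparisons:
    premium = power_tco_premium(extra_watts)
    print(f"{label}: {extra_watts:.0f}W extra -> about ${premium:,.0f} in power TCO")
```

Since the text reports "more than 230W", the first result is a lower bound. Part of the gap is attributed to the four additional DIMMs, so not all of it is a CPU difference.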
A Quick Note on OpenSSL and Intel QAT Accelerators
To be fair, Intel has its QuickAssist technology which can accelerate OpenSSL. You can see our Intel QuickAssist Technology and OpenSSL Benchmarks and Setup Tips piece as well as our Intel QuickAssist at 40GbE Speeds: IPsec VPN Testing piece to see the impact. Intel has Lewisburg PCH options with built-in QAT. Instead of making this a universal accelerator for Xeon, Intel’s decision to put the functionality only into higher-spec PCHs has thwarted mainstream server adoption. Intel will point out that QAT is widely adopted in the telecom space where Xeon D and Atom chips can have this built-in.
On the other hand, the Dell PowerEdge, HPE ProLiant, Supermicro Ultra, Lenovo ThinkSystem, and Inspur N-series Xeon Scalable servers we have in the lab all use lower-cost PCH options without QAT. Server vendors that include QAT-enabled PCHs tend to also route extra PCIe lanes from the CPU to the PCH, which has a similar impact to using an add-in accelerator.
Intel can do better here with QAT. However, one needs to integrate a QAT accelerator into their stack. One also needs to ensure their stack supports the correct QAT version. Although we have had working QAT stacks, it is far from a universal plug-and-play experience.
Next, we are going to continue with more benchmarks.