Today we have our AMD EPYC 7351P Linux benchmarks and review. This is part of a much larger series as many of our longtime readers may have seen with our Intel Xeon Silver 4114 benchmarks earlier this week. Make no mistake, the AMD EPYC 7351P performance is very good. So much so that it is going to make some of our readers feel a bit uncomfortable about purchases they may have been planning to make.
Background: Heavy Legwork to Build a Useful Dataset
At STH, we are working on a major project. We have over $100,000 worth of current generation AMD EPYC and Intel Xeon Scalable CPUs in the lab, along with several racks and 6kW dedicated to the project in the data center. We have the CPUs in-house for over 40% of all the single and dual socket AMD EPYC and Intel Xeon Scalable configurations. We have already invested over $250,000 in this project, and we will be detailing it further soon. Perhaps one of the more interesting areas across all of these different CPUs is AMD EPYC’s single socket parts. There are three such EPYC SKUs, the 7351P, 7401P and 7551P, that are identical to their dual socket counterparts except in two areas. First, they are single socket only and cannot be used in dual socket configurations. Second, they are priced at an enormous discount. Today we are going to publish our first EPYC numbers for a single socket only part, the AMD EPYC 7351P.
Up to this point, the vast majority of benchmarks found online have been ad hoc at best in their comparisons. Running so many servers to generate data sets is expensive, and we have bought CPUs and systems to accelerate our testing schedule. Beyond that, we also have an extremely controlled data center environment where we monitor temperature and humidity, as they are key inputs to overall server performance and power consumption. By scaling up our efforts, we are able to quickly provide a complete comparison set.
To compare against the AMD EPYC 7351P today, we have other AMD EPYC CPUs in the sub-$1000 price range. We also have the entire Xeon Silver range represented in both single and, where applicable, dual socket configurations. These are Intel’s offerings in the sub-$1000 segment (save the Bronze 3104 and 3106 that we already covered). Today is when the industry moves from ad hoc one-to-one comparisons to actionable comparisons. Our goal is that as we release even more of our giant data set, buyers will be able to make informed decisions looking at incremental price and performance.
Key stats for the AMD EPYC 7351P: 16 cores / 32 threads, 2.4GHz base and 2.9GHz turbo with a whopping 64MB L3 cache. The CPU features a 170W TDP. Here is the AMD product page with the feature set. Here is the lscpu output for the processor:
Since the AMD EPYC architecture is going to be new for many, we wanted to provide that CPU feature set output. Although you may see 8MB L3 cache in the lscpu output, the chip actually carries a staggering 64MB L3 cache. That means that this ~$750 CPU has more L2+L3 cache than Intel’s top of the line Xeon Scalable 28 core part. AMD achieves this by using four dies per package instead of Intel’s single die design, which you can read about in our AMD EPYC and Intel Xeon Scalable Architecture Ultimate Deep Dive.
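To make the cache comparison concrete, here is a minimal sketch of the arithmetic. The per-core and per-core-complex sizes are assumptions taken from public spec sheets (8 core complexes with 8MB L3 each and 512KB L2 per core on EPYC, 1MB L2 and 1.375MB L3 per core on Xeon Scalable), not values read from the lscpu output in this review. The function names are purely illustrative.

```python
# Cache totals in MB. All per-core / per-complex sizes below are assumptions
# from public spec sheets, not measurements from this review.

def epyc_cache_mb(cores, ccx_count=8, l3_per_ccx_mb=8.0, l2_per_core_mb=0.5):
    """Return (l2_total, l3_total) for a four-die EPYC package.

    L3 is counted per core complex, which is why a tool that reports the
    L3 visible to one core can show 8MB while the package holds 64MB.
    """
    return cores * l2_per_core_mb, ccx_count * l3_per_ccx_mb


def xeon_sp_cache_mb(cores, l2_per_core_mb=1.0, l3_per_core_mb=1.375):
    """Return (l2_total, l3_total) for a single-die Xeon Scalable part."""
    return cores * l2_per_core_mb, cores * l3_per_core_mb


epyc_l2, epyc_l3 = epyc_cache_mb(cores=16)      # EPYC 7351P
xeon_l2, xeon_l3 = xeon_sp_cache_mb(cores=28)   # 28 core Xeon Scalable

print(f"EPYC 7351P   L2+L3: {epyc_l2 + epyc_l3} MB")
print(f"28 core Xeon L2+L3: {xeon_l2 + xeon_l3} MB")
```

Under those assumptions the 16 core EPYC totals 72MB against 66.5MB for the 28 core Xeon, which is consistent with the claim above.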
By the end of September, we will have every AMD EPYC SKU tested on a common Tyan EPYC platform and work started on another platform. Here is the base hardware configuration we are using:
- CPU: AMD EPYC 7351P
- Server Barebones: Tyan Transport SX TN70A-B8026 (B8026T70AE24HR)
- RAM: 8x 16GB (128GB total) DDR4-2666 RDIMMs (Samsung)
- SSD: 1x Intel DC S3710 400GB SATA SSD
- NIC: 1x Mellanox ConnectX-3 Pro EN VPI
Key to this system is that it supports 24x U.2 NVMe SSDs without using Broadcom PLX PCIe expanders. That is 96 lanes of PCIe 3.0 directly from a single CPU. One of the key advantages AMD EPYC has is that a single EPYC CPU can use 128 PCIe lanes, the same number as the dual socket configuration. Tyan has responded to this opportunity by offering a single-socket system that can handle 24x NVMe drives plus have I/O available for 10/25/40/50/100GbE.
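The lane budget behind that claim is simple to check. The sketch below assumes each U.2 NVMe drive uses a x4 link, which is what makes 24 drives come to 96 lanes; the function name is hypothetical and the numbers are the ones quoted above.

```python
LANES_PER_U2_NVME = 4  # assumption: each U.2 NVMe SSD uses a PCIe x4 link


def pcie_budget(total_lanes, nvme_drives, lanes_per_drive=LANES_PER_U2_NVME):
    """Return (lanes_used_by_drives, lanes_left_for_other_io)."""
    used = nvme_drives * lanes_per_drive
    if used > total_lanes:
        raise ValueError("drives need more lanes than the CPU provides")
    return used, total_lanes - used


used, remaining = pcie_budget(total_lanes=128, nvme_drives=24)
print(f"NVMe lanes: {used}, lanes left for NICs and other I/O: {remaining}")
```

With 128 lanes and 24 drives, 32 lanes remain, which is the headroom available for high-speed networking without a PCIe switch.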
AMD and Tyan originally suggested that we use a Samsung SSD (as pictured), however, to aid in consistency, we are using our lab standard Intel DC S3710 400GB SSDs.
In our forthcoming system review, we will have data on every CPU from the AMD EPYC 7251 to the EPYC 7601 for those looking at the system.
AMD EPYC 7351P Benchmarks
For this exercise, we are using our legacy Linux-Bench scripts which help us see cross-platform “least common denominator” results we have been using for years as well as several results from our updated Linux-Bench2 scripts. At this point, our benchmarking sessions take days to run and we are generating well over a thousand data points. We are also running workloads for software companies that want to see how their software works on the latest hardware. As a result, this is a small sample of the data we are collecting and can share publicly. Our position is always that we are happy to provide some free data but we also have services to let companies run their own workloads in our lab, such as with our DemoEval service. What we do provide is an extremely controlled environment where we know every step is exactly the same and each run is done in a real-world data center, not a test bench.
Python Linux 4.4.2 Kernel Compile Benchmark
This is one of the most requested benchmarks for STH over the past few years. The task is simple: we take a standard configuration file and the Linux 4.4.2 kernel from kernel.org, then build with the standard auto-generated configuration utilizing every thread in the system. We are expressing results in terms of compiles per hour to make the results easier to read.
Here you can see a strong showing from the EPYC 7351P, which is going to be a recurring theme. There are a few points to note. First, the single socket EPYC 7351P, a sub-$800 CPU, compares very favorably to a $1.591/hour AWS c4.8xlarge instance as well as much of the E5 line. Furthermore, we are seeing AMD deliver on its promise of the “P” series single socket parts being able to go head-to-head with the lower-end dual socket parts from Intel, both in V4 and the Xeon Silver range.
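For readers who want to reproduce the metric on their own hardware, the compiles-per-hour figure is just the inverse of the wall-clock build time. This is a minimal sketch of that conversion only; the function name and the 90 second timing are hypothetical and are not results from our data set.

```python
def compiles_per_hour(seconds_per_compile: float) -> float:
    """Convert one kernel build's wall-clock time into compiles per hour."""
    if seconds_per_compile <= 0:
        raise ValueError("build time must be positive")
    return 3600.0 / seconds_per_compile


# Hypothetical timing for illustration only, not a measured result.
example_seconds = 90.0
print(f"{example_seconds:.0f}s per build is "
      f"{compiles_per_hour(example_seconds):.1f} compiles per hour")
```

Expressing results this way means higher is better, which keeps the chart consistent with the other throughput-style benchmarks.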
c-ray 1.1 Performance
We have been using c-ray for our performance testing for years now. It is a ray tracing benchmark that is extremely popular to show differences in processors under multi-threaded workloads.
Benchmarks like c-ray (and Cinebench R15 for Windows) are very sensitive to microarchitectures. They are probably not the best benchmarks to use to compare AMD v. Intel, and often not Intel to Intel across major generations. Still we wanted to provide figures using our legacy c-ray 1.1 test.
As we saw when we actually broke Cinebench R15 using quad Intel Xeon Platinum 8180 CPUs, if you need to generate a lot of threads, you often need a longer benchmark or thread generation becomes a limiting factor. At the higher-end of AMD EPYC and Xeon Gold and Platinum, this becomes a significant consideration. As a result, we started building a more complicated render we are dubbing 8K to give us longer render run times. Here is what a few different EPYC options look like on the larger benchmark:
That is probably more EPYC configurations than anyone has put in a single chart to this point. For those wondering about the EPYC 7281 and EPYC 7301, c-ray is not hitting L3 cache. The EPYC 7301 has twice the cache of the EPYC 7281, and we will show the impact of that in the EPYC 7301 review. The simplicity of c-ray and Cinebench R15 hides the benefits of bigger L3 caches.
Interestingly enough, you can see on this chart some definite grouping between 8 core, 16 core, 32 core and 64 core offerings. The 24 core parts are finishing runs but we will have a full benchmark set in the next few days. The AMD EPYC 7601 we see as more of a niche part given its price tag while the EPYC 7200 and 7300 series parts are certainly more mainstream.
7-zip Compression Performance
7-zip is a widely used compression/ decompression program that works cross-platform. We started using the program during our early days with Windows testing. It is now part of Linux-Bench.
This is a crowded chart, but the raw core count is propelling the AMD EPYC 7351P to some awesome figures. To get above the AMD EPYC 7351P here requires dual Silver 4114 CPUs at nearly 2x the price.
NAMD is a molecular modeling benchmark developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. More information on the benchmark can be found here. We are going to augment this with GROMACS in the next-generation Linux-Bench in the near future. With GROMACS we have been working hard to support both Intel’s Skylake AVX-512 and the AVX2-supporting AMD Zen architecture. Here are the comparison results for the legacy data set:
Although we are transitioning to GROMACS, we have a huge NAMD data set that is not optimized for AVX-512. We wanted to use both the dual EPYC 7251 and dual EPYC 7281 configurations here to show the power of the P series parts from AMD. By offering these single-socket specific CPUs, the AMD 7351P certainly allows one to use one socket instead of two.
Sysbench CPU test
Sysbench is another one of those widely used Linux benchmarks. We specifically are using the CPU test, not the OLTP test that we use for some storage testing.
Here the AMD EPYC 7351P falls between the dual socket Intel Xeon Silver 4108 and Silver 4110 configurations. That is a stellar result given both are priced around the same as a single EPYC 7351P.
OpenSSL is widely used to secure communications between servers. This is an important protocol in many server stacks. We first look at our sign tests:
And the verify numbers:
Again, we see the AMD EPYC 7351P performance shine here putting it near the dual Xeon Silver 4110. That certainly supports AMD’s value proposition on the P series parts. It is also why we are seeing so many vendors enter the market with AMD single-socket servers.
UnixBench Dhrystone 2 and Whetstone Benchmarks
Some of the longest-running tests at STH are the venerable UnixBench 5.1.3 Dhrystone 2 and Whetstone results. They are certainly aging, however, we constantly get requests for them, and many angry notes when we leave them out. UnixBench is widely used so we are including it in this data set. Here are the Dhrystone 2 results:
And the Whetstone results:
This is certainly an awesome showing for the AMD EPYC 7351P. Intel’s competitive single socket part (price wise) is the Xeon Silver 4114. Figures for that CPU are well below what we are seeing for the EPYC line.
GROMACS STH Small AVX2/ AVX-512 Enabled
We have a small GROMACS molecule simulation we previewed in the first AMD EPYC 7601 Linux benchmarks piece. In Linux-Bench2 we are using a “small” test for single and dual socket capable machines. Our medium test is more appropriate for higher-end dual and quad socket machines. Our GROMACS test will use the AVX-512 and AVX2 extensions if available.
We added a few larger and more expensive configurations, namely the dual EPYC 7281 and single Xeon Gold 6138 results, to give some sense of perspective. Even though AVX-512 is a key feature for Intel Xeon Scalable, Intel made a product decision to neuter its effectiveness on the Xeon Silver line. It did so both by lowering clock speeds and by removing the second AVX-512 FMA unit. As a result, AMD EPYC, which supports AVX2 but not AVX-512, is able to keep pace even with Xeon Silver configurations that are 2x or more of the price.
Adding the lone Xeon Gold 6138 result here was simply to show how much the Xeon Gold 6100 and Platinum series CPUs benefit from higher clocks and AVX-512. Even the 14 core / 28 thread Xeon Gold 6132 will outpace the 16 core AMD EPYC 7351P by a wide margin. For that AVX-512 performance, the Xeon Gold 6132 costs almost 3x what an AMD EPYC 7351P does. From a system perspective, if you are running heavy AVX-512 workloads, Intel still has a strong value proposition with its Xeon Gold series over AMD EPYC. With the product feature segmentation Intel does on the Xeon Silver line, however, Xeon Silver is simply not competitive.
Chess is an interesting use case since it has almost unlimited complexity. Over the years, we have received a number of requests to bring back chess benchmarking. We have been profiling systems and are ready to start sharing results:
Lots of cores and higher clock speeds help the AMD EPYC here. In fact, the only EPYC that falls below the Intel Xeon Silver 4116 (a $1000 part) in this workload is the AMD EPYC 7251 (a $450 part).
A Note on Power Consumption
The other side of the equation is power consumption. The AMD EPYC 7351P is putting up some impressive benchmark numbers, but that does have an associated cost. Here is what we saw on our PDU after a few runs:
Idle power consumption was around 75W measured on the 208V PDU at 17.7C and 71% RH. Our testing window shown here had a +/- 0.3C and +/- 2% RH variance.
In terms of power consumption, and again this is without 24x NVMe drives installed, this is an idle result comparable to Xeon Silver. The maximum power consumption and what we are showing as a 70% load are both significantly higher than a single socket Xeon Silver or a dual Xeon Silver 4110 system. As an example, the dual Intel Xeon Silver 4110 setup will not go above 200W running the same workloads.
AMD is delivering a lot of performance in the sub-$1000 segment. At the same time, that comes at the expense of power consumption. If you are saving $1000 on CPUs and systems and spending $15/month for another 0.12kW, that is a fairly easy TCO calculation to do.
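As a rough sketch of that TCO calculation, the snippet below uses the $1000 savings and $15/month power figures from the paragraph above. The 36 and 60 month horizons are assumptions for illustration, the function names are hypothetical, and the model ignores everything else that goes into a real TCO (cooling, rack space, licensing).

```python
def breakeven_months(upfront_savings, extra_monthly_cost):
    """Months until extra power spend has consumed the upfront savings."""
    if extra_monthly_cost <= 0:
        raise ValueError("extra monthly cost must be positive")
    return upfront_savings / extra_monthly_cost


def net_savings(upfront_savings, extra_monthly_cost, months):
    """Savings remaining after paying the extra power cost for `months`."""
    return upfront_savings - extra_monthly_cost * months


SAVINGS_USD = 1000.0     # from the text: saved on CPUs and systems
EXTRA_POWER_USD = 15.0   # from the text: per month for another 0.12kW

months = breakeven_months(SAVINGS_USD, EXTRA_POWER_USD)
print(f"Break-even after {months:.1f} months")
for horizon in (36, 60):  # assumed service lifetimes, in months
    remaining = net_savings(SAVINGS_USD, EXTRA_POWER_USD, horizon)
    print(f"Net savings after {horizon} months: ${remaining:.0f}")
```

On these numbers the break-even point is well over five years out, so the cheaper and faster configuration stays ahead across a typical server refresh cycle.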
As we pointed out in our dual AMD EPYC 7251 review, AMD has a number of competitive vectors. The obvious competition is the Intel Xeon single socket line. As a P series part, AMD is also positioning the CPU against lower-end dual Intel CPU configurations. We also see some competition from the dual AMD EPYC 7251 and the AMD EPYC CPUs around the same price range.
AMD EPYC 7351P v. Intel Xeon Silver
Since the AMD EPYC 7351P is a $750 part (although we are seeing street pricing slightly above that, around $820, at the time of publishing), it is not competing directly with Intel Gold CPUs. Instead, it is competing with the Xeon Silver range. With the Intel Xeon Silver 4108 priced at just over $400, we see the EPYC 7351P priced as a significant competitor to that configuration. Likewise, the Intel Xeon Silver 4114 is a similarly priced single-socket Intel Xeon Silver option.
On the question of Intel Xeon Silver 1P and 2P configurations versus the 7351P, there are a few main points to consider. First, AMD flat out has more PCIe lanes with 128. Intel has 48 on the Intel Xeon Silver, although in dual socket configurations (96x) you could argue using a PCH does bestow some benefit in using those lanes efficiently. Second is memory capacity. The AMD EPYC 7351P can handle 16x DDR4-2666 DIMMs (8 channel) and up to 2TB of memory. Intel Xeon Silver can handle up to 12x DDR4-2400 DIMMs and 768GB of memory in single socket mode and twice that in dual socket configurations. Third is in terms of platform. Intel Xeon Scalable has more platform options available, and there are a few features that are more mature on the Xeon Scalable platform (e.g. NVMe hot-plug/ swap and QuickAssist) that AMD platforms still have to catch-up on in terms of ecosystem maturity.
If you run Dockerized microservices and are not leveraging features such as QAT, EPYC simply has Intel Xeon Silver beat. Likewise, as in the case with this Tyan test system, AMD having more PCIe lanes makes a compelling argument. It is not that Intel does not have parts that compete with AMD. The issue is Intel’s product segmentation across features and pricing: features like dual AVX-512 are not present on low-end SKUs, and DDR4-2666 and higher memory capacities are not supported on Xeon Silver. Omni-Path as an on-package fabric is not supported on Xeon Silver either. In terms of volume, the Intel Xeon Bronze and Silver are low value but high volume parts, and Intel needs to do some soul-searching. Intel Xeon Silver does have better power consumption characteristics, which is important, yet we have seen IT buyers accept higher power consumption for higher performance.
What AMD essentially did with the EPYC 7351P is put forward a high-performance offering in a price segment where Intel is geared specifically for low power use. To be clear, Intel does have high-performance chips. The sub $1000 CPU segment for Intel is focused solely on low power consumption, not performance.
In 2017/ 2018 it may be enough that Intel is Intel and has commanded huge market share for years. In upcoming generations, that is going to be a harder sell. Either way, if you have a dev ops team EPYC should be a part of this year’s server purchases.
AMD EPYC 7351P v. AMD EPYC
In the x86 market, the AMD EPYC 7351P is not just competing against Intel, but also against other SKUs in the AMD stack. Two AMD EPYC 7251s cost slightly more than a single AMD EPYC 7351P. The ability to use more memory per system, or more lower capacity and less costly DIMMs, may make that a compelling solution for buyers.
Unlike many of the other SKUs, as a “P” series part, the AMD EPYC 7351P sits between the EPYC 7281 and EPYC 7301 pricing wise. For us, the choice is clear, if you are using a single socket system, get the P series EPYC parts in that price range. A good debate when purchasing this is whether or not to move up to the EPYC 7401P since one gets 50% more cores at slightly lower clock speeds for a relatively small price premium. We are going to have more on that soon, but essentially moving one “P” part up the stack pits a $1075 AMD CPU with 24 cores against the 12 core Xeon Silver 4116 at around $1020. While Intel may have some IPC advantages in some workloads, the de-featuring of the Xeon Silver line makes AMD’s offerings really shine even more so than we are already seeing with the EPYC 7351P.
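One way to frame the comparison above is price per core, using the approximate prices and core counts quoted in this review. This is only a sketch: the dictionary and function names are illustrative, and price per core deliberately ignores clock speed, IPC and instruction set differences, so it is not a substitute for the benchmark results.

```python
# (approximate price in USD, core count), as quoted in this review
PARTS = {
    "AMD EPYC 7351P": (750, 16),
    "AMD EPYC 7401P": (1075, 24),
    "Intel Xeon Silver 4116": (1020, 12),
}


def price_per_core(price_usd, cores):
    """Naive cost metric: list price divided by physical core count."""
    return price_usd / cores


for name, (price, cores) in PARTS.items():
    print(f"{name}: ${price_per_core(price, cores):.2f} per core")
```

On these figures the EPYC 7401P comes in slightly cheaper per core than the EPYC 7351P, and both are at roughly half the per-core price of the Xeon Silver 4116.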
At STH, our first introduction to AMD EPYC was the dual EPYC 7601 configuration. Our test system originally had an AMD EPYC 7601 installed and we were planning on first publishing 1P 7601 results. As we worked through various other configurations, it became clear that while AMD is very competitive at the high-end, its mainstream offerings are competing with de-featured Xeon Silver CPUs and absolutely obliterate what Intel is offering. Addressing the obvious, one has to be willing to be different at their job and deviate from purchasing Intel x86 to buy AMD instead. Here one can take solace in the fact that the AMD EPYC 7351P is not marginally better than the Intel Xeon Silver single (and low-end dual) socket offerings, but instead, there is an enormous chasm created by Intel’s conservative product offerings in the Xeon Silver space versus AMD’s aggressive moves in the mainstream market with the EPYC 7351P. While the EPYC 7351P looks impressive compared to Xeon Silver, we are soon going to show the EPYC 7401(P) numbers that will make you pause.
We also run our own hosting cluster that is currently all Intel-based. AMD EPYC platform maturity and our experience with the Tyan Transport SX TN70A-B8026 thus far means we will be adding AMD EPYC in the next round of upgrades in Q4 2017/ Q1 2018. AMD systems and CPUs are starting to hit the channel for smaller buyers like us. We now have systems from several different vendors already up and running in the lab and the overall platform maturity has come a long way since June.