Quad Intel Xeon Platinum 8276L Benchmarks and Review


Quad Intel Xeon Platinum 8276L Benchmarks

For this exercise, we are using our legacy Linux-Bench scripts, which give us the cross-platform “least common denominator” results we have been using for years, along with several results from our updated Linux-Bench2 scripts. Starting with our 2nd Generation Intel Xeon Scalable benchmarks, we are adding a number of our workload testing features to the mix as the next evolution of our platform.

At this point, our benchmarking sessions take days to run and we are generating well over a thousand data points. We are also running workloads for software companies that want to see how their software works on the latest hardware. As a result, this is a small sample of the data we are collecting and can share publicly. Our position is always that we are happy to provide some free data but we also have services to let companies run their own workloads in our lab, such as with our DemoEval service. What we do provide is an extremely controlled environment where we know every step is exactly the same and each run is done in a real-world data center, not a test bench.

We are going to show off a few results, and highlight a number of interesting data points in this article.

Python Linux 4.4.2 Kernel Compile Benchmark

This is one of the most requested benchmarks for STH over the past few years. The task is simple: we take the Linux 4.4.2 kernel from kernel.org and a standard configuration file, then make the standard auto-generated configuration utilizing every thread in the system. We are expressing results in terms of compiles per hour to make the results easier to read:

Quad Intel Xeon Platinum 8276L Linux Kernel Compile Benchmark

Here we see the quad Intel Xeon Platinum 8276L just about where we would expect with a slight speed bump over the Xeon Platinum 8176.
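For readers who want to relate the compiles per hour metric to a raw build time, here is a minimal sketch of the conversion. The function name and the 45-second example value are hypothetical and are not taken from the Linux-Bench scripts or from our results.

```python
def compiles_per_hour(wall_seconds: float) -> float:
    """Convert the wall-clock time of one kernel build into compiles per hour."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return 3600.0 / wall_seconds


# Hypothetical example: a build that finishes in 45 seconds.
print(compiles_per_hour(45.0))  # 80.0
```

A higher figure is better, which is why the charts read more naturally than raw seconds when many systems are compared side by side.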

Although our tests have evolved since STH was doing Intel Xeon E7 testing, here is a very interesting comparison to keep in mind:

Maximum RAM In TB For Intel 4S And 8S CPUs Q2 15 To Q2 19

The Intel Xeon Platinum 82xxL parts, like the Intel Xeon Platinum 8276L we have here, are the first time that Intel has offered a maximum RAM increase in this segment since the 2016 Intel Xeon E7-88xx V4 series.
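To put the per-socket limits into platform terms, the sketch below multiplies them out for a four-socket system. The 3TB figure for the E7 V4 generation is mentioned in this article; the 1TB and 4.5TB figures for the Platinum 8276 and 8276L are assumptions taken from Intel's published specifications rather than from the chart, and the names in the code are illustrative only.

```python
# Per-socket maximum memory in TB. The E7 V4 value appears in this article;
# the Platinum 8276 and 8276L values are assumptions based on Intel's
# published specifications.
MAX_TB_PER_SOCKET = {
    "Xeon E7-88xx V4": 3.0,
    "Xeon Platinum 8276": 1.0,
    "Xeon Platinum 8276L": 4.5,
}


def platform_max_tb(cpu: str, sockets: int = 4) -> float:
    """Return the maximum memory for a platform with the given socket count."""
    return MAX_TB_PER_SOCKET[cpu] * sockets


for cpu in MAX_TB_PER_SOCKET:
    print(f"4S {cpu}: {platform_max_tb(cpu)} TB")
```

Under these assumptions, a quad-socket Platinum 8276L system tops out at 18TB, versus 12TB for the older E7 V4 parts and 4TB for the standard Platinum 8276.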

c-ray 1.1 Performance

We have been using c-ray for our performance testing for years now. It is a ray tracing benchmark that is extremely popular to show differences in processors under multi-threaded workloads. We are going to use our 8K results which work well at this end of the performance spectrum.

Quad Intel Xeon Platinum 8276L C Ray 8K Benchmark

We did not have the c-ray 8K test when we did our Intel Xeon E7 testing such as with the Dell PowerEdge R930.

7-zip Compression Performance

7-zip is a widely used compression/decompression program that works cross-platform. We started using the program during our early days with Windows testing. It is now part of Linux-Bench.
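For anyone who wants a feel for what a compression throughput test measures, here is a rough sketch using Python's standard `lzma` module, which implements the LZMA family of algorithms that 7-zip popularized. This is not the 7-zip benchmark and not what Linux-Bench runs: it is single-threaded, the payload is a made-up sample, and the function name is hypothetical.

```python
import lzma
import time
from typing import Tuple


def lzma_throughput_mb_s(data: bytes) -> Tuple[float, float]:
    """Return (compress MB/s, decompress MB/s) for one pass over data."""
    t0 = time.perf_counter()
    packed = lzma.compress(data)
    t1 = time.perf_counter()
    unpacked = lzma.decompress(packed)
    t2 = time.perf_counter()
    if unpacked != data:
        raise RuntimeError("round trip mismatch")
    megabytes = len(data) / 1e6
    # Guard against a zero interval on very small inputs.
    return megabytes / max(t1 - t0, 1e-9), megabytes / max(t2 - t1, 1e-9)


# Made-up, highly compressible sample payload of roughly 1 MB.
sample = b"linux-bench sample payload " * 40000
compress_rate, decompress_rate = lzma_throughput_mb_s(sample)
print(f"compress: {compress_rate:.1f} MB/s, decompress: {decompress_rate:.1f} MB/s")
```

Numbers from a sketch like this will vary with the data and the machine, so they should not be compared against the chart figures.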

Quad Intel Xeon Platinum 8276L 7zip Compression Benchmark

We wanted to provide a chart “de-noised” from previous generations. Here one can see a quick comparison with nice scaling between the quad Intel Xeon Platinum 8276L, quad Intel Xeon Platinum 8260, and quad Intel Xeon Gold 6242.

NAMD Performance

NAMD is a molecular modeling benchmark developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. We are going to augment this with GROMACS in the next-generation Linux-Bench in the near future. With GROMACS, we have been working hard to support both Intel’s Skylake AVX-512 and the AVX2-supporting AMD Zen architecture. Here are the comparison results for the legacy data set:

Quad Intel Xeon Platinum 8276L NAMD Benchmark

Here we see solid scaling putting the quad Intel Xeon Platinum 8276L between the quad Intel Xeon Platinum 8180 and Platinum 8176 figures.

OpenSSL Performance

OpenSSL is widely used to secure communications between servers. It is an important component of many server stacks. We first look at our sign tests:

Quad Intel Xeon Platinum 8276L OpenSSL Sign Benchmark

Here are the verify results:

Quad Intel Xeon Platinum 8276L OpenSSL Verify Benchmark

Here we see great performance. The Intel Xeon E7-8890 V4 is almost a predecessor in the Intel Xeon Platinum 8276 swim lane, with around the same price tag. That Intel Xeon E7-8890 V4 supports 3TB of memory, which is more than the Intel Xeon Platinum 8276. The newer CPUs get more cores and higher clock speeds, which improve performance significantly.

UnixBench Dhrystone 2 and Whetstone Benchmarks

Some of the longest-running tests at STH are the venerable UnixBench 5.1.3 Dhrystone 2 and Whetstone results. They are certainly aging; however, we constantly get requests for them, and many angry notes when we leave them out. UnixBench is widely used so we are including it in this data set. Here are the Dhrystone 2 results:

Quad Intel Xeon Platinum 8276L UnixBench Dhrystone 2 Benchmark

Here are the Whetstone results:

Quad Intel Xeon Platinum 8276L UnixBench Whetstone Benchmark

These benchmarks are absolutely too old for a modern quad-socket system. However, we are simply going to present the data. Perhaps one of the more interesting aspects is that they are still scaling. Many virtualized workloads are legacy applications optimized years ago, if at all.

GROMACS STH Medium AVX2/AVX-512 Enabled

We have a small GROMACS molecule simulation we previewed in the first AMD EPYC 7601 Linux benchmarks piece. In Linux-Bench2 we are using a “medium” test across quad socket nodes. Our GROMACS test will use the AVX-512 and AVX2 extensions if available.

Quad Intel Xeon Platinum 8276L GROMACS STH Medium Benchmark

We very rarely use our medium case since our “small” case tends to work very well across the range of single and dual CPU configurations we test. Here, one can see very strong performance.

Chess Benchmarking

Chess is an interesting use case since it has almost unlimited complexity. Over the years, we have received a number of requests to bring back chess benchmarking. We have been profiling systems and are ready to start sharing results:

Quad Intel Xeon Platinum 8276L Chess Benchmark

Here we have fewer results again, simply because this test was added later than some of our other tests. The quad Intel Xeon Platinum 8276L configuration again performed well here.

STH STFB KVM Virtualization Testing

One of the other workloads we wanted to share is from one of our DemoEval customers. We have permission to publish the results, but the application itself being tested is closed source. This is a KVM virtualization based workload where our client is testing how many VMs it can have online at a given time while completing work under the target SLA. Each VM is a self-contained worker.
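Since the application itself is closed source, we cannot show it, but the general idea of "how many VMs fit under the SLA" can be sketched as follows. Everything here is hypothetical: the function names, the millisecond units, and the toy latency model are illustrative assumptions, not the customer's workload or our test harness.

```python
from typing import Callable


def max_vms_under_sla(
    measure_ms: Callable[[int], int], sla_ms: int, max_vms: int
) -> int:
    """Return the largest VM count, tried in increasing order, that meets the SLA.

    measure_ms(n) is assumed to return the time in milliseconds that the
    workers need to complete their work with n VMs online.
    """
    best = 0
    for n in range(1, max_vms + 1):
        if measure_ms(n) <= sla_ms:
            best = n
        else:
            break  # Stop at the first VM count that misses the SLA.
    return best


# Toy model: 1000 ms base time plus 50 ms of contention per VM online.
def toy_model(n: int) -> int:
    return 1000 + 50 * n


print(max_vms_under_sla(toy_model, sla_ms=3000, max_vms=100))  # 40
```

In practice the measurement step would be a full run of the workload at each VM count, which is part of why these tests take so long to complete.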

Quad Intel Xeon Platinum 8276L KVM STFB SLA Workload 1 Benchmark

There is an interesting result here. Even though the workload is mostly CPU bound, we see an ever-so-slight departure at the “small” VM sizes that was consistent between runs, due to using pure DDR4 versus DDR4 + Intel Optane DCPMM in the quad Intel Xeon Platinum 8276L configuration.

The company also has a CPU-light back-end workload that is mostly dependent on Redis performance and memory capacity with less of a CPU stressor.

Quad Intel Xeon Platinum 8276L KVM STFB SLA Workload 2 Benchmark

Since we are looking at the quad Intel Xeon Platinum 8276L here, we wanted to show off the impact of the “L” in terms of memory capacity with Intel Optane DCPMM. One can see much higher utilization in the larger VM sizes mostly driven by larger memory capacity. Using DDR4 only, the results are much closer.

Next, we are going to discuss market positioning before our final words.

3 COMMENTS

  1. Hi,

    When you are talking about STH budgets and pricing, are you suggesting you are actually purchasing these systems retail just to review them?

    Shouldn't Intel (or the other brands) be providing you with samples?

    Navi

  2. Hi Navi – we purchase six figures worth of hardware each year in addition to what vendors supply for reviews. It takes a lot to do this testing. Just the data center costs we have are well over $50K annually using low cost providers.
