Intel Xeon W-3275 Benchmarks
For this exercise, we are using our legacy Linux-Bench scripts, which give us the cross-platform “least common denominator” results we have relied on for years, along with several results from our updated Linux-Bench2 scripts. Starting with our 2nd Generation Intel Xeon Scalable benchmarks, we are adding a number of our workload testing features to the mix as the next evolution of our platform.
At this point, our benchmarking sessions take days to run and we are generating well over a thousand data points. We are also running workloads for software companies that want to see how their software works on the latest hardware. As a result, this is a small sample of the data we are collecting and can share publicly. Our position is always that we are happy to provide some free data but we also have services to let companies run their own workloads in our lab, such as with our DemoEval service. What we do provide is an extremely controlled environment where we know every step is exactly the same and each run is done in a real-world data center, not a test bench.
We are going to show off a few results, and highlight a number of interesting data points in this article.
Python Linux 4.4.2 Kernel Compile Benchmark
This is one of the most requested benchmarks for STH over the past few years. The task is simple: we take the Linux 4.4.2 kernel from kernel.org and a standard configuration file, then build the standard auto-generated configuration utilizing every thread in the system. We express results in compiles per hour to make them easier to read:
With a higher maximum turbo boost, the Intel Xeon W-3275 performs better than the Intel Xeon Platinum 8280. Both are 28-core chips with high clock speeds, but this test has a few single-thread-limited stages that allow the Xeon W-3275 to pull ahead.
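For readers who want to reproduce the compiles-per-hour metric on their own hardware, here is a minimal sketch of the arithmetic. It is not the Linux-Bench script itself. The function names are hypothetical, and it assumes a plain `make -jN` invocation with one job per hardware thread.

```python
import os
import subprocess
import time


def kernel_build_command(threads=None):
    """Build a `make` command that uses every hardware thread (assumed setup)."""
    n = threads if threads is not None else (os.cpu_count() or 1)
    return ["make", f"-j{n}"]


def time_command(cmd, cwd=None):
    """Run a command to completion and return the wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, cwd=cwd, check=True)
    return time.perf_counter() - start


def compiles_per_hour(compile_seconds):
    """Convert one compile's wall-clock time into compiles per hour."""
    if compile_seconds <= 0:
        raise ValueError("compile time must be positive")
    return 3600.0 / compile_seconds
```

A 90 second compile works out to 40 compiles per hour. Expressing the result this way means higher is better, which matches the other charts in this section.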
c-ray 1.1 Performance
We have been using c-ray for our performance testing for years now. It is a ray tracing benchmark that is popular for showing differences between processors under multi-threaded workloads. We are going to use our 8K results, which work well at this end of the performance spectrum.
We are simply going to point out here that the Xeon W-3275, even at a lower cost, has a significant advantage over the Xeon Platinum line for single-socket applications.
7-zip Compression Performance
7-zip is a widely used compression/decompression program that works cross-platform. We started using the program during our early days with Windows testing. It is now part of Linux-Bench.
AMD performs well here. One can see that due to the microarchitecture, AMD is competitive even with lower clock speeds. This is one of the few tests where we see the AMD EPYC 7402P able to top the Xeon W-3275.
NAMD Performance
NAMD is a molecular modeling benchmark developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. More information on the benchmark is available from the group. As with GROMACS, we have been working hard to support AVX-512 on Intel and AVX2 on the AMD Zen architecture. Here are the comparison results for the legacy data set:
This is an unoptimized test that shows performance in legacy code. Here we see the Intel Xeon W-3275 perform between the 24-core and 32-core AMD EPYC parts, which we would expect given its 28 cores.
OpenSSL Performance
OpenSSL is widely used to secure communications between servers. It is an important component of many server stacks. We first look at our sign tests:
Here are the verify results:
This is one of the few tests where the Intel Xeon W-3275 does not surpass the Platinum 8280. We also want to note that if you are in this price bracket, the AMD competition is not necessarily just the “P” series 24 and 32 core parts. The AMD EPYC 7702P 64 core part sacrifices clock speeds for raw core count. If you are more often thread limited, having more than 2x the number of threads per socket at around the same price is excellent.
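The “more than 2x” thread figure follows from the core counts. Here is a quick sketch of the arithmetic, assuming both chips run two threads per core (Hyper-Threading on the Intel part, SMT on the AMD part). The helper name is hypothetical.

```python
def threads_per_socket(cores, threads_per_core=2):
    """Hardware threads exposed by one socket (assumes 2-way SMT by default)."""
    return cores * threads_per_core


# Core counts as discussed in this section.
xeon_w_3275_threads = threads_per_socket(28)  # 56 threads
epyc_7702p_threads = threads_per_socket(64)   # 128 threads

thread_ratio = epyc_7702p_threads / xeon_w_3275_threads
print(f"EPYC 7702P has {thread_ratio:.2f}x the threads of the Xeon W-3275")
```

The ratio is the same with SMT disabled on both chips, since it reduces to 64 cores versus 28 cores.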
UnixBench Dhrystone 2 and Whetstone Benchmarks
Some of the longest-running tests at STH are the venerable UnixBench 5.1.3 Dhrystone 2 and Whetstone results. They are certainly aging, but we constantly get requests for them, and many angry notes when we leave them out. UnixBench is widely used, so we are including it in this data set. Here are the Dhrystone 2 results:
Here are the Whetstone results:
AMD also has a 48 core EPYC 7642 part that is around the Intel Xeon W-3275 price range. One gets fewer cores than with the EPYC 7702P, but one also gets higher base clocks and turbo clocks. The Intel solution features a 20W lower TDP. In practice, we have seen the Intel Xeon W-3275 idle a bit lower, but it can also hit slightly higher power consumption using AVX-512.
GROMACS STH Small AVX2/AVX-512 Enabled
During our initial benchmarking efforts, we found that our version of GROMACS was taking advantage of AVX-512 on Intel CPUs. We also found that it was not taking proper advantage of the AMD EPYC 7002 architecture. We have since had one of the lead developers on our dual AMD EPYC 7742 machine, and changes have been implemented. This review is coming out before the new dataset is ready. As a result, we are going to show Intel results for an AVX-512 comparison:
Here we are going to highlight a few Intel Xeon options using AVX-512. You can see that compared to the mainstream Xeon Gold and Platinum line, the Intel Xeon W-3275 is fast. One alternative that we found to be extremely strong when looking at results was the dual Intel Xeon Gold 6242 setup. This provides 32 cores and high clock speeds. One also gets more PCIe and memory capacity which increases a dual system’s utility. There is a trade-off of higher power consumption with the dual-socket solution. Also of note, the dual Xeon Gold 6242 can utilize Optane DCPMM which the Xeon W series cannot.
Chess Benchmarking
Chess is an interesting use case since it has almost unlimited complexity. Over the years, we have received a number of requests to bring back chess benchmarking. We have been profiling systems and are ready to start sharing results:
This follows a pattern similar to what we have seen in the previous results. We were slightly surprised that the Intel Xeon Platinum 8280 performed better, but the result was consistent across runs. We think this highlights how close the Platinum 8280 and Xeon W-3275 are.
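When two chips land this close together, it helps to check run-to-run spread before reading much into the ordering. The sketch below shows one way to do that; it is an assumption about method, not a description of our lab tooling. The function names are hypothetical and no real scores are included.

```python
import statistics


def coefficient_of_variation(samples):
    """Sample standard deviation divided by the mean; lower means more consistent runs."""
    if len(samples) < 2:
        raise ValueError("need at least two runs")
    mean = statistics.fmean(samples)
    if mean == 0:
        raise ValueError("mean of runs is zero")
    return statistics.stdev(samples) / mean


def consistently_faster(scores_a, scores_b):
    """True if every run of A beats every run of B (higher score is better)."""
    return min(scores_a) > max(scores_b)
```

If `consistently_faster` holds and the coefficient of variation for each chip is small relative to the gap between them, the ordering is unlikely to be noise, even when the gap itself is narrow.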
Next, let us discuss market positioning and competitive positioning before getting to our final words.