For this exercise, we are using our legacy Linux-Bench scripts, which help us see the cross-platform “least common denominator” results we have been using for years, as well as several results from our updated Linux-Bench2 scripts. At this point, our benchmarking sessions take days to run and we are generating well over a thousand data points. We are also running workloads for software companies that want to see how their software works on the latest hardware. As a result, this is a small sample of the data we are collecting and can share publicly. Our position is always that we are happy to provide some free data, but we also have services to let companies run their own workloads in our lab, such as with our DemoEval service. What we do provide is an extremely controlled environment where we know every step is exactly the same and each run is done in a real-world data center, not a test bench.
We are going to show off a few results and highlight a number of interesting data points in this article. That said, we just published our Intel Xeon D-2123IT benchmarks and review piece using a similar Supermicro platform that performed effectively identically. If you have read that piece, you can skip this section since it is effectively the same from a performance perspective.
Python Linux 4.4.2 Kernel Compile Benchmark
This is one of the most requested benchmarks for STH over the past few years. The task is simple: we take the Linux 4.4.2 kernel from kernel.org with a standard auto-generated configuration file and compile it utilizing every thread in the system. We express results in terms of compiles per hour to make them easier to read:
These charts are a bit busy given how many comparisons we wanted to show. Here, we actually wanted to focus on just how big of a gap there is between this CPU and the Intel Xeon D-2183IT in terms of performance. Many systems and motherboard manufacturers design a PCB, then place a number of different Xeon D SKUs to create different performance levels. There are segments of the embedded market that want just single SKUs, but there are others that want to have this flexibility to minimize design cycles while being able to scale performance.
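For readers who want to reproduce the compiles per hour metric on their own hardware, here is a minimal Python sketch of the arithmetic. It is not the Linux-Bench script itself. The helper names `time_command` and `compiles_per_hour` are hypothetical, and the timings in the loop are made-up illustration values rather than measured results.

```python
import subprocess
import time


def time_command(cmd: list[str]) -> float:
    """Run a command and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    # check=True raises CalledProcessError if the command (e.g. a build) fails
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def compiles_per_hour(elapsed_seconds: float) -> float:
    """Convert the duration of one full compile into compiles per hour."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return 3600.0 / elapsed_seconds


# Hypothetical per-compile timings in seconds, for illustration only
for seconds in (90.0, 240.0, 600.0):
    rate = compiles_per_hour(seconds)
    print(f"{seconds:6.1f} s per compile -> {rate:5.1f} compiles/hour")
```

Inside an already configured kernel source tree, one could time a build with something like `time_command(["make", f"-j{os.cpu_count()}"])` after importing `os`, then pass the result to `compiles_per_hour`. Expressing the result as a rate means a higher bar is better, which is easier to compare at a glance than raw seconds.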
7-zip Compression Performance
7-zip is a widely used compression/ decompression program that works cross-platform. We started using the program during our early days with Windows testing. It is now part of Linux-Bench.
Here we wanted to highlight performance that was close to and often better than the Intel Atom C3758. The 4 core/ 8 thread configuration with the larger cores means we get more performance than eight smaller cores.
OpenSSL Performance
OpenSSL is widely used to secure communications between servers. It is an important component in many server stacks. We first look at our sign tests:
Here are the verify results:
We traditionally sort by the verify results, which is why the sign chart looks a bit different. Still, we can see that this CPU is a bit ahead of the 4C/ 8T AMD EPYC 3151 here and well ahead of the 4C/ 8T Intel Xeon D-1518.
GROMACS STH Small AVX2/ AVX-512 Enabled
We have a small GROMACS molecule simulation we previewed in the first AMD EPYC 7601 Linux benchmarks piece. In Linux-Bench2 we are using a “small” test for single and dual-socket capable machines. Our medium test is more appropriate for higher-end dual and quad-socket machines. Our GROMACS test will use the AVX-512 and AVX2 extensions if available.
This is always one of the more interesting results. Intel maintains that the Xeon D-2100 series has a single FMA AVX-512 implementation just like the Intel Xeon Silver 4100/ 4200 series. In AVX-512 enabled applications, however, we consistently see the Xeon D-2100 series perform extremely well, significantly better than even the 8 core/ 16 thread Intel Xeon Silver parts. We confirmed this again around the time of the Cascade Lake launch with Intel. This is a great example of where a specific acceleration feature can make a chip leapfrog options that do not have the same feature.
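Since results like these depend on which vector extensions a CPU exposes, it can be useful to check what a given Linux host advertises. The sketch below is a minimal Python example that parses the `flags` line of `/proc/cpuinfo` on x86 Linux. It is not GROMACS's own detection logic, and the `simd_level` helper and the sample strings are hypothetical. Note that the flags do not reveal how many FMA units a core has, only whether the instructions are supported.

```python
def simd_level(cpuinfo_text: str) -> str:
    """Return the widest of AVX-512F / AVX2 listed in an x86 cpuinfo flags line."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # The line looks like "flags\t\t: fpu vme ... avx2 ... avx512f ..."
            _, _, value = line.partition(":")
            flags = set(value.split())
            if "avx512f" in flags:
                return "AVX-512"
            if "avx2" in flags:
                return "AVX2"
            return "none"
    return "unknown"  # no flags line found (e.g. non-x86 or non-Linux input)


# Hypothetical, shortened flag lines for illustration only
sample_with_avx512 = "model name\t: example\nflags\t\t: fpu sse2 avx avx2 avx512f"
sample_with_avx2 = "model name\t: example\nflags\t\t: fpu sse2 avx avx2"

print(simd_level(sample_with_avx512))
print(simd_level(sample_with_avx2))
```

On a real Linux system, the same function could be fed the contents of `/proc/cpuinfo`, for example with `simd_level(open("/proc/cpuinfo").read())`.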
Chess Benchmarking
Chess is an interesting use case since it has almost unlimited complexity. Over the years, we have received a number of requests to bring back chess benchmarking. We have been profiling systems and are ready to start sharing results:
The chess benchmarking shows a lot of the same. Some may notice that we use the Intel Xeon Bronze 3206R in many of these charts. In our view, this is the lowest-end 2nd Gen Intel Xeon Scalable Refresh part, so some segments may be tempted to run it as their low-end mainstream Xeon server solution, which is why we wanted to include it here.
Next, we are going to have power consumption, market positioning, and our final words.