Advertisement


Home News SPEC Consortium Releases SPEC CPU 2026 Benchmark Suite: The Next Decade of...

SPEC Consortium Releases SPEC CPU 2026 Benchmark Suite: The Next Decade of CPU Benchmarking

0
SPEC CPU2026 Launch STH Cover
SPEC CPU2026 Launch STH Cover

Today marks a fun day for the world of CPU benchmarking, as the Standard Performance Evaluation Corporation (SPEC) has released a new version of their popular and widely used SPEC CPU benchmark suite. SPEC CPU 2026, as it is known as, it the latest iteration in a long line of CPU benchmark suites from the group, which over time has become the industry leading benchmark for measuring CPU performance thanks to its intense technical rigor and significant vendor buy-in. As a result, the release of a new version of the CPU suite is a major occasion in the world of CPU and system benchmarking, as it is one of the most important benchmark suites that the next decade of CPUs will be measured by.

The first new release of SPEC CPU in 9 years, SPEC CPU 2026, is intended to be a pretty extensive modernization of the CPU benchmarking suite. Most of the benchmarks from the previous version of the suite have been dropped in favor of newer workloads, and even the few that remain. Evergreen software, such as open-source compilers and image processing libraries, has also seen its workloads updated to be more modern. The end result is that, while SPEC CPU 2026 still fulfills the same high-level design goals as its predecessors, the actual code running underneath has changed significantly since the era of Skylake, Zen 1, and Cortex-A75.

SPEC_CPU_2026_Execution
SPEC_CPU_2026_Execution

Besides its technical rigor, the SPEC CPU suite is also notable for its industry participation; the benchmark continues to be developed by a group that is essentially a who’s who of the CPU world, including AMD, Intel, Arm, and SiFive. The high amount of vendor involvement and subsequent buy-in for the suite helps to contribute to SPEC CPU’s legitimacy and acceptance within the industry, as there is broad agreement that it is a fair benchmark – and perhaps more importantly, no significant claims that it is not. Which is a minor political minefield to navigate at times, but it means the benchmark is extremely well tested and vetted before release, which again dovetails with cementing the benchmark’s legitimacy and utility.

Finally, ahead of today’s initial release of SPEC CPU 2026, the consortium gave us early access to the release version of the benchmark to evaluate its performance. Consequently, alongside today’s high-level announcement, we have quite a bit of fresh data to pore through. So, without further ado, let us get started.

A Quick Recap of SPEC CPU

With an average release cadence over the last two decades of roughly every 10 years, despite how much we use SPEC CPU and other SPEC benchmarks here at STH, we do not get much of a chance to discuss their origins or the design decisions that go into them. So, if this is the first time you have been around for a release of a new SPEC CPU benchmark suite, here is a handy and brief rundown of what SPEC CPU is and why it matters.

The Standard Performance Evaluation Corporation (SPEC) is a group whose mission is right there in the name. It is a non-profit industry consortium that exists to organize the development of industry-standard benchmarks. As with other industry consortia, the purpose of the group is not so much to develop tests on its own, but rather to bring together major players from industry and academia to work together to develop benchmark suites.

SPEC_CPU_2026_Contributors
SPEC_CPU_2026_Contributors

Of SPEC’s rather significant collection of benchmark suites, by far the highest-profile is SPEC CPU. This is both because of its vintage, it was SPEC’s first benchmark, released back in 1989, but also because of how widely it is used. Though ostensibly designed for benchmarking servers and other high-performance systems, SPEC CPU has become widely used at multiple levels of devices, including desktop and workstation computers, and most recently, mobile devices, as those processors have become powerful enough (and paired with enough memory) to run the benchmark suite. As a result, SPEC CPU has become one of the primary benchmarks for comparing CPU architectures across the spectrum and is used to evaluate the performance, power efficiency, and architectural efficiency (IPC) of various CPUs.

SPEC_CPU_History
SPEC_CPU_History

The benchmark suite’s ubiquity stems from its portability. Rather than being distributed as pre-compiled binaries for specific platforms, the SPEC CPU benchmark suite is distributed entirely as source code. As a result, with a modern compiler toolchain that supports C, C++, and Fortran, it is possible to compile and run SPEC CPU on any system, regardless of CPU architecture, even those that do not yet exist. There is admittedly quite a bit of nuance in that detail (you need standards-compliant compilers, among other things), but it means that the benchmark is not reliant on SPEC supporting a specific platform, and that hardware developers can immediately employ and benchmark new hardware features without needing to wait on newer binaries from upstream.

The corollary is that, rather uniquely, SPEC CPU is as much a benchmark of compilers as it is of hardware. Because the source code is a fully portable, high-level implementation of a program, and thus contains no CPU intrinsics or other architecture-specific code, the benchmark is at the mercy of a good compiler to turn it into fast, efficient machine code. This means that compiler improvements to boost your SPEC CPU scores are fair game (an especially important aspect for hardware vendors who produce their own compilers), but the catch is that those optimizations cannot be for SPEC alone; they need to benefit a wider class of programs.

Lbm_source
Lbm_Benchmark Source

Ultimately, this is a significant distinction from most other CPU benchmarks, which are distributed as pre-built binaries. In effect, whereas those benchmarks strive to measure how well a given system executes a very specific batch of machine code, SPEC CPU is all about measuring algorithms and workloads. It is a distinction that has little impact on day-to-day usage, but it is important for understanding the benchmark’s design criteria and what SPEC ultimately aims to accomplish.

Speaking of design criteria, it is worth noting that SPEC CPU is designed to be as much a CPU test as possible. Which sounds a bit tautological, but underscores how much work goes into developing benchmarks that are not tests of other aspects of computer design. SPEC CPU is not a whole system benchmark (SPEC has SPECworkstation for that), it is a benchmark of CPU and memory subsystem performance, with design choices that minimize bottlenecking elsewhere.

SPEC_CPU_Measurements
SPEC_CPU_Measurements

In particular, SPEC CPU avoids file I/O as much as possible to prevent it from being a bottleneck. The benchmarks are also written and/or modified from their original forms to eliminate non-deterministic inputs and actions (such as hardware random number generators and unstable sorts) and ensure they are deterministic. And ultimately, even the operating system is cut out of the loop as much as possible: SPEC aims for at least 95% of benchmark execution time to be spent on user-land code, so that the benchmarks do not become a proxy test for the OS.

Finally, the workloads themselves are selected to be representative of modern computer use cases, avoiding testing outdated software or algorithms of little interest outside academia. In practice, this means the SPEC CPU benchmark suite ends up looking a bit server- and workstation-focused. It is a portable and headless benchmark, so it does not include UI tasks. Still, the benchmark suite’s overall scope is incredibly broad, with SPEC CPU 2026 being the broadest yet. Among the 52 benchmarks in the latest edition of the suite, there are benchmarks for programming/compiling, media, AI, electronic design, networking, databases, and computational science workloads. The net result is that the benchmark ends up covering a very wide range of algorithms, and by extension, stresses numerous different aspects of the host CPU.

LEAVE A REPLY

Please enter your comment!
Please enter your name here

This site uses Akismet to reduce spam. Learn how your comment data is processed.