Over 2000 SPEC CPU 2017 Results Flagged for Compiler Optimization

SPEC CPU2017 Compiler Notes For Intel Results Accessed 2024 02 09

This week, SPEC, the purveyor of some of the world’s top non-AI benchmarks, went into well over 2000 records and added a note. It seems that the Intel compiler has an optimization targeting SPEC CPU 2017 integer code, and the benchmarking organization has ruled that it is no longer acceptable.

At this point, we have found a huge number of submissions that have this note added to their result under the “Compiler Notes” section. An example of the note in use is in this Dell PowerEdge R660 result:
Compiler Notes

SPEC has ruled that the compiler used for this result was performing a compilation
that specifically improves the performance of the 523.xalancbmk_r / 623.xalancbmk_s
benchmarks using a priori knowledge of the SPEC code and dataset to perform a
transformation that has narrow applicability.

In order to encourage optimizations that have wide applicability (see rule 1.4
https://www.spec.org/cpu2017/Docs/runrules.html#rule_1.4), SPEC will no longer
publish results using this optimization.

This result is left in the SPEC results database for historical reference. (Source: Spec.org)

The note does not name the optimization, but it says the transformation relies on a priori knowledge of the SPEC code and dataset and has narrow applicability. For some frame of reference, the note covers only one benchmark in the overall suite. For example, in the speed results, 623.xalancbmk_s is only one of ten tests.

From the compiler information listed in the flagged results, this appears to affect older versions of the compilers.

C/C++: Version 2022.1 of Intel oneAPI DPC++/C++ Compiler for Linux;
Fortran: Version 2022.1 of Intel Fortran Compiler for Linux; (Source: Spec.org)

We spot-checked some newer results, e.g., these from ASUS and Supermicro, and those results did not have that compiler note.

C/C++: Version 2023.2.3 of Intel oneAPI DPC++/C++ Compiler for Linux;
Fortran: Version 2023.2.3 of Intel Fortran Compiler for Linux; (Source: Spec.org)

The flag seems to apply to older benchmark runs. Results produced with the newer Intel compiler versions are not being flagged.

Final Words

We went through around 100 of the records as a spot check, and it seems to be results built with older versions of the Intel compiler that carry the compiler note. The newer results, especially for chips like the 5th Gen Intel Xeon Scalable Emerald Rapids, do not seem to have the note since they mostly use version 2023.2.3 of the Intel compiler. It seems the 2022 compiler generation had the optimization, so the note will mostly appear on 4th Gen Intel Xeon Scalable results. We did not see AMD results in this list, which makes sense given companies are unlikely to use Intel’s compiler for AMD CPUs.
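For readers who want to repeat this kind of spot check, here is a minimal sketch in Python. It assumes you have already saved the plain-text versions of some results locally. The function names and the sample strings are ours for illustration only, and the matching relies solely on the note wording and the compiler version line format quoted above.

```python
import re
from collections import Counter

# Phrase taken from the SPEC compiler note quoted in this article.
NOTE_PHRASE = "SPEC will no longer publish results using this optimization"

# Matches lines such as:
#   C/C++: Version 2022.1 of Intel oneAPI DPC++/C++ Compiler for Linux;
VERSION_RE = re.compile(r"C/C\+\+:\s*Version\s+([\d.]+)\s+of\s+Intel")


def is_flagged(result_text):
    """Return True if the result text contains the SPEC compiler note."""
    # Collapse whitespace so the phrase matches even when wrapped across lines.
    normalized = " ".join(result_text.split())
    return NOTE_PHRASE in normalized


def compiler_version(result_text):
    """Return the Intel C/C++ compiler version string, or None if absent."""
    match = VERSION_RE.search(result_text)
    return match.group(1) if match else None


def tally(results):
    """Count results by (compiler version, flagged?) pair."""
    counts = Counter()
    for text in results:
        counts[(compiler_version(text), is_flagged(text))] += 1
    return counts


# Hypothetical sample inputs built from the lines quoted in this article.
samples = [
    "C/C++: Version 2022.1 of Intel oneAPI DPC++/C++ Compiler for Linux;\n"
    "SPEC will no longer\npublish results using this optimization.",
    "C/C++: Version 2023.2.3 of Intel oneAPI DPC++/C++ Compiler for Linux;",
]

for (version, flagged), count in sorted(tally(samples).items()):
    print(version, "flagged" if flagged else "not flagged", count)
```

Grouping by compiler version and flag status makes it easy to see whether the note tracks the compiler release, which is the pattern we saw by hand.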

Again, the organization says that this impacts only a single test, so the impact on the overall scores should be fairly small.
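To put a rough number on that: SPEC CPU 2017 overall metrics are the geometric mean of the per-benchmark ratios, so a gain on one of the ten speed tests is diluted to its tenth root in the composite. The sketch below uses hypothetical speedup factors, not measured values from any flagged result, and the function names are ours.

```python
import math


def overall_gain(single_test_speedup, num_tests=10):
    """Factor by which a geometric-mean score changes when exactly one of
    num_tests ratios is multiplied by single_test_speedup."""
    return single_test_speedup ** (1.0 / num_tests)


def geometric_mean(ratios):
    """Geometric mean of a list of positive per-benchmark ratios."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical speedups on the single affected test (not measured values).
for speedup in (1.2, 1.5, 2.0):
    gain = overall_gain(speedup)
    print(f"{speedup:.1f}x on one test -> "
          f"{(gain - 1) * 100:.1f}% on the overall score")
```

Under these assumptions, even a hypothetical 2x gain on the one test would move a ten-test composite by only about 7%.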

We were spot-checking results, but if others see any interesting patterns, please post them in the comments or the forums.

1 COMMENT

  1. My impression is that benchmark optimisation naturally occurs as a by-product of running new versions of a compiler on the same regression tests as part of the quality assurance before release. Even open source compilers such as gcc and clang end up being tuned to a specific set of regression tests.

    Unfortunately, the speed improvements on the test suite don’t always transfer to code outside the test suite.

    Whether vacuum cleaners, diesel engines, SSDs or C compilers, much of the difficulty arises when a testing agency relies on the same test over and over rather than continuously developing new tests.
