Cinebench is a widely cited rendering benchmark, especially in the desktop space. Frankly, it is not the best benchmark for servers. Still, for the past few years STH has run a small series we call “Crushing Cinebench” that started with a quad Intel Xeon E7 18-core system. In 2017, we had versions with quad Intel Xeon Platinum 8180s and dual AMD EPYC 7601 CPUs. Recently, we uploaded our Crushing Cinebench V5 video with a world record dual AMD EPYC 7742 system. We wanted to talk a bit about the runs.
Crushing Cinebench V5
Here is the actual video we are going to discuss. It is under 3 minutes and shows a few key points of the process:
Now we are going to chat about the Cinebench R15 and Cinebench R20 results.
Dual AMD EPYC 7742 Cinebench R15 11080
This one is perhaps one of the worst benchmarks one can run. Cinebench R15 finishes very quickly. On dual AMD EPYC 7742 CPUs we saw a score of 11,080.
That is not actually the best we have ever seen. In Crushing Cinebench V4.5 we saw that the quad Intel Xeon Platinum 8180 system could achieve a score of 11,584 on the benchmark. We also noted that there was enormous variability of 20% in run results as the benchmark seemed to be running near its limit. That caused us in 2017 to declare that Cinebench R15 is Now Broken as a Benchmark.
The dual AMD EPYC 7742 result is impressive since it was using $14,000 of CPUs versus $40,000 of Intel Xeon Platinum 8180s. It also uses half the sockets and around half the power.
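To put that comparison in numbers, here is a rough Python sketch that works out Cinebench R15 points per $1,000 of CPU and points per socket. The scores and rounded CPU costs are the ones quoted above; the function and variable names are our own for illustration, and this ignores the rest of the system cost entirely.

```python
# Scores and rounded CPU-only costs as quoted in the article.
# Names below are illustrative, not part of any benchmark tooling.
SYSTEMS = {
    "2x AMD EPYC 7742": {"score": 11080, "cpu_cost_usd": 14000, "sockets": 2},
    "4x Intel Xeon Platinum 8180": {"score": 11584, "cpu_cost_usd": 40000, "sockets": 4},
}


def points_per_thousand_usd(score, cpu_cost_usd):
    """Cinebench R15 points for every $1,000 spent on CPUs."""
    return score / (cpu_cost_usd / 1000)


def points_per_socket(score, sockets):
    """Cinebench R15 points contributed by each socket, on average."""
    return score / sockets


if __name__ == "__main__":
    for name, s in SYSTEMS.items():
        per_k = points_per_thousand_usd(s["score"], s["cpu_cost_usd"])
        per_socket = points_per_socket(s["score"], s["sockets"])
        print(f"{name}: {per_k:.1f} pts per $1k of CPU, {per_socket:.0f} pts per socket")
```

On these figures the dual EPYC 7742 system lands within about 5% of the quad Xeon score while delivering well over twice the points per CPU dollar.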
Dual AMD EPYC 7742 Cinebench R20 31833
You can see the current (as of this writing) world record on HWbot.org. Undoubtedly, this will be broken soon. We did not set out to set a world record. We actually did the runs to test a Windows Server 2019 patch for having two 128-thread CPUs, and just wanted to compare results. We noticed it was a world record and therefore submitted it.
The Cinebench R20 31,833 run can be seen here:
During the process, we tested a limit of Cinebench R20. When Cinebench R20 was released a few months ago, it was billed as a more future-proof benchmark with support for up to 256 threads. Two quarters later, you can buy a dual-socket system for under $20,000 that can hit that limit.
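The arithmetic behind hitting that limit is simple enough to sketch in a few lines of Python. The 256-thread figure and the 128 threads per EPYC 7742 come from this article; the helper names are hypothetical and only there to illustrate the math.

```python
# 256 is the Cinebench R20 thread limit discussed in the article.
R20_THREAD_LIMIT = 256


def logical_threads(sockets, threads_per_cpu):
    """Total logical threads the OS exposes for a multi-socket system."""
    return sockets * threads_per_cpu


def thread_headroom(sockets, threads_per_cpu, limit=R20_THREAD_LIMIT):
    """Threads left under the benchmark limit; zero or less means the limit is hit."""
    return limit - logical_threads(sockets, threads_per_cpu)


if __name__ == "__main__":
    # Dual AMD EPYC 7742: two CPUs with 128 threads each.
    total = logical_threads(sockets=2, threads_per_cpu=128)
    print(f"Dual EPYC 7742: {total} threads, headroom {thread_headroom(2, 128)}")
```

With zero headroom on a two-socket box, any system with more threads than this would be past what R20 was designed to use.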
We saw a number of runs that had the exact same scores. That seemed strange as one would expect more variability due to clock speed changes. We also saw that the Cinebench R20 renders are taking well under 8 seconds. That is important because the runs end well before modern systems would start thermal throttling on a single run.
This record is going to get broken, and likely soon. As the Windows Server 2019 scheduler gets patched for AMD’s CPUs and support matures, that will help a lot. As we covered with our in-depth AMD EPYC 7002 Series Rome Delivers a Knockout article, AMD has leapfrogged Intel in core counts and the ecosystem needs to adapt. We did not do any optimizations on our system to set these records, nor did we spend a ton of time on it. We just ran a handful of runs to validate Windows Server 2019 works with the higher core count AMD EPYC 7742 parts.
What is clear here is that Cinebench R20, if it wants to cope with CPUs of 2020, is nowhere near scalable enough. We are going to need to see a 4x problem size improvement at minimum to stay relevant for high-end CPUs through early 2021. We are already seeing the limits of 256-thread support only a few months after R20’s launch.