My Challenge to Intel
At the event I gave our PR contact some simple feedback: we need a better story. Here is why. VNNI, as a "new" instruction set for mainstream parts, is going to take some time to gain broad software adoption. Furthermore, most expect the majority of inferencing to happen at the edge, so putting VNNI in AWS, GCP, Azure, Ali Cloud, and others is not necessarily going to help inferencing adoption across all of the applications that NVIDIA is already targeting with CUDA. Remember, NVIDIA now has Tensor Cores in its Xavier platform for robots and self-driving cars, Voltas for the data center, and RTX cards for consumer desktops. That is a fairly wide range of application scenarios compared to a Cascade Lake Xeon by Q4 2018.
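For context on what VNNI actually buys the software that adopts it: its headline instruction, VPDPBUSD, fuses an INT8 dot-product-and-accumulate step that previously took three instructions into one. Here is a minimal sketch, in plain Python rather than intrinsics, of the per-lane semantics (each 32-bit accumulator lane absorbs four unsigned-8-bit by signed-8-bit products); this is an illustration of the operation, not Intel's implementation.

```python
def vpdpbusd_lane(acc, a_bytes, b_bytes):
    """One 32-bit lane of VPDPBUSD: acc += sum of four u8 x s8 products.

    a_bytes: four unsigned 8-bit values (e.g. quantized activations)
    b_bytes: four signed 8-bit values (e.g. quantized weights)
    """
    assert len(a_bytes) == len(b_bytes) == 4
    return acc + sum(u * s for u, s in zip(a_bytes, b_bytes))

# One lane of an INT8 inferencing dot product:
acc = vpdpbusd_lane(0, [1, 2, 3, 4], [10, -1, 2, 5])
# 1*10 + 2*(-1) + 3*2 + 4*5 = 34
```

The point for the adoption story: frameworks have to be rebuilt around INT8 quantization to use this path, which is why the software side takes time even after the silicon ships.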
Just about everyone in the industry thinks that Intel Optane Persistent Memory (OPM) is going to be big. If I can get 128GB OPM DIMMs for $400-500, STH will be running on OPMoF by next year without question. Intel has not released pricing, but we also do not expect every Cascade Lake-SP server to ship with OPM. As a result, OPM is only a solution for a portion of the market that will be early adopters. For those with large in-memory database and analytics workloads, we expect this to be a game changer. For others, it may take some time for software to catch up with hardware. This is normal.
For those who are doing inferencing on GPUs or other accelerators, or who will not adopt OPM widely in this generation of deployments, the real performance benefits are likely to be a small clock speed improvement on SKUs, potentially some SKU levels moving to higher core counts below the 28-core maximum, and performance more in line with the pre-side-channel-attack world.
Security is undoubtedly important. There is also a chance that Cascade Lake-SP has fixes for as-yet-undisclosed vulnerabilities as well. If we are going to put a stake in the ground and say "this is where Moore's law died", Cascade Lake-SP will be it. Here are two charts. The first is the real generational comparison between the maximum core counts of each generation. We are leaving Sandy Bridge out as there was no real "Beckton" successor until Ivy Bridge.
The second is the real generational comparison between maximum core counts of mainstream CPUs, assuming the 28-core Xeon Platinum parts are "mainstream."
As you can see, that black bar tracking the rate of change has hit zero for Cascade Lake-SP. We have been accustomed to a nominal performance increase plus a core count increase in each generation; now we are getting the nominal performance increase (save for the scenarios with hardware fixes for side channel attacks) and no core count increase.
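The "black bar hitting zero" is easy to verify from public launch specs. A quick sketch using the top-bin core count of each generation discussed here (numbers from Intel's published SKU lists; treat them as illustrative of the charts above rather than a reproduction of them):

```python
# Maximum core counts per top-end Xeon generation (public launch specs).
cores = {
    "Westmere-EX": 10,
    "Ivy Bridge-EX": 15,
    "Haswell-EX": 18,
    "Broadwell-EX": 24,
    "Skylake-SP": 28,
    "Cascade Lake-SP": 28,
}

# Generation-over-generation delta: the "black bar" in the charts above.
gens = list(cores)
deltas = {curr: cores[curr] - cores[prev] for prev, curr in zip(gens, gens[1:])}

for gen, delta in deltas.items():
    print(f"{gen}: {delta:+d} cores")
# The Cascade Lake-SP entry comes out to +0.
```

Every prior transition in that list added cores; Skylake-SP to Cascade Lake-SP is the first flat bar.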
Either way, the narrative we need from Intel is more than just bug fixes plus bleeding-edge features. Most Intel Xeon shops will continue to buy Intel Xeon, getting VNNI in the process. We need a narrative for the 10-30% of customers who will contemplate AMD EPYC and its much improved, bigger, second-generation 2019 "Rome" offering. To those folks, Intel is at risk of having no good answer until Cooper Lake.
To Intel, VNNI, OPM, and security fixes are engineering feats. Intel's channel partner community will need a better story in 2019 to extol their benefits.
In the context of Hot Chips, Akhilesh Kumar did a valiant job of getting out the new Cascade Lake-SP disclosures. For the marketing teams at Intel, in the coming months, the less server-focused press will come up with the charts shown above and proclaim the death of Moore's law and of Intel's ability to drive more than nominal improvement every 18 months. The reality is somewhat different, as hardware side channel fixes that mitigate the performance impacts of software patches may have a huge influence on performance, as we saw in Intel Publishes L1TF and Foreshadow Performance Impacts and Intel Circles Back on Meltdown and Spectre Initial Fixes Pushed.