Something that many of our readers may not recall is that the upcoming Intel Xeon Sapphire Rapids CPUs will have a feature we have not seen from Intel since the 2020-2021 Cooper Lake 3rd Generation Scalable CPUs: new 4 and 8-socket offerings. Intel confirmed this some time ago, and it re-confirmed it in the recent Intel Innovation 2022 materials.
Intel Xeon Sapphire Rapids to Scale to 4 and 8 Sockets
We have an early look at the platform overview for the new 4th Generation Intel Xeon Processors codenamed “Sapphire Rapids.” One should not confuse 4th Gen Xeon with Xeon E5 V4 or Xeon E7 V4, because it looks like “Scalable” is being dropped from the next-generation branding. One can see in the second column at the very top that there is “1 to 8 socket support”.
This is current as of the end of September 2022 for the Sapphire Rapids platform. We see features like accelerators, which we will discuss more soon and which are also why we have been working on pieces like Intel QuickAssist in Ice Lake Servers What You Need to Know. We also see things like CXL support, 8-channel DDR5, UPI, and PCIe Gen5 support.
For STH readers, the support for 4 and 8-socket Sapphire Rapids servers will make a lot of sense. In our 2020 piece: The 2021 Intel Ice Pickle, How 2021 Will be Crunch Time we showed Intel’s planned 2P and 4P+ platforms and when they launched or were expected to launch.
We will note that the Sapphire Rapids timeframe has shifted as it was not launched in late 2021.
There is a class of systems with 4-8 sockets that has been stuck on older CPUs for some time. Intel had 4-8 socket support with Skylake and Cascade Lake (1st and 2nd-gen Xeon Scalable).
With the 3rd Generation Intel Xeon Scalable, there were both Cooper Lake and Ice Lake. Most of Intel’s 3rd Generation comparisons will be to Ice Lake, but Cooper Lake is technically branded as 3rd Gen as well. In the 4 and 8 socket space, Intel can show massive performance gains with >2x the cores per socket plus two generations of IPC improvements, a move from six-channel DDR4 to eight-channel DDR5 per socket, new instructions, and new accelerators.
It is unlikely that Intel is going to be competing on a cores-per-socket basis with AMD in this generation. At the same time, we actually expect Intel to have the highest core count servers of the next generation when we look to 4-8 sockets. AMD has publicly stated 96 cores per CPU for Genoa and 128 for Bergamo. Intel has stated it plans 60-core SKUs, for 4x 60 = 240 cores or 8x 60 = 480 cores per server.
With that said, we see trends toward single or dual-socket servers instead of eight-socket servers. AMD may have the most cores in single and dual-socket servers. Intel technically may end up with the highest core count servers of this generation and the most aggregate memory bandwidth and PCIe connectivity just by using more CPUs. It also has a segment of customers that is about to see an enormous gen/gen improvement over Cooper Lake.
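As a quick check on that arithmetic, here is a minimal Python sketch that multiplies the cores per socket quoted above by a maximum socket count. The socket limits are an assumption made for illustration (8 for Sapphire Rapids per the platform overview, 2 for the AMD parts), and the function names are hypothetical.

```python
# Cores per socket as quoted in the article.
CORES_PER_SOCKET = {
    "Sapphire Rapids": 60,
    "Genoa": 96,
    "Bergamo": 128,
}

# Assumption for illustration: maximum sockets per server for each part.
MAX_SOCKETS = {
    "Sapphire Rapids": 8,
    "Genoa": 2,
    "Bergamo": 2,
}


def total_cores(cores_per_socket: int, sockets: int) -> int:
    """Return the aggregate core count of one server."""
    return cores_per_socket * sockets


def max_cores_per_server() -> dict:
    """Return the largest core count per server for each CPU family."""
    return {
        name: total_cores(CORES_PER_SOCKET[name], MAX_SOCKETS[name])
        for name in CORES_PER_SOCKET
    }


if __name__ == "__main__":
    for name, cores in max_cores_per_server().items():
        print(f"{name}: up to {cores} cores per server")
```

Under those assumptions, the 8-socket Intel configuration ends up ahead on total cores per server even though each individual socket has fewer cores.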
Having a higher socket count is one way of sidestepping the per-socket limitations. From a pure hardware perspective, this is actually a smart move as it also increases memory capacity and IO per node. The problem is price in both hardware and software. Intel traditionally has charged a premium for quad+ socket capable chips, and the boards/systems they reside in are also a tier above dual socket pricing. Some of that makes sense due to the complexities of additional components, but a fair chunk of it is pure product segmentation profit. The software side is the real barrier, as legacy licensing models have incurred a socket multiplier. Want to use 4 sockets instead of 2? Double the licensing fee, even if core counts remained the same. The more modern way of licensing is per core with a lower socket multiplier, which does make larger socket systems more attractive due to their other benefits.
There are also a few other concerns with larger socket systems: while socket count increases, density may not, due to physical space and cooling requirements. We’re heading into an era of 400W processors. It may simply be wiser to leverage two 200W chips instead of a single 400W unit if the current air cooled infrastructure can be maintained. While it is easier to cool two 200W chips vs. a single 400W unit, the base idle power is also higher. At the same core counts/clock speeds, a dual socket system will outperform a quad socket because more of the coherence traffic stays on-chip. Core-to-core communication is simply faster on-package.
I don’t see it that way. Yes, more sockets is better, but only in a purely abstract mathematical world, where there are no losses or friction and where the speed of light is infinite.
More sockets means more connecting links, which have to be fast and which burn a good amount of energy.
And above all, the number of links grows at a rate near the second order of the node count.
And above that, keeping coherence across that area gets massively more expensive in all terms: money, energy, silicon area AND time.
Coherence is great as long as it costs nothing. When it starts costing something, it has to be traded off.
What exactly do you need it for and how much do you need ?
Core counts are massive as they are and keeping coherence can be a challenge even within one socket.
When one’s needs grow above that, one would be wise to ask oneself whether the problem could be covered through InfiniBand or, better yet, CXL.
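To put a number on the "second order" growth of links, here is a minimal Python sketch. It assumes a fully connected topology, where every socket has a direct link to every other socket, which gives n(n-1)/2 links. That is an upper bound chosen for illustration: real platforms have a limited number of UPI links per CPU and may not wire every socket pair directly. The function name is hypothetical.

```python
def full_mesh_links(sockets: int) -> int:
    """Number of point-to-point links if every socket connects to every other."""
    if sockets < 1:
        raise ValueError("sockets must be at least 1")
    # Each of the n sockets pairs with n - 1 others; every link is counted twice.
    return sockets * (sockets - 1) // 2


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(f"{n} sockets -> {full_mesh_links(n)} links")
```

Going from 2 to 8 sockets quadruples the socket count but, under this assumption, takes the link count from 1 to 28.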
@Gertrude > “What exactly do you need it for and how much do you need ?”
Using an online configurator, a minimum system for SAP HANA (8 sockets x 28 cores, 24 TB) looks like:
“Your current configuration
MP SuperServer SYS-7089P-TR4T
8 x Intel Xeon Platinum 8280L Processor
96 x 128GB DDR4-3200 ECC REG
1 x 240 GB Intel SSD D3-S4520
1 x OOB Management Package
1 x No virtualization
1 x No operating system
1 x 3 year Extended Warranty
5 x Power cord 0.9 m
Current configuration Price: 259529.57 €”. (plus another 100K for base software, without the additional millions for HANA)
The big ticket items are:
8 x Intel® Xeon® Platinum 8280L Processor 28-cores / 56-threads 2.70GHz 38.50MB Cache (205W) 16888.72 EUR (x8=135,109.76)
96 x 128GB DDR4-3200 4Rx4 (16Gb) ECC Registered DIMM 957.92 EUR (x96=91,960.32)
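For anyone who wants to check the line totals, here is a small Python sketch using only the prices quoted above. Amounts are held as integer euro cents to avoid floating-point rounding; the variable and function names are hypothetical.

```python
# Total configuration price quoted above: 259529.57 EUR, in cents.
CONFIG_TOTAL_CENTS = 25_952_957

# (description, quantity, unit price in cents), taken from the two lines above.
BIG_TICKET_ITEMS = [
    ("Xeon Platinum 8280L", 8, 1_688_872),
    ("128GB DDR4-3200 ECC Registered DIMM", 96, 95_792),
]


def line_total_cents(quantity: int, unit_cents: int) -> int:
    """Return quantity times unit price, in cents."""
    return quantity * unit_cents


def big_ticket_total_cents() -> int:
    """Return the combined cost of the CPUs and the memory, in cents."""
    return sum(line_total_cents(qty, unit) for _, qty, unit in BIG_TICKET_ITEMS)


def share_of_total() -> float:
    """Return the fraction of the configuration price spent on CPUs and memory."""
    return big_ticket_total_cents() / CONFIG_TOTAL_CENTS


if __name__ == "__main__":
    for name, qty, unit in BIG_TICKET_ITEMS:
        print(f"{qty} x {name}: {line_total_cents(qty, unit) / 100:.2f} EUR")
    print(f"CPUs + memory: {big_ticket_total_cents() / 100:.2f} EUR")
    print(f"Share of configuration price: {share_of_total():.1%}")
```

The CPUs and memory together come to 227,070.08 EUR, which is most of the quoted configuration price.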
Gertrude, the reason people pay that is because that’s how much the solution costs. If you have a better and cheaper solution, then you have a great business plan and are probably too busy to reply.