With the launch of the Intel Xeon Scalable Processor Family, we wanted to take a moment to discuss the platform at a high level. We have a ton of coverage on the new platform, which you can access via our Skylake-SP Coverage Central. At the same time, we wanted to look at the overall platform implications in a higher-level piece, since many of our other articles are in-depth analyses.
Intel Xeon Scalable Processor Family Platform Overview
Intel provided this overview slide which we found to be extremely useful for describing the platform. Intel has removed the Xeon E5 versus Xeon E7 differentiation and now supports scaling to 4P and beyond configurations with relative ease. There is a new socket, new chipset, new NVMe storage support, optionally integrated fabric, more cores, more memory channels, and new security features. Suffice it to say, the platform is meant to be all-around bigger than the Intel Xeon E5 family ever was.
The Intel Xeon Scalable Processor Family improves upon the previous-generation Intel Xeon E5 and E7 V4 parts in just about every metric. Here is Intel’s slide on the talking points:
With the platform scaling to 4 and 8 sockets (with high-end SKUs) and adding more I/O and memory bandwidth, it was time to replace QPI. QPI was the main intra-system link that Intel used to move data between CPU sockets in the Intel Xeon E5 era. Replacing QPI is the Intel Ultra Path Interconnect (Intel UPI):
The key here is that Intel is using new encoding and speeds to reach higher inter-socket bandwidth than it had before.
The new Intel Xeon Scalable CPUs support 2 or 3 UPI links per CPU. Intel can support a lower performance, lower cost 4S design (Intel Xeon Gold 5100 series) with 2 UPI links, or higher performance 4S and up to 8S designs with CPUs that feature 3 UPI links. We have more details on how these links interface with the CPUs in our Intel Mesh Interconnect Architecture piece.
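For a rough sense of what the extra link buys, here is a back-of-the-envelope sketch. The 10.4 GT/s link speed and 2 bytes of payload per transfer per direction are our assumptions for top-bin parts (actual throughput varies by SKU and protocol overhead), so treat the numbers as illustrative only:

```python
# Back-of-the-envelope aggregate UPI bandwidth per CPU.
# Assumptions (not from the article): 10.4 GT/s links, 16-bit data
# payload per transfer per direction. Real-world figures differ.
UPI_GT_PER_S = 10.4      # billions of transfers per second (assumed top speed)
BYTES_PER_TRANSFER = 2   # 16-bit payload per direction (assumed)

def upi_bandwidth_gbps(links):
    """Aggregate single-direction UPI bandwidth in GB/s for a link count."""
    return links * UPI_GT_PER_S * BYTES_PER_TRANSFER

print(f"2 UPI links: {upi_bandwidth_gbps(2):.1f} GB/s")  # ~41.6 GB/s
print(f"3 UPI links: {upi_bandwidth_gbps(3):.1f} GB/s")  # ~62.4 GB/s
```

The third link matters most in 4S and 8S topologies, where it shortens hop counts between sockets in addition to adding raw bandwidth.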
Intel Skylake-SP Six Channel RAM
Supporting the new CPUs are six memory channels with up to 2 DIMMs per channel. That means each CPU can have up to 12 DDR4-2666 DIMMs attached to it. Here is an example of an ASRock Rack EP2C622D12NM-4L motherboard using 6 DIMMs per CPU (1 DIMM per channel operation):
Many customers are accustomed to having 8 DIMMs per CPU, so we are seeing some designs with two additional DIMM slots per CPU to address that need, such as this ASRock Rack EP2C622D16FM motherboard.
To show 12 DIMM (2 DIMM per channel) operation, here is a look inside a Tyan GA88-B5631 1U GPU compute system. You can see the single socket flanked by 12 DIMMs.
The net impact here is that memory bandwidth is increasing significantly. There are two additional channels over the Xeon E5 V4 generation's four, an immediate 50% boost. Furthermore, the new systems support DDR4-2666 for a ~11% speed boost per channel over DDR4-2400. Combined, here is what the memory bandwidth picture looks like:
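The combined uplift can be sanity-checked with simple arithmetic. This sketch assumes the standard 64-bit (8-byte) DDR4 channel width and compares theoretical peak figures only; real workloads will see less:

```python
# Peak theoretical memory bandwidth per socket: Xeon E5 V4 vs. Skylake-SP.
# Assumes the standard 8-byte DDR4 channel width; peak figures only.
BYTES_PER_TRANSFER = 8  # 64-bit DDR4 channel

def peak_bandwidth_gbps(channels, mt_per_s):
    """Peak bandwidth in GB/s for a channel count and DDR4 transfer rate."""
    return channels * mt_per_s * BYTES_PER_TRANSFER / 1000

e5_v4 = peak_bandwidth_gbps(4, 2400)       # 4 channels of DDR4-2400
skylake_sp = peak_bandwidth_gbps(6, 2666)  # 6 channels of DDR4-2666

print(f"Xeon E5 V4: {e5_v4:.1f} GB/s")       # 76.8 GB/s
print(f"Skylake-SP: {skylake_sp:.1f} GB/s")  # 128.0 GB/s
print(f"Uplift:     {skylake_sp / e5_v4 - 1:.0%}")  # ~67%
```

The 50% channel-count boost and the ~11% per-channel speed boost multiply out to roughly a two-thirds increase in peak bandwidth per socket.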
The New LGA3647 Socket P
Socket P is an LGA3647 socket. Just to get a sense of scale, here is what an Intel Xeon E5-2600 V4 CPU looks like above an LGA3647 package. You will notice the new CPUs are much larger.
LGA3647 may sound familiar to STH readers. It is the same pin-count socket the Intel Xeon Phi x200 series (Knights Landing) used.
One fun trivia fact: this image that we published in November 2016 as part of our Big Sockets: The monstrous Intel LGA 3647 socket and package piece actually featured an early Skylake-SP chip, not a Xeon Phi x200 chip. Many hundreds of thousands of people actually saw Skylake-SP much earlier than they thought (and not a single person noticed). This does provide a great comparison of the chip sizes:
Despite using the same pin count, you cannot necessarily share components with the Xeon Phi parts. Notches on the packages and sockets are different. Furthermore, the stock cooling mounting mechanisms are different between the two platforms. We learned this lesson trying to use an Intel Xeon Phi heatsink with a Skylake-SP CPU and motherboard.
Gone are the traditional latching mechanisms. Instead, one uses a clip to attach the CPU to the heatsink. Then one places the heatsink and CPU assembly on two guide pins, lowering it into the socket. Finally, one screws the heatsink down into the socket using four Torx (star) screws. If this sounds complicated, it is.
Installing the heatsink is a similar procedure to the Xeon Phi x200 setup we showed here.
Although we only have around 60 LGA3647 installations under our belts, compared to hundreds (or more) of LGA2011(-3) installations, we find the new socket significantly scarier to work with.
New Platform Level RAS Features
Since the new Intel Xeon Scalable Processor family is replacing the Intel Xeon E7 line, we are seeing a convergence in terms of RAS features. Where the Xeon E5 line was a tier below the E7, now we have the Intel Xeon Platinum and Gold tiers getting the enhanced Intel Run Sure technologies.
It makes sense that many of the enhanced Run Sure features are not present on the lower-end Silver and Bronze CPUs.
Features like memory mirroring will not be used in low-end "light the platform" deployments. The Intel Xeon Bronze CPUs are closer to the Intel Xeon E5-2609 V4 chips in terms of segmentation, so it is unlikely folks will pair that class of low-end chip with the large memory overhead mirroring requires.
Head over to the Intel Xeon Scalable Processor Family Platform Launch Coverage Central to find more in-depth coverage. The new Scalable Processor family is going to be a major departure from what we have been accustomed to over the past few years. Furthermore, Intel now has competition in the x86 space. AMD EPYC has 8 memory channels and more I/O lanes. There are some platform nuances here as the design approaches differ, but Intel does have real competition in the 1-socket and 2-socket markets.