Intel Xeon Max 9480 Power Consumption and Cooling
We managed to push our dual Intel Xeon Max 9480 developer system to almost 1kW of power consumption at the wall under a loaded test configuration. At the same time, there is room to go up or down in power consumption from there.
By far one of the biggest opportunities is HBM2e-only mode. Removing DDR5 from a system cuts anywhere from a few hundred dollars (16GB DIMMs) to a few thousand dollars of cost, but it also reduces power consumption. On a dual-socket server in HBM2e-only mode, we saw at least 40W lower power consumption per socket, and often 80-100W lower. Some vendors budget 10W per DIMM, which works out to 160W of savings when all 16 DIMMs are removed.
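As a rough sketch of that arithmetic (the 10W-per-DIMM budget and 16-DIMM count are illustrative assumptions, alongside the 40-100W per-socket range we measured):

```python
# Back-of-the-envelope power savings for HBM2e-only mode.
# Figures are illustrative assumptions, not Intel specifications.

WATTS_PER_DIMM = 10   # common vendor budgeting figure
DIMMS_REMOVED = 16    # e.g. 8 DIMMs per socket x 2 sockets
SOCKETS = 2

dimm_savings = WATTS_PER_DIMM * DIMMS_REMOVED

# Per-socket savings we observed at the wall in HBM2e-only mode
observed_low, observed_high = 40, 100

print(f"Vendor DIMM budget:   {dimm_savings} W")
print(f"Observed system-wide: {observed_low * SOCKETS}-{observed_high * SOCKETS} W")
```

The two estimates land in the same ballpark, which is why the vendor rule of thumb is a reasonable planning figure.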
There is, however, one use case where this can also shine. In liquid-cooled systems, one can use a cold plate to remove heat from the CPUs. Often, these same systems have DIMM cold plates to cool DDR5 memory. Removing the DIMMs from a system also removes the need to cool them. When running in HBM2e-only mode, the heatsink (or, with liquid cooling, the cold plate) is cooling both the CPU and the memory.
That may not seem like a big deal, but folks who do liquid cooling in servers do not often speak fondly of liquid cooling DIMMs. HBM2e-only mode means one has a significant reduction in overall memory capacity, but also a much easier path to liquid cooling.
Getting Crazy with Intel Xeon MAX
At the launch of the 4th Gen Intel Xeon Scalable "Sapphire Rapids" processors, the chips were still early in their manufacturing ramp, so Intel likely did not push them as hard as it could have at the time. Intel's initial batch of Xeon MAX parts was destined for the Aurora supercomputer, which we think is likely to take the #1 spot on the November 2023 Top500 list. The result is that many Xeon server buyers do not know these chips exist, or they think Xeon MAX is an HPC-only part. That is false.
Intel Xeon MAX is still a Xeon, and almost anything runs on Xeon CPUs. We wanted to show a crazy case that we doubt Intel has tested, so we installed Proxmox VE, a popular open-source virtualization, container, Ceph, and clustering solution built upon Debian Linux. The normal installer routine worked immediately, and the system ran without issue with no DDR5 installed, in HBM2e-only mode.
Above, you can see that not only is the Debian base OS running, but we also have an Ubuntu virtual machine running. Again, Xeon MAX is a drop-in replacement for Xeon in many servers.
We then added DDR5 memory back in.
Here we can see our memory total is up to 256GB because the system is running in cache mode. We did not have to change any BIOS settings. We installed memory, turned the system on, and it was working.
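You can confirm the visible total from Linux without any vendor tooling by reading `/proc/meminfo`. A minimal sketch of the parsing, shown here against a sample string (the sample value is illustrative) so it runs anywhere:

```python
def mem_total_gb(meminfo_text: str) -> float:
    """Parse the MemTotal line from /proc/meminfo (value is in kB)."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kb = int(line.split()[1])
            return kb / (1024 ** 2)
    raise ValueError("MemTotal not found")

# On the server itself you would read the real file:
#   with open("/proc/meminfo") as f:
#       print(mem_total_gb(f.read()))
sample = "MemTotal:       264011772 kB\nMemFree:        261234560 kB"
print(f"{mem_total_gb(sample):.0f} GB")  # prints 252 GB
```

The OS-visible total is typically a bit under the installed capacity because firmware reserves some memory.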
Having seen a lot of Intel’s marketing on the Xeon MAX, this simple fact feels like it has been absent. Assuming your server supports Intel Xeon MAX and can handle the higher TDP, one can drop it into the same server and start experiencing HBM-accelerated Xeon compute without any changes. That is the power of caching mode, and even HBM2e-only mode.
Of course, caching mode is more relevant here, but the point is that both caching and HBM2e-only modes worked out of the box as a direct replacement for standard high-end Xeons.
Summing this up, the “winged” Intel Xeon MAX processors come with 64GB of HBM2e memory per socket, packaged as one 16GB HBM2e stack on each of the four compute tiles.
Despite the “wings”, the processors are drop-in options for many 4th Gen Intel Xeon Scalable sockets. One can run the Xeon MAX either in HBM2e-only mode, where no DDR5 is installed alongside the CPU, or with DDR5 to increase overall memory capacity.
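The capacity math across the modes can be sketched with a simple model (the function name and structure are ours; the per-socket 64GB HBM2e figure and the mode behaviors are as described above):

```python
def visible_capacity_gb(mode: str, sockets: int = 2, ddr5_gb: int = 0) -> int:
    """Illustrative model of OS-visible memory on a Xeon MAX system.

    hbm-only: only the 64GB of HBM2e per socket is visible.
    cache:    HBM2e acts as a transparent cache, so only DDR5 is visible.
    flat:     HBM2e and DDR5 both appear as addressable memory.
    """
    HBM_PER_SOCKET_GB = 64  # 4 compute tiles x 16GB HBM2e stacks
    hbm = HBM_PER_SOCKET_GB * sockets
    if mode == "hbm-only":
        return hbm
    if mode == "cache":
        return ddr5_gb
    if mode == "flat":
        return hbm + ddr5_gb
    raise ValueError(f"unknown mode: {mode}")

# Our dual-socket test system:
print(visible_capacity_gb("hbm-only"))            # 128
print(visible_capacity_gb("cache", ddr5_gb=256))  # 256, matching what we saw
```

This is why cache mode required no BIOS changes in our testing: the OS simply sees the installed DDR5, with the HBM2e working behind the scenes.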
For workloads that depend on memory performance, adding HBM2e memory to a socket can increase the performance of the system by a significant amount, whether in traditional HPC workloads, AI workloads, or even in applications not typically discussed alongside these chips. It all comes down to how effectively the HBM2e memory can be used.
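A quick way to gauge whether a workload is memory-bandwidth-bound is a STREAM-style triad microbenchmark. Here is a minimal NumPy sketch (array size and scalar are arbitrary choices; a compiled, multi-threaded STREAM will report much higher numbers than single-threaded NumPy):

```python
import time
import numpy as np

# STREAM-style triad: a[i] = b[i] + scalar * c[i]
# Arrays are sized well beyond CPU cache so traffic hits memory.
N = 10_000_000  # ~80MB per float64 array
b = np.random.rand(N)
c = np.random.rand(N)
scalar = 3.0

start = time.perf_counter()
a = b + scalar * c
elapsed = time.perf_counter() - start

# The triad touches three arrays: read b, read c, write a.
bytes_moved = 3 * N * 8
print(f"Triad bandwidth: {bytes_moved / elapsed / 1e9:.1f} GB/s")
```

If a kernel like this dominates your profile, the HBM2e on Xeon MAX is exactly the resource it is starved for.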
Given that these CPUs are options for many servers, and that adopting them can be transparent thanks to default features like caching mode, they are worth looking at if you are buying new servers. If you think you might benefit from HBM2e, our best advice is to try a Xeon MAX and see how well it works for your application, even if you plan on doing little to no traditional HPC work.