How Liquid Cooling Servers Works with Gigabyte and CoolIT


Performance

In terms of performance, we saw a small but notable performance bump over our reference air-cooled systems. Liquid cooling solutions do not suffer from the same heat soak challenges many air-cooled servers have.

Gigabyte H262 ZL0 With CoolIT AMD EPYC 7713 Compared To Air Baseline

Somewhat by design, and somewhat because STH does not have eight AMD EPYC 7763s (and I picked up the AMD EPYC 7773Xs the day after I got back from Calgary), we used different 64-core EPYC CPUs in the nodes. The overall pattern was the same: the liquid-cooled processors were able to maintain turbo clocks longer and therefore had slightly better performance, especially when running AVX2 workloads.

Gigabyte H262 ZL0 With CoolIT AMD EPYC 7763 Compared To Air Baseline

Performance is probably not the main reason to go liquid cooling. Instead, it is really the economics and how it changes the data center.

The Bigger Impacts

In the video, you are likely going to see two much larger impacts than just the raw performance. Near the end, we show this view. While it may seem like I am standing in a restroom (I am), that is not the point of the segment.

Garden Hose Cooling 20000 AMD EPYC Cores From The Restroom With Patrick And Flow

This 3/4″ pipe with the garden hose is going out to the test setup’s CoolIT CHx80 CDU. The return with warmed water from the CDU is exiting through the hose nozzle I have in my hand and that is going to the sink. Normally this would go out to the chiller outside the building, but we had to run it like this for a few minutes, and I just wanted to see what a garden hose worth of water could cool. I think many of our readers understand roughly the amount of flow rate through a typical garden hose.

The CDU ran the calculations when we fired up the Gigabyte H262-ZL0, and we found that the flow rate from this garden hose was enough to cool all 80kW of capacity the CHx80 can handle. We are cooling ~2kW of CPUs per server, so the CHx80 cooling 80kW is roughly 40x Gigabyte H262-ZL0s. Since we had 512 cores in that 2kW cooling envelope, that means the garden hose I have attached to the bathroom spigot is able to cool over 20,000 AMD EPYC 7003 “Milan” cores. This is even without recycling the water with a chilling process as one would normally do.
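As a rough sketch of the arithmetic above, the snippet below scales the ~2kW per-server CPU load up to the CHx80's 80kW capacity and counts cores. It then applies the standard steady-state heat balance for water (power = mass flow x specific heat x temperature rise) to estimate how much the facility water would warm up. The hose flow rate is NOT given in the article; the 30 L/min value is purely a hypothetical placeholder, as are all of the variable names.

```python
# Figures quoted in the article.
CDU_CAPACITY_KW = 80.0          # CoolIT CHx80 capacity
CPU_HEAT_PER_SERVER_KW = 2.0    # ~2kW of CPUs per Gigabyte H262-ZL0
CORES_PER_SERVER = 4 * 2 * 64   # 4 nodes x 2 sockets x 64 cores = 512

servers = int(CDU_CAPACITY_KW // CPU_HEAT_PER_SERVER_KW)
total_cores = servers * CORES_PER_SERVER

# ASSUMPTION: the article does not state the hose flow rate.
# 30 L/min is a hypothetical value chosen only for illustration.
ASSUMED_FLOW_L_PER_MIN = 30.0
WATER_DENSITY_KG_PER_L = 1.0             # approximate, near room temperature
WATER_SPECIFIC_HEAT_J_PER_KG_K = 4186.0  # approximate

mass_flow_kg_per_s = ASSUMED_FLOW_L_PER_MIN * WATER_DENSITY_KG_PER_L / 60.0

# Steady-state heat balance: P = m_dot * c_p * delta_T, solved for delta_T.
delta_t_k = (CDU_CAPACITY_KW * 1000.0) / (
    mass_flow_kg_per_s * WATER_SPECIFIC_HEAT_J_PER_KG_K
)

print(f"Servers per CDU: {servers}")
print(f"Cores cooled:    {total_cores}")
print(f"Water temperature rise at the assumed flow: {delta_t_k:.1f} K")
```

With a different assumed flow rate the temperature rise changes inversely, so doubling the flow halves the rise. The server and core counts depend only on the figures in the article.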

Let that sink in for a moment (many puns today). 20,000 current-generation AMD EPYC cores cooled via a garden hose. In a 2U 4-node server, we often see fans take 20% of system power, so this is replacing roughly 16kW of fan power across 40x 2U 4-node servers. Even in Texas, where power is inexpensive, that is an enormous saving. In jurisdictions with more expensive electricity, that pays for itself very quickly, especially when the water goes through a cooling process, also saving the environment.
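One way to arrive at the roughly 16kW fan figure is to apply the 20% fan share to the 80kW load across the 40 servers. That is a reading of the numbers above, not something the article spells out. The sketch below does that and then turns it into a yearly energy total. The electricity price is a hypothetical placeholder, not a number from the article.

```python
# Figures quoted in the article.
COOLED_LOAD_KW = 80.0   # 40x servers at ~2kW each
FAN_SHARE = 0.20        # fans often take ~20% of system power

# ASSUMPTION: hypothetical electricity price, for illustration only.
ASSUMED_PRICE_USD_PER_KWH = 0.10

HOURS_PER_YEAR = 24 * 365

fan_power_kw = COOLED_LOAD_KW * FAN_SHARE
annual_fan_energy_kwh = fan_power_kw * HOURS_PER_YEAR
annual_fan_cost_usd = annual_fan_energy_kwh * ASSUMED_PRICE_USD_PER_KWH

print(f"Fan power replaced: {fan_power_kw:.0f} kW")
print(f"Energy per year:    {annual_fan_energy_kwh:,.0f} kWh")
print(f"Cost per year at the assumed price: ${annual_fan_cost_usd:,.0f}")
```

This ignores whatever power the pumps and facility chiller draw, so it is an upper bound on the saving rather than a net figure.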

Final Words

Hopefully, in our Gigabyte H262-ZL0 review we showed how and why liquid cooling is a big deal for a system like this. It is frankly hard to cool 280W TDP CPUs in 2U 4-node servers without liquid cooling, and next-generation CPUs are going to jump TDPs massively. Liquid cooling will be required on dense systems starting later in 2022.

Gigabyte H262 ZL0 Demo With Rack Manifold And CoolIT Systems CHx80 3

Since I set up everything, I can tell you that the loop was not difficult to assemble. Even with my propensity to break things, the Gigabyte H262-ZL0 did not leak (the “L-word”), even when running on its side, which it is not designed to do.

Gigabyte H262 ZL0 Demo With Rack Manifold And CoolIT Systems CHx80 2

Regular STH readers will notice we did not go into the MegaRAC SP-X management, nor our STH Server Spider as we normally do. Frankly, we have covered the Gigabyte management solution many times, and to me, the impact of this server is really that it is liquid-cooled.

Gigabyte H262 ZL0 Node Next To PCL 2

If you are interested in liquid cooling, I highly suggest watching the video since that goes into some “show me” aspects that we cannot do as easily in a web format. Our audiences for YouTube and the STH main site are almost completely separate, by design. This is one case where it is worth taking a look even if you have read the review. In the next few weeks, we are going to have a second piece where we look at the CoolIT Liquid Lab in more detail and how these systems are tested to ensure they are reliable to deploy. That felt like a different topic, so we stayed a second day in the lab and did a piece on that.

Gigabyte H262 ZL0 Node Next To PCL 6

Overall, those looking at the Gigabyte H262-ZL0 are in search of density, since that is a primary reason to utilize the 2U 4-node form factor. Hopefully, this article explains why using liquid cooling is necessary for higher-TDP CPUs that are coming, and how it saves an enormous amount of energy in the process. Helped a bit no doubt by ~8C inlet water temps in the Calgary winter, we were able to cool 20,000 AMD EPYC 7003 cores in 40x Gigabyte H262-ZL0 servers using only a garden hose worth of facility water. That is absolutely awesome and is why liquid cooling is going to become common in high-density systems going forward.

12 COMMENTS

  1. Per the server cooling concepts shown in this article, my 2012-2021 gig as Sys Admin / Data Center Tech with the last few years immersion cooling infused, upon testing a system like the one shown here myself, I felt I was drifting towards being a Sys Admin / Data Center Tech / Server Plumber.

    …Seeing my circa 2009 air-cooled racks with less and less servers-per-rack over time, and STH’s constant theme of rapidly increasing server CPU TDPs, data center techs need to get used to a future that includes multiple forms of liquid cooling.

  2. This begs only one question: How much energy and water use for data centers (and coming soon/arrived to home users with the next gen nvidia gpus and current course of cpus) will break the camel’s back?
    While it’s great that everything has become more efficient, it has never led to a single decrease in consumption. This is a time during which energy should be high on everyone’s mind for a myriad of reasons, and the tech hardware giants just keep running things hotter and hotter. I’m sure it will all be fine.

  3. Amazing article. I’m loving the articles that I can just show my bosses and say ‘see…’

    Keep em coming STH

  4. Patrick, next time you get your hands on something like this, could you try and see if you could open one of the PCL’s coolers? As a PC watercooling enthusiast, I’m curious to see whether the same CPU cooler principles apply, particularly considering things like flow rate and the huge surface area of EPYC CPUs.
    Furthermore, it might be funny to compare my 4U half-ish depth water to air “CDU” with flow rate and temperature sensors with the unit you demonstrated in this review. Oh, my “CDU” also contains things like a CPU, GPU, ATX motherboard 😉

  5. How much power would the chillers consume though? This is definitely a more efficient way to transfer the heat energy somewhere else. I’m not so sure about the power savings though.

  6. Bill, it depends, but I just did a back-of-the-envelope calculation for the case where you did not even re-chill the water and just passed water through, adding heat. It was still less than half the cost of using fans. Even if you use fans, you then have to remove the heat from the data center, and in most places that means using a facility chiller anyway.

  7. Would have been interesting to know how warm the liquid is going in and out. Could it be used to heat an adjacent office or residential building before going to the chiller?

  8. With high-density dual-socket 2U4N modules, for example, there seems to be one socket which gets cooled first, with the exhaust of that cooling used to cool the second.

    If I understood the plumbing correctly the water which cools the first socket in the setup tested here is then used to cool the second socket. RAM is air cooled but again the air is passed over the first group of DIMMs before cooling the second.

    Under load I’d expect there to be a difference in the temperature of the first compared to the second. In my opinion measuring this difference (after correcting for any variations in the chips and sensors) would be a meaningful way to compare the efficiency of the present water cooling to other cooling designs in similar form factors.

  9. How about the memory cooling, storage, and network? You still need the fans for that in this system.
    With DDR5 memory coming up and higher power, there will be much smaller savings from a CPU-only liquid-cooled system.
    Any chance to publish the system temperatures too?
    Thanks!

  10. Mark, I mentioned this a bit, but there is another version (H262-ZL1) that also cools the memory and the NVIDIA-Mellanox adapter. We just kept it easy and did the -ZL0 version that is CPU only. I did not want to have to bring another $5K of gear through customs.
