AMD EPYC Bergamo Has Massive Consolidation Benefits

2024 AMD EPYC “Bergamo” Refresh Key Lessons Learned

Saying one can get 128 cores per socket may just seem like a number. Let us take a moment to put that into hard perspective. We used Supermicro’s most popular Cascade Lake Xeon (2019-2021 era) server, representing a common 3-5 year upgrade target. If you had bought fifteen of those servers when they were the newest available three years ago, you can now consolidate them down to three dual-socket EPYC servers and save up to 7-8kW in rack power consumption. That 7-8kW is enough to power a current-generation 8x GPU AI server. If you are doing AI projects in 2024, that is your template: replace 15 “old” servers with 3 new servers and create the space and power for a modern AI server. For those struggling to find power and space, this is an easy path.
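The arithmetic behind that template can be sketched in a few lines of Python. This is only a rough illustration: the 15-to-3 server counts come from the example above, while the per-server wattages (650W and 800W) and the `Fleet` and `consolidation` names are hypothetical placeholders, not measured figures. Swap in your own wall-power numbers.

```python
from dataclasses import dataclass


@dataclass
class Fleet:
    """A group of identical servers (hypothetical helper type)."""
    servers: int
    watts_per_server: float  # assumed average wall power per server

    @property
    def total_kw(self) -> float:
        # Total draw of the whole fleet in kilowatts.
        return self.servers * self.watts_per_server / 1000.0


def consolidation(old: Fleet, new: Fleet) -> tuple[float, float]:
    """Return (consolidation ratio, kW freed by the swap)."""
    ratio = old.servers / new.servers
    freed_kw = old.total_kw - new.total_kw
    return ratio, freed_kw


# Server counts from the article; wattages are placeholders, not measurements.
old_fleet = Fleet(servers=15, watts_per_server=650.0)
new_fleet = Fleet(servers=3, watts_per_server=800.0)

ratio, freed_kw = consolidation(old_fleet, new_fleet)
print(f"{ratio:.0f}:1 consolidation, about {freed_kw:.2f} kW freed")
```

With real measurements plugged in, the freed power figure is the number to compare against the power budget of the AI server you want to add.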

AMD EPYC 9754 Bergamo, AMD EPYC 9684X Genoa-X, and AMD EPYC 9654 Genoa

AMD also has other options for workloads that are not in the “cloud native” category: Genoa for general purpose and Genoa-X for higher performance. Bergamo benefits from being a cloud-native processor in a high-volume socket shared with those other options. One can use the same server and customize the role of the system.

2nd Gen Intel Xeon To AMD EPYC Bergamo Generational Comparison

The other side is the single-socket value proposition. Server buyers mostly purchase dual-socket servers, and there is certainly some benefit from using fewer fans, power supplies, sheet metal, and a few other components. On the other hand, managing single-socket servers is so much easier. We might swap our hosting clusters to single-socket servers in the next generation just because the per-socket performance and connectivity have gone up so much that it makes sense on an upgrade cycle. While these may not be the multi-GPU big servers of today, single-socket cloud native is a big deal. That is probably why many cloud providers have been building single-socket cloud-native server infrastructure for years.

Supermicro 2U 1P 24x DDR5 DIMM AMD EPYC 9754 Server

At some point, folks will need to upgrade older gear. If you have 1st Gen or 2nd Gen Xeons in your environment, or perhaps Xeon E5s, then hopefully these key lessons help guide that process. The Intel Xeon Gold 6252s were higher-performance 2nd Gen Xeon CPUs, but if you have a Xeon E5 V4 or older system, then there is a night-and-day difference.

Final Words

Overall, the AMD EPYC Bergamo is a crazy CPU. The chip pictured below has 128 cores and 256 threads. If you deploy VMs with an even number of vCPUs (very common) then this will continue to be the highest density CPU on the market even after the Sierra Forest-SP 144 core part is launched next quarter and assuming AmpereOne becomes available to purchase. Unlike Arm architectures, EPYC has enormous software support. Even if you were taking advantage of the 2nd Gen Xeon’s new VNNI feature for some AI inference capability, that is a feature that these chips still have.
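To make the density point concrete, here is a minimal sketch of the thread math. It assumes one vCPU per hardware thread with no oversubscription, and it assumes the 144-core part exposes one thread per core; the `vms_per_socket` helper and the constant names are made up for illustration. One plausible reading of the even-vCPU condition is that such VMs can occupy whole cores, so both SMT threads of a core stay inside the same VM.

```python
def vms_per_socket(threads: int, vcpus_per_vm: int) -> int:
    """VMs that fit when each vCPU gets its own hardware thread."""
    return threads // vcpus_per_vm


BERGAMO_THREADS = 256        # 128 cores with SMT, as stated in the article
SIERRA_FOREST_THREADS = 144  # assumption: 144 cores, one thread per core

# Map each VM size to (Bergamo VMs, 144-core part VMs) per socket.
density = {
    vcpus: (
        vms_per_socket(BERGAMO_THREADS, vcpus),
        vms_per_socket(SIERRA_FOREST_THREADS, vcpus),
    )
    for vcpus in (2, 4, 8)
}

for vcpus, (bergamo, sierra) in density.items():
    print(f"{vcpus} vCPU VMs per socket: Bergamo {bergamo}, 144-core part {sierra}")
```

Real deployments usually oversubscribe vCPUs and reserve threads for the hypervisor, so treat these counts as an upper-level comparison rather than a sizing guide.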

AMD EPYC 9754 Bergamo

What was a bit surprising is that, using our dual-socket 1U case and a roughly 5:1 consolidation ratio, there is an easy path to pulling old servers and adding AI servers into an existing data center footprint. At the same time, if you are becoming a VMware refugee due to Broadcom’s pricing changes and are transitioning to KVM or Xen-based virtualization without per-socket or per-core license restrictions, Bergamo is going to have you re-evaluating previous life choices with sub-1-year payback periods.

AMD EPYC 9754 Stress For CPU Frequency 22.5hr Stress

Of course, technology marches on, and we are going to be doing a lot in the cloud-native space in 2024. Stay tuned for that sector to start emerging as a major force in the next 2-3 years.

 

Supermicro And AMD Logos

10 COMMENTS

  1. The vast majority of companies who used Scalable Gen 1 or 2 moved on to Ice Lake and likely Sapphire Rapids by now – but the only way AMD can come out on top is to compare something they haven’t released vs something Intel released years ago.

    AMD cannot feed 64 cores much less 128.

    I used Gen 1 which led to Ice Lake which led to Sapphire Rapids – which is the path many organizations did.

    With AMD’s launch schedule (Launch, with actual hardware months later – like how Mi300 was never mentioned as an AI processor until it shipped 9 months later and is now magically AI focused) Bergamo is a year out at least.

    If AMD was so superior then they would not be sitting in single digits in the DC.

  2. @Truth+Teller: the fact that Intel’s DC business isn’t making money despite having 70% of the market is even more damning. It suggests that they are unable to demand a premium for their products. AMD’s DC operation has been profitable for the last couple of years, growing both revenue and margins. It does not need Sherlock Holmes to deduce what is going on.

  3. I’ll mention first that this is the best article I’ve seen on the cloud native subject. I wish you’d done a video of it too.

    My concern with Bergamo is that it isn’t much cheaper than Genoa or Genoa-X, but you know you’re getting less performance per core. You’re losing performance but you’re getting what 33% more cores in a socket? I don’t think that is enough of a gap. AMD needs to be offering twice the cores for cloud native to wow people into switching.

    I think your earlier coverage of Sierra is also right that customers won’t jump on the first gen of a new CPU line like this. You’ve also nailed the most important metric for those who will switch: how many NVIDIA H100 systems it allows you to add.

  4. I’m just commenting that I’m agreeing with Hans. I’d also say this is hands down the best Bergamo cloud native explanation I’ve ever seen and that 15 3 1 example is perfect for what we need. We’re probably going to order Supermicro H200 servers because STH seems to think they’ve got the best and they’re the only ones with a history of reviewing every gen.

  5. AMD could buy STH for $M’s just to use this article and the ROI’d be silly. You’ve just done something AMD’s marketing’s been trying to push on OEM partners and turned it from “who tf cares?” to something we’re putting on our staff meeting tomorrow.

  6. @Truth+Teller: in order of your assertions, you’re probably wrong, wrong as this article has a picture of all 128 cores 100% loaded, can’t say, somewhat right but that’s due to demand not the products coming out late and definitely wrong as their DC market share is definitely double digits.

  7. “Truth” Teller, another paid Intel shill boy... makes Baghdad Bob look like an amateur :-) and oh, Bergamo has been in ample supply for quarters now and many customers had it that long already or longer.

  8. Truth+Teller, your assessment is full of baloney. First, most companies outside of places like AWS didn’t go from Gen 1 > Gen 2 > Gen 3 Xeon as that was a huge cost with minimal upside. The only benefit was getting onto Gen 3 or later as you could add more RAM without needing the L series CPUs. However, if you already had those, well, it didn’t make much sense. Secondly, AMD Bergamo has been available for a long time now. I can go on Dell’s website right now and purchase a brand new system with Bergamo in it. Therefore it isn’t “a year out at least.” Also, they can feed that many cores. There is a reason they went to 12 channel DDR5 RAM. A single socket now has more bandwidth than a dual socket from the DDR4 era.

  9. Are companies even looking for consolidation? I’d think most datacenters have more problems with power and cooling than available rack space. Of course having CPUs that deliver what general purpose applications need instead of focusing on AI boondoggles to woo shareholders is always appreciated. Most of the newer instruction sets are barely used.
