AMD EPYC Genoa Gaps Intel Xeon in Stunning Fashion

Market Impact in 2023: AMD EPYC v. Intel Xeon

2023 will be fascinating. Let me be clear, AMD is going to do exceedingly well in the next generation. For the AMD EPYC 9654, 9654P, and EPYC 9634, we do not expect Intel to field direct rivals. Likewise, the AMD EPYC 9554, and maybe the EPYC 9534, are going to be ahead of the top-bin Intel Xeon Sapphire Rapids parts.

AMD EPYC 9654 Genoa In SP5 Socket 1

AMD Bergamo will increase core counts. Intel’s answer for scaling out to many cores, meeting the challenge presented by AMD Bergamo and various Arm competitors, will be Sierra Forest, slated for 2024. This is important when we discuss AMD EPYC Genoa. Genoa may have more cores than Intel, but it is not aiming to be the highest core count server chip at the cost of its larger caches or its floating point/AI performance. Genoa is a classic Intel Xeon competitor, whereas AMD Bergamo will be the scale-out core count effort.

Intel Investor Meeting 2022 DCAI Roadmap

We expect Genoa-X to blast well past 1GB/socket of L3 cache. This will use a 3D V-Cache technology similar to what we saw with AMD Milan-X. With four more CCDs, AMD will have 50% more CCDs to stack additional cache onto, plus there is additional L3 cache per socket just from having more cores and CCDs. There are segments seeing huge benefits from larger caches, and those segments also tend to value higher core counts.
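
As a rough sketch of that arithmetic, here is a minimal estimate in Python. The 64MB of stacked SRAM per CCD is our assumption, carried over from Milan-X; AMD has not confirmed Genoa-X specifications.

    # Back-of-the-envelope L3 capacity estimate. The stacked SRAM figure is an
    # assumption carried over from Milan-X, not a confirmed Genoa-X spec.
    BASE_L3_PER_CCD_MB = 32       # standard Zen 4 CCD L3
    STACKED_SRAM_PER_CCD_MB = 64  # assumed, same as Milan-X V-Cache per CCD

    def total_l3_mb(ccd_count: int) -> int:
        return ccd_count * (BASE_L3_PER_CCD_MB + STACKED_SRAM_PER_CCD_MB)

    print(total_l3_mb(8))   # Milan-X:  8 CCDs ->  768 MB per socket
    print(total_l3_mb(12))  # Genoa-X: 12 CCDs -> 1152 MB, well past 1GB/socket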

Intel’s Sapphire Rapids HBM will have more memory capacity, but at a higher HBM latency versus L3 cache. That is going to be fascinating, as whether Intel or AMD comes out on top is likely going to be working-set dependent.
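
To make that concrete, here is a minimal sketch of the working-set argument. The capacities are our assumptions for illustration: roughly 1.1GB of L3 per Genoa-X socket (per the estimate above) and roughly 64GB of on-package HBM per Sapphire Rapids HBM socket.

    # Minimal sketch of the working-set argument. Capacities are assumptions
    # for illustration, not confirmed specifications.
    GENOA_X_L3_GB = 1.125   # assumed Genoa-X L3 per socket (estimate above)
    SPR_HBM_GB = 64.0       # assumed on-package HBM per Sapphire Rapids HBM socket

    def fast_tier(working_set_gb: float) -> str:
        if working_set_gb <= GENOA_X_L3_GB:
            return "fits in Genoa-X L3, so cache latency favors AMD"
        if working_set_gb <= SPR_HBM_GB:
            return "spills past L3 but fits in HBM, so HBM bandwidth favors Intel"
        return "exceeds both, so the DDR5 subsystem (12 channels vs. 8) decides"

    for ws_gb in (0.5, 8.0, 128.0):
        print(f"{ws_gb:>6.1f} GB working set: {fast_tier(ws_gb)}")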

Sapphire Rapids HBM And Patrick 2

If you have been reading STH recently or watching the STH YouTube, you will probably have seen a lot on the upcoming accelerators in the Intel Xeon Sapphire Rapids generation. We were able to see that things like the Intel QuickAssist accelerator(s) take up very little die space yet yield very solid performance gains.

Intel Pre Production Sapphire Rapids Preview QAT Nginx HTTPS Performance Per Thread Preview

We did not get to re-run these on the Genoa platform simply due to time, but AMD will make some significant inroads here.

The key for Intel is that it needs to drive rapid adoption. I have been using QAT since 2016 (first hardware in 2013), and yet it is still nowhere near mainstream. Putting features like QAT in mainstream Xeon is key to adoption.

AMD EPYC 9004 Genoa With Milan Rome Intel Xeon Ice Lake Sapphire Rapids 13th Gen Core Ampere Altra Max 3

This is where Intel needs a strategic course correction, and a hard one to make at that. The current plan is that Intel will offer accelerators on many of its chips. On lower-end SKUs, it is working with major OEM partners to enable an on-demand model so these accelerators can be used. Intel’s challenge is simple to appreciate: it has to work with OEMs to allow the enablement of accelerators. On the flip side, we do not, at this point, expect Intel to have a substantial performance per core advantage over AMD, if any at all, save for when it is using its accelerators. Intel is stuck between the payoff for OEMs with the on-demand acceleration model and AMD simply having a bigger CPU. There is still time, but if Intel does not make this correction, then its last resort is to compete on price.

One other important bit is that we do not expect all Sapphire Rapids parts to have the full set of accelerators we tested on the high-end Sapphire Rapids SKUs. Intel’s top-end SKU stack and its middle-to-lower-end SKUs will compare very differently against what AMD is offering.

That brings us to a perhaps more impactful point. 2022 has seen a massive cliff in client PC demand. AMD builds its Zen 4 CCD once and then leverages (more or less) the same die across EPYC and Ryzen, so it has a conceptually clearer path to shifting resources to support Genoa growth than Intel does with its similar-cores-but-different-dies approach to client and server.

AMD EPYC 9654 Genoa In SP5 Socket 2

Perhaps the best case is that the AMD EPYC Genoa will cause a shake-up on the Intel Xeon side. It is tough for an organization designed to service a 97%+ market share to have to go into underdog mode.

Final Words

Perhaps the question many are asking at this point is simply, “is AMD EPYC Genoa any good?” The answer is clearly yes. AMD has pushed headfirst into the new era of servers with a very straightforward approach. Intel, for its part, has looked at what AMD is doing and decided to go down a very different path. Assuming companies continue to buy servers, AMD will aggressively gain share in this generation. We have not spoken to anyone in the industry, Intel included, who expects Intel to win back lost market share from AMD in the Genoa v. Sapphire Rapids generation.

AMD EPYC 9554 EPYC 9654 And EPYC 7374F Genoa 1

This launch puts STH in a bit of an awkward position. We have two very similar QCT 2U servers, one with AMD EPYC Genoa and one with Intel Sapphire Rapids. We cannot publish the name or clock speed of the 60-core Sapphire Rapids SKUs we have, but we will say that we expect these to be higher-end parts. We do not expect Intel to be able to win 60-core versus 96-core comparisons without its accelerators. Still, there are many market segments, and we expect Intel to be much more competitive outside of the race to the maximum number of cores per socket.

AMD EPYC 9004 Genoa 2P QCT 2

Still, at the high-end, one thing is for certain. AMD EPYC Genoa will put a gap between itself and the Intel Xeon Sapphire Rapids launching in two months.

21 COMMENTS

  1. $131 for the cheapest DDR5 DIMM (16GB) from Supermicro’s online store

    That’s $3,144 just for memory in a basic two-socket server with all DIMMs populated.

    Combined with the huge jump in pricing, I get the feeling that this generation is going to eat us alive if we’re not getting those sweet hyperscaler discounts.

  2. I like that the inter-CPU PCIe5 links can be user-configured and retargeted at peripherals instead. Takes flexibility to a new level.

  3. Hmm… Looks like Intel’s about to get forked again by the AMD monster. AMD’s been killing it ever since Zen 1. So cool to see the fierce competitive dynamic between these two companies. So Intel, YOU have a choice to make. Better choose wisely. I’m betting they already have their decisions made. 🙂

  4. Do we know whether Sienna will effectively eliminate the niche for threadripper parts; or are they sufficiently distinct in some ways as to remain as separate lines?

    In a similar vein, has there been any talk (whether from AMD or system vendors) about doing ryzen designs with ECC that’s actually a feature rather than just not-explicitly-disabled to answer some of the smaller xeons and server-flavored atom derivatives?

    This generation of epyc looks properly mean; but not exactly ready to chase xeon-d or the atom-derivatives down to their respective size and price.

  5. I look at the 360W TDP and think “TDPs are up so much.” Then I realize that divided over 96 cores that’s only 3.75W per core. And then my mind is blown when I think that servers of the mid 2000s had single core processors that used 130-150W for that single core.

  6. Why is the “Sienna” product stack even designed for 2P configurations?

    It seems like the lower-end market would be better served by “Sienna” being 1P only, and anything that would have been served by a 2P “Sienna” system instead use a 1P “Genoa” system.

  7. Dunno, AMD has the tech, why not support single and dual sockets? With single and dual socket Sienna you should be able to win on price *AND* price/perf compared to the Intel 8 channel memory boards for uses that aren’t memory bandwidth intensive. For those looking for max performance and bandwidth/core, AMD will beat Intel with the 12 channel (actually 24 channel x 32 bit) Epyc. So basically Intel will be sandwiched by the cheaper 6 channel from below and the more expensive 12 channel from above.

  8. With PCIe 5 support apparently being so expensive on the board level, wouldn’t it be possible to only support PCIe 4 (or even 3) on some boards to save costs?

  9. All the other benchmarks are amazing, but I see a molecular dynamics test on another website and Houston, we have a problem! Why?

  10. Looks great for anyone that can use all that capacity, but for those of us with more modest infrastructure needs there seems to be a bit of a gap developing where you are paying a large proportion of the cost of a server platform to support all those PCIE 5 lanes and DDR5 chips that you simply don’t need.

    Flip side to this is that Ryzen platforms don’t give enough PCIE capacity (and there are questions about the ECC support), and Intel W680 platforms seem almost impossible to actually get hold of.

    Hopefully Milan systems will be around for a good while yet.

  11. You are jumping around WAY too much.

    How about stating how many levels there are in CPUs? But keep it at 5 or fewer “levels” of CPU and then compare them side by side without jumping around all over the place. It’s like you’ve had five cups of coffee too many.

    You obviously know what you are talking about. But I want to focus on specific types of chips because I’m not interested in all of them. So if you broke it down into levels and I could skip to the level I’m interested in, with how AMD compares vs Intel, then things would be a lot more interesting.

    You could have sections where you say that they are the same no matter what or how they are different. But be consistent from section to section where you start off with the lowest level of CPUs and go up from there to the top.

  12. There may have been a hint on pages 3-4, but I’m missing what those 2000 extra pins do: 50% more memory channels, CXL, PCIe lanes (already 160 on the previous generation), and …

  13. On your EPYC 9004 series SKU comparison, the 24-core 9224 is listed with 64MB of L3.
    As a chiplet has a maximum of 8 cores, one needs a minimum of 3 chiplets to get 24 cores.
    So unless AMD disables part of the L3 cache on those chiplets, a minimum of 96MB of L3 should be shown.

    I will venture the 9224 is a 4-chiplet SKU with 6 cores per chiplet, which should give a total of 128MB of L3.

  14. Patrick, I know, but it must be a clerical error, or they have decided to reduce each of the 4 chiplets’ L3 to 16MB, which I very much doubt.
    3 chiplets are not an option either, as 64 is not divisible by 3 😉

    Maybe you can ask AMD what the real spec is, because 64MB seems weird?

  15. @EricT I got to use one of these machines (9224) and it is indeed 4 chiplets, with 64MB L3 cache total. Evidently a result of parts binning and with a small bonus of some power saving.
