Why DDR5 is Absolutely Necessary in Modern Servers


1.1V, 5V, 12V, and Why do Servers No Longer Support Unbuffered ECC Memory?

This is a change that many folks are still unaware of. In the DDR3 and DDR4 generations, one could generally use either unbuffered ECC (and often non-ECC) memory or RDIMMs in servers. That is no longer the case. For this, let us take a look at the AMD EPYC 9004 Genoa memory capabilities slide:

AMD EPYC 9004 Genoa DDR5 Memory Capabilities

Here we will notice that the only two types of memory listed as supported are RDIMM and 3DS RDIMM. There is no UDIMM support. As a quick note, there are x72 DIMMs being mentioned on this slide. Those are going to be for some very specific customers and are not general-purpose modules that most of our readers would put into a system, so we are not going to cover them in this article.

Intel, for its part, is saying the same thing as AMD on supporting only RDIMMs, not UDIMMs.

Micron DDR5 UDIMM Front

The key reason for this is that the power delivery components are now on the DIMMs themselves. Server modules are supplied with 12V, while client modules receive only 5V. That input is converted to 1.1V for the DRAM and managed by the onboard power management IC, or PMIC. The PMIC moves a motherboard function onto the DIMMs, which means there is an added component on the modules.
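A rough way to think about these input voltages is to compare the current each rail would have to carry for the same power, using I = P / V. The sketch below is only illustrative: the 10 W module power is a made-up figure rather than a measured one, the function name is ours, and PMIC conversion losses are ignored.

```python
def rail_current_amps(power_watts: float, rail_volts: float) -> float:
    """Current needed to deliver power_watts at rail_volts (I = P / V)."""
    return power_watts / rail_volts


# Hypothetical module power budget, chosen only for illustration.
DIMM_POWER_W = 10.0

rails = [
    ("12V server input", 12.0),
    ("5V client input", 5.0),
    ("1.1V DRAM rail", 1.1),
]

for label, volts in rails:
    amps = rail_current_amps(DIMM_POWER_W, volts)
    print(f"{label}: {amps:.2f} A")
```

For the same power, the higher the input voltage, the less current has to cross the DIMM connector before the PMIC steps it down on the module.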

Micron DDR5 RDIMM Front

Many of the changes to the power subsystem had to happen to handle the higher clock speeds that DDR5 aims to achieve. One can see that the physical keying of the modules is now different in order to prevent inserting the wrong modules into a system. In the DDR4 generation, one could use UDIMMs in servers that supported RDIMMs. That is no longer true with DDR5.

One DIMM Two Channels

Perhaps one of the biggest changes in the DDR5 generation is the channel architecture. DDR4 DIMMs have a 72-bit bus. That bus may be referred to as 64+8, which means there are 64 data bits plus 8 ECC bits.

Micron DDR4 RDIMM One 72 Bit Channel

With DDR5, there are now two channels with a total of 80 bits. Each channel is half of that or 40 bits with 32 bits for data and 8 bits for ECC. The channels are split on the DDR5 DIMM with one channel on the left and one on the right.
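To make the bit counts concrete, here is a minimal sketch that adds up the channel widths described above and works out how many ECC bits each layout carries per data bit. The `Channel` class and the variable names are ours, purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Channel:
    """One memory channel on a DIMM, split into data and ECC bits."""
    data_bits: int
    ecc_bits: int

    @property
    def width(self) -> int:
        # Total bus width of this channel.
        return self.data_bits + self.ecc_bits

    @property
    def ecc_overhead(self) -> float:
        # ECC bits as a fraction of data bits.
        return self.ecc_bits / self.data_bits


def total_width(channels) -> int:
    """Sum of the bus widths of every channel on the module."""
    return sum(ch.width for ch in channels)


DDR4_ECC_DIMM = [Channel(data_bits=64, ecc_bits=8)]       # one 64+8 channel
DDR5_ECC_DIMM = [Channel(data_bits=32, ecc_bits=8)] * 2   # two 32+8 channels

for name, dimm in [("DDR4", DDR4_ECC_DIMM), ("DDR5", DDR5_ECC_DIMM)]:
    overhead = dimm[0].ecc_overhead
    print(f"{name}: {len(dimm)} channel(s), {total_width(dimm)} bits total, "
          f"{overhead:.1%} ECC overhead per channel")
```

The total data width stays at 64 bits per DIMM in both generations. What grows is the ECC share, since each narrower DDR5 channel still carries its own 8 ECC bits.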

Micron DDR5 RDIMM Two 40 Bit Channels

To take better advantage of this, the new DDR5 DIMMs have a longer burst length of 16. This allows each channel to access 64 bytes of information for CPU cache lines, and do so independently. In turn, that adds parallelism to the memory subsystem operating over two channels per DIMM instead of one.
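The 64-byte figure follows directly from the data width and the burst length. As a quick check, the sketch below multiplies the two. The DDR4 burst length of 8 is not stated above and is included here for comparison, and the function name is our own.

```python
def bytes_per_burst(data_bits: int, burst_length: int) -> int:
    """Data bytes moved by one burst on one channel (ECC bits excluded)."""
    return data_bits * burst_length // 8


# DDR4: one 64-bit data channel per DIMM, burst length 8 (for comparison).
ddr4 = bytes_per_burst(data_bits=64, burst_length=8)

# DDR5: each of the two channels has 32 data bits and a burst length of 16.
ddr5 = bytes_per_burst(data_bits=32, burst_length=16)

print(f"DDR4 channel: {ddr4} bytes per burst")
print(f"DDR5 channel: {ddr5} bytes per burst")
```

Both work out to one 64-byte cache line per burst, but a DDR5 DIMM has two channels that can each do this independently.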

Next, we are going to take a look at the chips on the DDR5 RDIMMs, and what is different in this generation.

What are the Big Chips on DDR5 DIMMs?

Perhaps the most important feature is the DRAM itself. On the ECC DIMMs we are showing, both RDIMMs and ECC UDIMMs, there are five chips on either side:

Micron DDR5 RDIMM DRAM Packages

On similar consumer non-ECC UDIMMs, one may see only four instead of five here. These DRAM packages provide the memory capacity for the modules, but they are not the only components.

In the middle of the new DDR5 RDIMMs, there are larger chips on either side. On one side, you will find the RCD, or register clock driver. This is responsible for providing the clock distribution to the different chips on the memory module.

Micron DDR5 RDIMM RCD Rambus

This is not present on the UDIMM version of the DDR5 DIMM:

Micron DDR5 UDIMM Front

The common ECC and non-ECC SODIMM form factors also do not have RCDs.

Micron DDR5 UDIMM Rear

On the other side, you are likely to see the PMIC, or power management IC. This is responsible for managing power on the module.

Micron DDR5 RDIMM PMIC

There will also be an SPD hub on a DDR5 RDIMM to support out-of-band communication between components.

Micron DDR5 RDIMM SPD Hub

This is used, along with two temperature sensors on either side of the DIMM, to provide more sensor information.

Micron DDR5 RDIMM Temp Sensors

We have seen baseboard management controllers with dramatically better performance, such as in the move from the ASPEED AST2500 to the AST2600 generation. Part of the reason for this is that with more sensors, the BMC can be made more aware of what is going on in a server, and therefore make more informed adjustments to fan speeds.
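As a toy illustration of how extra DIMM temperature readings could feed into fan control, the sketch below maps the hottest reported sensor to a fan duty cycle with a simple linear ramp. This is not how any particular BMC firmware works. The function name, the temperature thresholds, and the minimum duty cycle are all hypothetical values picked for the example.

```python
def fan_duty_percent(dimm_temps_c, idle_c=40.0, max_c=85.0, min_duty=20.0):
    """Map the hottest DIMM sensor reading to a fan duty cycle in percent.

    All thresholds are hypothetical and chosen only for illustration.
    """
    hottest = max(dimm_temps_c)
    if hottest <= idle_c:
        return min_duty      # cool enough: hold the minimum fan speed
    if hottest >= max_c:
        return 100.0         # at or above the limit: run fans flat out
    # Linear ramp between the idle and maximum temperature thresholds.
    fraction = (hottest - idle_c) / (max_c - idle_c)
    return min_duty + fraction * (100.0 - min_duty)


# Made-up readings: two sensors on each of two DIMMs.
readings_c = [41.5, 43.0, 62.5, 58.0]
print(f"Fan duty: {fan_duty_percent(readings_c):.0f}%")
```

With more sensors per module, the hottest reading is more likely to reflect an actual hot spot instead of an average across the board.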

Our readers may hear that DDR5 RDIMMs now have more components. Hopefully, this helps to explain what those components are.

Next, let us take a look at DDR5 and the new On-Chip ECC feature and then get into the performance.

19 COMMENTS

  1. If you’re reading this, page 2 is where it’s at. I learned more in 5 minutes reading through that than everything I’ve seen on Reg DIMMs before.

  2. This is sooooooooooooo gooood. I’d +1 Uzman77’s rep on the 2nd page. That’s the best explanation I’ve ever seen. I’m usually only on STH to troll comments, but that was useful

  3. I find it disappointing that a new higher-bandwidth higher-capacity higher-latency memory standard has been developed and people are still considering non-ECC DIMMs.

    At the frequencies and densities where DDR5 makes sense, I think allowing the CPU to verify RAM integrity using ECC is important. Even if rowhammer and other ways of inducing memory errors through code didn’t exist, the tradeoff between reliability, the associated costs of memory corruption and adding two more chips per DIMM favours ECC, at least in my opinion.

    It would be nice to explore the ECC DDR5 options available for desktop computers.

  4. TL;DR
    1. We need DDR5 over DDR4 because capacity, more memory channels on server CPUs for DDR5 and higher bandwidth
    2. U can’t use UDIMM anymore BC… you can’t use UDIMM anymore.
    3. 2-channel DDR5 hack is in no way fundamental. It’s just a hack. But for some reason you have to have it.
    Yes DDR5 has bigger burst step, but that was the case at any generation change…
    4. On chip ECC – another bullshit fudge, heavily used by marketing. In reality, DDR5 RAM cell shrinkage has dropped its reliability too low, so that had to be countered on-chip. So, it’s not an improvement but a patch for a cell that can’t be shrunk without a compromise.

  5. Nice little AMD fluff piece.

    Is the latest AMD even shipping? I have 13 dual socket Supermicro 1U servers – each with SPR and 2TB DDR5 ECC – not to mention 16 Supermicro dual socket workstations – each with a single SPR and 1TB DDR5 ECC – they will go great with the 16 GPU DGX H100…

    “4. On chip ECC – another bullshit fudge, heavily used by marketing. In reality, DDR5 RAM cell shrinkage has dropped its reliability too low, so that had to be countered on-chip. So, it’s not an improvement but a patch for a cell that can’t be shrunk without a compromise.”

    DDR5 has on chip ECC. It is from Engineering, not Marketing. What is your basis for claiming that reliability is too low? The Gnome living in your nightstand does not count as a source.

    So many Desktop kiddies thinking they know something about servers.

  6. How about Dr. Ian Cutress? https://www.youtube.com/watch?v=XGwcPzBJCh0&t=3m33s

    Minie was talking about the cell reliability without this correction. The point is that on-die ECC exists because with increasing density the factors causing bit-flips have increased to the point that it’s impossible to get a low enough defect rate without some form of built-in correction.

  7. Ummm….way to ignore CAS latency pretty much entirely. It’s why, to this day, top end DDR4 kits will outperform even up to mid-range DDR5 kits. Hell, they’ll even outperform some of the lower top-end DDR5 kits. CAS latency is supreme. In desktop environments and ESPECIALLY server environments.

    Also, this supposed “dual channel” thing they talk about, which is a grossly incorrect term, is reminiscent of the AMD Bulldozer days where they’d claim they were splitting up the load between “cores”, but it was still only one pipe going out of the processor, thus negating any real world benefit.

    The main reason that DDR5 systems perform better is because the platforms and CPU architectures are better. It’s got very little to do with the RAM itself.

  8. Another great article by STH. Bravo lads.

    Joe I’m seeing like $168 per so $4k.

    Dissident Aggressor. I don’t see how it’s an AMD fluff piece. It is nice that you’ve only got a few servers. The DRAM vendors Samsung, Hynix, and Micron all talk about the bit flips in technical conferences. I can tell you from experience that even DDR4 had massive issues. We’ve got hundreds of thousands of DDR4 modules installed in just the data center I’m responsible for. The newer 1x modules saw an increase in errors. Samsung’s are much worse than they used to be. That’s why they’re doing on chip ECC with DDR5. 29 systems is less than a quarter of a rack for us and we’ve got many thousands of racks. I’m not sure who you’re talking about with the “desktop kiddies” but you sound like one based on your comments. I’ve worked at three different hyperscalers and one large social network over the last 10 years and my colleagues all are on STH because there’s good info here. Anandtech used to be good 7+ years ago.

    ChipBoundary with CAS 40 is like 92-93ns on DDR5-4800 and it’s like 90ns on DDR4-3200 IIRC. So it’s like 50% more BW, the dual channel helps a small amount (we’ve measured it so that’ll make it into a paper). Between those two it’s better than DDR4 and it’s much better.

  9. Even as far back as Sandy Bridge memory bandwidth was starting to impact some applications. It became particularly noticeable with Haswell’s AVX2 and FMA3: 4 cores were starved by 2 channels of DDR3 1600 memory and it began to make sense to underclock the CPUs.

    It would be nice to have a consumer system with 16 cores and 8 sticks and not 2 to restore some balance. AVX512 needs it.

  10. @Minie Marimba, totally agree on your point 2. Why isn’t the voltage the same? AFAIC, stepping down from a higher voltage increases efficiency. Why are desktop DIMMs designed to be less efficient? Maybe they save 10-20 cents on the PMIC but they butcher any chance for compatibility.

    This is especially strange because the industry has been moving towards 12V for everything. So the MB will probably need to have a 12V->5V VR to feed the DIMMs, whereas it could just pass through the 12V it receives from the PSU. I must be wrong somewhere because this makes no sense whatsoever. Unless one is inclined to entertain the possibility that this is done specifically for market segmentation..

    On point 3 – the dual channel nature is actually counterproductive. It increases the cost of ECC because you are moving from 12.5% redundancy to 25% redundancy (e.g. from 8+1 to 8+2 chips). WTF? DDR5 (as DDR4 by the way) supports in-band ECC. Sure, without a place to store the actual ECC data this can only protect the data in transit over the DDR bus but not at rest in the capacitor array. However, coupled with the internal ECC, which is necessary anyway at current semiconductor densities, this could have offered full ECC almost for free. Yes, almost, because (1) it introduces an additional cycle in the burst sequence for the transfer of ECC data and (2) because of the internal ECC overhead. However, the overhead from (1) is not more than 10% (pulled out of my ass – it’s slightly higher than 10% but you can never utilize the bus completely, so it will not make as much difference; I would argue it’s more like 7-8% in real life situations). And (2) is already used anyway. And when you have the ECC over an entire internal row, instead of 32-bit only, you can have much lower overhead and/or better protection, e.g. correct several erroneous bits.

    Again, I’m probably wrong somewhere because the state of affairs does not jibe with what the technologies can provide. One reason might be that memory chips are designed to be as simple as possible because you need many of them and any overhead can hit hard. But, I would argue, not as hard as an additional chip for every rank in a module.

    Just my 22 cents

  11. @Nikolay Mihaylov
    “Unless one is inclined to entertain the possibility that this is done specifically for market segmentation..”

    Looking at it in any other way indicates a lack of comprehension of how the modern economy works. Interoperability between server and desktop platforms is anathema to the manufacturers/chaebols/cartels whoever you want to blame, and it is specifically allowed to be artificially implemented [through software/firmware/unnecessary physical incompatibilities] to drive margins on “Server” gear and to keep a layer of obfuscation between consumer and pro product lines, even if the difference between a Gaming CPU and a Server CPU doesn’t necessarily warrant it.

    It’s only going to get worse as Intel’s scheme to pay to enable hardware features that ship complete on the chip isn’t really getting any serious pushback.

  12. A few corrections on the CXL section:
    1.
    “Latency is roughly the same as accessing memory across a coherent processor-to-processor link”
    Should be
    “Latency is roughly the same as accessing memory across a coherent socket-to-socket link”

    2.
    The aim of the CXL consortium is to make the latency of direct attached CXL in the same ballpark as that of a socket NUMA hop, however we are not yet there.

    3.
    CXL/PCIe bandwidth is bidirectional, therefore the equivalent raw bandwidth of the suggested card is that of *four* DDR5 channels.

  13. Too much e-peen competition in these comments.

    Some of you sound like amazing folks to have to put up with, I feel bad for your co-workers.

  14. You didn’t explain why server platforms no longer support consumer-level DIMMs. This is very upsetting, it will make those platforms prohibitively expensive to home enthusiasts even down the line when they flood the second-hand market.

  15. They did explain it. Different operating voltage with the power management onboard. They’re now physically keyed differently.
