Glorious Complexity of Intel Optane DIMMs and Micron Exiting 3D XPoint

Micron Discontinues Its 3D XPoint Operations

Less than 24 hours after the earlier part of this article was set to go live, showing the glorious complexity yet also the promise of Optane, Micron ended its program. I wanted to address why, along with some forward-looking comments on Micron's shift to CXL.

The Micron Statement on Discontinuing 3D XPoint Operations

Here is the prepared statement from Micron’s executive team on discontinuing its 3D XPoint offering. If you have already read it, feel free to skip the italicized quoted text and move to the analysis since this is a long statement.

The value proposition of 3D XPoint was to operate as persistent memory at a lower cost than DRAM or as storage that is significantly faster than NAND.

In the years since 3D XPoint was first announced, data center workloads and customer requirements have continued to evolve. As data-intensive workloads proliferate and AI ramps in data-centric applications, the CPU-DRAM bandwidth has become an increasingly limiting factor of overall system performance. In addition, as CPU architectures evolve to dramatically increase CPU core count, more DRAM is needed to ensure adequate memory bandwidth per CPU core. This trend has driven ever-increasing server DRAM content.

The industry now stands on the threshold of a significant change in data center architecture — driven by the adoption of a new, high-performance interface called Compute Express Link or CXL — that will connect compute, memory and storage subsystems in the years ahead. This upcoming change creates a significant opportunity for Micron to take advantage of industry-leading innovation in technology and products to benefit our customers. We expect these new memory solutions to utilize the industry-standard CXL interface and enable our customers to achieve new levels of performance and improved total cost of ownership, or TCO, for data-hungry workloads.

On the storage front, the significantly lower cost of NAND will remain a barrier for wide adoption of 3D XPoint. Therefore, 3D XPoint-based SSD products are not expected to be anything more than a niche market over time. Memory was always the strategic long term market opportunity for 3D XPoint.

One important challenge that 3D XPoint memory products face in the market is that the latency of access requires significant changes to data center applications to leverage the full benefits of 3D XPoint. These changes are complex and extremely time-consuming, requiring years of sustained industrywide effort to drive broad adoption. In addition, there are important cost-performance trade-offs that need to be characterized and optimized for each workload.

As we develop new products using CXL, our focus is on addressing data-intensive workload requirements while reducing barriers to adoption, such as software infrastructure changes. Importantly, our development model for these newer products will be significantly more cost-effective, and we expect a higher ROI for our investments in these new technologies going forward. (Source: Micron’s prepared statements.)

Micron 3D XPoint Wafer

Micron made a rational business decision, but we have gotten a lot of questions on it. Why would Micron stop 3D XPoint? The answer is very simple: the operation was losing around $400M per year, and Micron has shareholders. I do not know Sanjay Mehrotra beyond shaking hands once, but his daughter was a colleague at PwC when I did management consulting. She is bright, and my assumption is that her father is similarly astute, having run a large semiconductor organization for years. At some point, if there is no clear path to profitability, it is a CEO's responsibility to shareholders to cut losses, even if the technology is "cool."

As for the discussion around storage, performance, and latency, we hopefully addressed how the technology works in the earlier part of this article. Instead of 3D XPoint, Micron is now focusing on Compute Express Link, or CXL. CXL is not a replacement for 3D XPoint; however, it may be the future of this class of memory.

Why Micron is Focusing on CXL

CXL memory is important. In the next generation of servers we will start to see PCIe Gen5, but also CXL running atop PCIe Gen5. Effectively, the architecture we get is that one can put a pool of memory in a system over CXL, and any CXL device in that system can use it as though it were its own onboard memory. The impact is that instead of provisioning massive amounts of memory for both the CPU and the accelerators, then duplicating data and moving it between the two memory sets, the data can simply reside in the shared CXL memory pool. Less total memory means less cost. A bigger pool means new capabilities. The extension of where CXL goes is that the CPU starts to lose its place as the dominant component of a server.
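
To make the savings concrete, here is a back-of-envelope sketch in C. Every figure in it is a hypothetical assumption for illustration, not a measurement:

```c
/* Back-of-envelope sketch of why a shared CXL pool saves memory.
 * All figures are hypothetical, for illustration only. */
#include <stdio.h>

int main(void) {
    int devices        = 4;    /* e.g. 1 CPU + 3 accelerators    */
    int working_set_gb = 256;  /* dataset every device must see  */
    int private_gb     = 64;   /* local cache/scratch per device */

    /* Today: every device carries its own full copy of the data. */
    int duplicated = devices * (working_set_gb + private_gb);

    /* With a CXL pool: one shared copy plus small local memory. */
    int pooled = working_set_gb + devices * private_gb;

    printf("duplicated model: %d GB\n", duplicated); /* 1280 GB */
    printf("pooled model:     %d GB\n", pooled);     /*  512 GB */
    return 0;
}
```

The point is not the exact numbers but the shape of the math: duplicated copies scale with the number of devices, while the pooled copy is paid for once.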

Stephen Van Doren CXL Interconnect Heterogeneous Computing Enablement

We have already covered a flavor of this when we discussed IBM's Power OMI, a serial point-to-point connection that is, at a high level, similar to CXL. One simply needs a memory controller, and then one can have any medium behind it. One could have a CXL-to-DDR5 controller, but also a CXL-to-Optane or a CXL-to-GDDR6 controller. Local caches and memory will still be important, but media independence is the game-changing feature.
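
As a rough illustration of that media independence, here is a small C sketch where the host only sees "a memory controller" and the media behind it is just a field. The type, names, and latency figures are illustrative assumptions, not any real CXL API:

```c
/* Sketch: behind a serial link such as CXL or OMI, the host talks
 * to a memory controller; the media is a detail of that endpoint.
 * All names and numbers here are illustrative assumptions. */
#include <stddef.h>
#include <stdio.h>

struct media_controller {
    const char *media;   /* DDR5, 3D XPoint, GDDR6, ...        */
    int persistent;      /* does data survive power loss?      */
    unsigned load_ns;    /* assumed, illustrative load latency */
};

/* Host-side code is identical regardless of the media behind it. */
static void describe(const struct media_controller *mc) {
    printf("%-10s persistent=%d ~%u ns\n",
           mc->media, mc->persistent, mc->load_ns);
}

int main(void) {
    struct media_controller endpoints[] = {
        { "DDR5",      0, 100 },
        { "3D XPoint", 1, 350 },
        { "GDDR6",     0, 200 },
    };
    for (size_t i = 0; i < sizeof endpoints / sizeof endpoints[0]; i++)
        describe(&endpoints[i]);
    return 0;
}
```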

CXL 2.0 Persistent Memory

Going back to our NAND SSD example and the need for power-loss protection, imagine what happens when the CPU or GPU writes data to fast persistent CXL memory, the DPU/SmartNIC then pulls that data directly from CXL memory and pushes it over the network to SSDs or large storage arrays, and the CXL memory buffer is finally released once the write is safely acknowledged. We no longer need in-SSD memory for caching, nor power-loss protection. Our DPU may not need as much memory either.
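
Here is a minimal C sketch of that write path under the stated assumptions; every function name is a hypothetical stand-in, not a real DPU or CXL API:

```c
/* Sketch of the write path: the persistent CXL pool stands in for
 * the SSD's internal DRAM cache and its power-loss capacitors.
 * Function names are hypothetical stand-ins. */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* A buffer carved out of the persistent CXL memory pool. */
struct cxl_buf {
    char data[4096];
    bool committed;   /* acknowledged by backing storage yet? */
};

/* 1. CPU/GPU writes into persistent CXL memory. The data is
 *    already safe against power loss at this point. */
static void cpu_write(struct cxl_buf *b, const char *payload) {
    strncpy(b->data, payload, sizeof b->data - 1);
    b->committed = false;
}

/* 2. The DPU pulls directly from the pool and pushes the data to
 *    networked storage; the transfer itself is stubbed out here. */
static void dpu_flush(struct cxl_buf *b) {
    printf("DPU -> storage: %s\n", b->data);
    b->committed = true;   /* storage acknowledged the write */
}

/* 3. Only after the acknowledgment is the pool buffer released. */
static void release(struct cxl_buf *b) {
    if (b->committed)
        memset(b, 0, sizeof *b);
}

int main(void) {
    static struct cxl_buf buf;  /* zero-initialized */
    cpu_write(&buf, "log record");
    dpu_flush(&buf);
    release(&buf);
    return 0;
}
```

The structural win is step 1: the data is persistent the moment it lands in the pool, so nothing downstream needs capacitors or batteries to make the same guarantee.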

VMware VMworld 2020 Project Monterey DPU With NVIDIA BlueField 2

Not only is this super-cool, but it is also the way the market is moving. When you read at STH that this is the last server generation before CXL disrupts the space, this is why. It will take time, but the industry is moving this way, led by the hyperscalers.

Since 3D XPoint is not as fast as DRAM, Micron is pointing to the existential crisis at Intel Optane's door: should Optane occupy fast DIMM slots on a CPU, or is PCIe Gen5/CXL fast enough to service persistent memory? If PCIe Gen5 and CXL are the better fit, then Intel can adapt the technology by adding a CXL-to-Optane controller. Indeed, that is the model Micron is focusing on, while leaving the door open to other persistent memory technologies.
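
Some rough math shows why the placement question is worth asking. The latencies below are assumed ballpark figures, not measurements:

```c
/* Rough, illustrative latency math for where Optane could live.
 * All numbers are assumed ballpark figures, not measurements. */
#include <stdio.h>

int main(void) {
    int dram_ns        = 100;  /* local DDR load, assumed          */
    int optane_dimm_ns = 350;  /* Optane PMem in a DIMM slot       */
    int cxl_hop_ns     = 150;  /* added round trip over a CXL link */

    printf("Optane in a DIMM slot: ~%d ns\n", optane_dimm_ns);
    printf("Optane behind CXL:     ~%d ns\n", optane_dimm_ns + cxl_hop_ns);
    printf("DRAM behind CXL:       ~%d ns\n", dram_ns + cxl_hop_ns);

    /* Under these assumptions the CXL hop adds ~150% to a DRAM
     * access but only ~43% to an Optane access: a serial hop is a
     * much smaller relative penalty for the slower medium. */
    return 0;
}
```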

Intel Optane DC Persistent Memory Green DRAM Side

For Optane PMem to survive, and to address what Micron is responding to, Intel must do three things:

  1. Expand accessibility. The "Optane Tax" for using M and L SKUs was reduced, but it needs to disappear with the Ice Lake Xeons. Also, with Intel now behind other vendors on core counts and per-socket performance, having a high-performance storage/high-capacity memory solution limited to Intel platforms is not helping the high-end market. The Intel DC P5800X paired with the AMD EPYC 7003 "Milan" is an awesome combination, but that is only a next step.
  2. Get cost-competitive with DRAM for main memory. Here the challenge is always DRAM pricing. Expanding the use cases for Optane will help drive volume here.
  3. Get competitive with CXL persistent memory, which can be accessed across devices and vendors. This should be an "easy" path for Intel since it would simply need a CXL-to-Optane controller. By easy, I mean logically easy to conceive; I am not the one who has to design it.

That all assumes there is Optane memory in the near future, and that is, and should be, a question for Intel right now. The fab producing 3D XPoint is being sold, so where the actual wafers will come from is an open question. For its part, Intel gave us the following statement when we asked about the impact:

“Micron’s announcement doesn’t change our strategy for Intel Optane or our ability to supply Intel Optane products to our customers.” (Source: Intel statement to STH)

So it seems to be business as usual, despite the fact that the fab is being sold, which means production is something we have to take Intel's word on. Our assumption is that Micron let Intel know it would be stopping its 3D XPoint efforts and that Intel has taken mitigation steps to source supply elsewhere. Again, these are assumptions that we hope are correct, but they may not be.

Final Words

This piece started in December as an idea for a short 500-word "here is how we use Optane" article for the holiday season. It has turned into a ~4500-word article that gets one level below the top-level "Optane is like memory and flash combined" mantra. It is still woefully high-level.

Intel Optane DCPMM PMem 100 X24

Still, the fun part of 3D XPoint and Intel Optane is that it is still a relatively new technology. If you want persistent memory that is not an NVDIMM solution, this is the easiest to purchase. You can even buy Optane DIMMs on Walmart.com today, which is not something one can say of its upcoming competitors.
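
The "easiest to purchase" point extends to the software side. On Linux, App Direct-style usage can be as simple as mmap-ing a file on a DAX-mounted namespace. A minimal sketch, assuming an fsdax mount at the hypothetical path /mnt/pmem; a production deployment would use the pmem libraries instead:

```c
/* Minimal App Direct-style sketch: persist data by mmap-ing a file
 * on a DAX-mounted pmem filesystem. Assumes a hypothetical fsdax
 * mount at /mnt/pmem; error handling is kept deliberately short. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    int fd = open("/mnt/pmem/demo", O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("open"); return 1; }
    if (ftruncate(fd, 4096) != 0) { perror("ftruncate"); return 1; }

    char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    /* Stores land on the media directly; no block I/O in the path. */
    strcpy(p, "survives a power cycle once flushed");

    /* msync() is the portable flush; libpmem would use CLWB-based
     * user-space flushing for lower latency. */
    if (msync(p, 4096, MS_SYNC) != 0) { perror("msync"); return 1; }

    munmap(p, 4096);
    close(fd);
    return 0;
}
```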

If the glorious complexity of Optane Persistent Memory was not enough on the technical implementation side, we now get the commercial complexity of production capabilities. Perhaps we can thank Micron for adding mystique to the PMem 200 series that will take the technology further in the Ice Lake Xeon generation.

What is clear, having now used and deployed the technology, is that while it is more complex than one may expect, it is great. There are no worries about the capacitors or battery packs that we would have with NVDIMMs (albeit those are higher performance). We get capacity that acts as a new tier of storage. All of this happens while replacing enough legacy components to lower our node costs by thousands of dollars. The twelve DIMMs in the system and the 24x first-gen DIMMs in the photo above are heading to our hosting clusters for a reason.

The road ahead certainly has challenges. Intel may discontinue Optane at some point. Still, once you understand what persistent memory technology is and is not good for, the possibilities for the future are nothing short of tantalizing.

23 COMMENTS

  1. You’re slow on the Micron announcement coverage. I thought STH was going to skip. Then this comes out. I’m OK if you’re slow but you get to this depth.

  2. Not surprised that Micron exited – I would imagine Intel is the logical buyer for the facilities – and Intel does have its own 3DXpoint production line.

    What Micron’s exit does is completely quell the talk about AMD using the Micron supplied NVDIMMs in its servers. But with the ultra low volume of Epycs installed – would likely have only covered the coffee and donuts in the break rooms.

    With Intel moving its NAND lines to SKHynix, it looks like the former Micron fab would make sense.

  3. I thought the title was Borat too. Very nice wifiholic.

    Did anyone see the namespaces screenshot? It's Patrick's "Have an awesome day" sign-off line. I was like "oh-no-you-didn't" home skillet.

    Great article.

    Instead of doing all this newsy stuff, I'd like Patrick to just explain complex tech like this.

    Oh and Bob D – you've embarrassed yourself enough here. Move on. You're the one who said STH only gets old stuff. There's DCPMM 200 here. They did Cooper. Zeeeeeeeeeeeero credibility.

  4. Here’s one bit of complexity.

    Unlike DRAM, xpt writes do not succeed anywhere close to 100% of the time. It tries to hide this by reading after the write and redoing the failures... which is almost transparent to the programmer, except that sometimes an operation takes a few hundred more cycles than it usually does.

    Where interactive parallelism is concerned and the question is "why slow?", it is usually that when you break work into N chunks, you have to wait for the longest of the N chunks to finish. That is not like averaging, where increasing N evens things out; instead, the variation gets more violent as N increases. Once N > 64, you have moved past simple load balancing, profiling has fixed the worst problems, and further scaling is a "search-and-destroy" mission for exceptional events of that type.

    Also, if an operation gets delayed, there is more time for something to happen on other threads; you might miss the happy path for lock elision and now it is 10,000 cycles, etc.

    Thinking as a high-performance programmer, xpt doesn't feel like DRAM to me. As a low-performance programmer who uses one Python lambda function to turn on a light bulb, it does.

    Speaking of threads, what happens to the memory controller and cache system when an Optane write is stuck? Is everything else running on all cylinders? Or could other memory transactions (for instance to the same Optane stick, or to another Optane, or to DRAM) get delayed?

    Intel seems to have basically gotten the math right for the memory controller but what opportunities were missed along the way? Would Intel have served customers better if it was focused 100% on DRAM?

    xpt always seemed to be from another planet where you have a tablet with 32 GB of xpt and an e-ink screen; it is like the Atmel program memory cell that made the AVR8 a legend, and it ought to be taking computers to "where none have gone before" instead of trying really hard to outcompete two world-beating technologies at the same time.

  5. I mean, odds are Intel will just… buy Micron's factory, no? They just committed to doing more die manufacturing in-country; it's a no-brainer.

  6. Thank you STH / Patrick for this article!

    As an old computer science professor said: "Computing is the science/art of moving bytes." The rest is just more explanations.

  7. Andrew, I would have presumed that the SKHynix deal would have involved targeting their fabs for future production?

  8. domih,

    It's in Knuth's TAOCP: there is nothing in computing that is unintelligible to a railroad sidings and turntable master.

    I paraphrase, but that's essentially it.

  9. Of course nothing stops Optane/3DXP turning up with a CXL or other serial interface, or being integrated by server manufacturers to buffer GPU memory…

  10. It belatedly occurs to me that Optane may be on life support because of government and military supply contracts, which require long commitments and support costs possibly including very-long-life SKU resupply. That, together with the very quiet position of Micron throughout this history, could be explained by the legal necessity of dual sourcing.

  11. You really love your Optane, but you don't really do research on the memory business:

    1. Going back to IMFT, it was Micron that did almost all of the R&D for both NAND and 3D XPoint, and Intel paid Micron handsomely on a quarterly basis. Look at the 10-K and 10-Q filings of that time. Intel has very little memory R&D.

    2. “Micron’s announcement doesn’t change our strategy for Intel Optane or our ability to supply Intel Optane products to our customers”

    That’s the same stuff they said aout NAND when IMFT split but Intel did no further development. Intel remained on floating gate tech and the only thing they did was to add another stack on top to get to 144L (even that was probably Micron#s development but they opted for the move to charge trap).

    Then they sold all their NAND fabs and controller tech to Hynix, but they are still not allowed to sell the technology that Micron developed, so the actual transaction for the NAND fabrication and IP has to wait until March 2025.

    3. Intel is most definitely not buying the Lehi fab from Micron.
    It is very unlikely that Micron would make such an announcement if they were expecting a sale to Intel. They would have waited until they could just say that they had sold all of 3D XPoint to Intel.

    Micron has also moved a lot of tools out to other (DRAM) fabs due to "underutilization". It was announced during an earnings call not too long ago. If Intel had shown interest in buying this fab, Micron would have kept all the tools there and sold it as a 3D XPoint fab.

    4. You’re comparisons to “todays” QLC-NAND are just weird. If any competitor would see a market for it they could very easily develop SLC NAND chips with very short string lengths (z-nand and xl-flash did that partially). These chips would be much more expensive than todays standard TLC/QLC chips but they would still be much cheaper than Optane. Micron is seeing that cost problem (paragraph “on the storage front”) but you are not. Optane has no place in storage.

    5. Power-loss protection for SSDs is not considered a challenge by anyone; it is an absolute standard and working fine.

  12. “ That ratio is a big challenge though. For example, if one wants to use 128GB DIMMs with a 4:1 ratio, then one needs to populate a 32GB RDIMM and a 128GB PMem module in the same memory channel.”

    I remember Intel taking a stab at using Optane caches with slower storage devices (after a similar attempt at using NAND caches with HDDs; both seem to have sunk into obscurity fairly quickly): is there, or has anyone proposed, an option that moves the Optane right onto the same DIMM as the DRAM it is serving as a cache for, in order to reduce the complexity of populating the memory channels? Or is that simply not supported by the memory controller, and/or a nightmare of SKU proliferation, since the desired DRAM/Optane ratio is now baked into the DIMM rather than being configurable by purchasing a selection of DIMMs to suit?

    Aside from all that, the Optane application that seems like it could be really cool is in very-low-power/often-sleeping applications.

    If you take a nonvolatile RAM-like technology and move it all the way into the CPU or MCU (cache, registers, everything, not just RAM), you get a device that no longer needs a distinct sleep/wake cycle or a boot process (except for the first time, or if you need to flush problematic state and start fresh): if you have enough power to execute at least one instruction to completion you run; otherwise you stop.

  13. “CXL is not a replacement for 3D XPoint”

    CXL can very well be a replacement for 3DXPoint: CXL allows large amounts of DRAM (dozens of TB) to be accessed fast and with low latency. DRAM capacity won’t be limited by channels and ranks anymore.

    In general I think you are overstating the benefits of Optane persistency in in-memory DB applications, as most real-world DB usages don't have enough writes/merges to be limited by stores/logs on SSDs.

    SAP HANA runs just fine on DRAM/SSD combos, and Optane is barely used. It's mostly used when additional DRAM capacity gets extremely expensive (2DS 256GB DIMMs) or when that capacity is impossible to achieve with DIMMs. CXL could really help with more linear DRAM capacity-cost.

  14. @Patrick “Lasertoe – CXL is like a transport link. 3D XPoint is a type of media.”

    Yes, it is a transport link. The CXL.mem protocol (coherent) and CXL/PCIe 5.0-to-DDR4/DDR5 controllers with lots of DIMMs can replace 3D XPoint for many applications (it seems to me that Micron is working on something just like that):

    You can have much more DRAM connected to your CPU this way. Example:
    Instead of just having 8 DDR channels connected to DIMMs, you can have 8 or 16 CXL links connected to memory controllers with 8 DDR channels each. That will allow a multiple of the DRAM capacity that is possible today.

    That will eliminate any capacity advantage of Optane. It will just be about DRAM cost and whether (full) persistency is really necessary or not.

  15. Bob Dobbs, Intel does not have its own 3D X-Point production line. The only production line is Micron’s.

  16. A useful writeup on the tech, though anyone interested could have found most of this out a while ago; it's just nice to see it all in a Google-searchable article.

    Optane for ZIL/SLOG is not a game-changer for the industry, which on the whole doesn't even consider ZFS a key storage layer in any respectable high-end stack; what the kids play with doesn't bring home the golden eggs, I'm afraid.

    However, this will inevitably flood the market with remaining Optane stock at affordable prices; we're already seeing the trend over in the Chinese B2B segment. That is a positive for the aforementioned kids.

    CXL is most definitely the way forward. Micron made a wise decision; to be frank, I'm surprised they held on as long as they did.

    It will be interesting to see where Intel goes next; they're losing ground and direction across most sectors currently.

  17. Intel needs to stick to a segment rather than giving up every time it feels a little pressure.

    Optane has a lot of potential. It needs time to grow, and they need to work not only on getting it to work with all their Xeons, but on making it work more like regular DDR so it can be used on AMD and ARM systems that are willing to support it.

    The P5800X with it’s second generation 3D XPoint improves sequential read/write performance by 3x. If we get the same with “Crow Pass” 3rd Gen Optane PM DIMMs, we’ll have 20GB/s read and 10GB/s writes, and also get the latency even lower.

    They said the 3rd Gen DIMMs are an inflection point, so we'll see in which direction it evolves. I really hope they fully open it up.

    You know how big companies making acquisitions always claim they'll treat the acquisition as a separate entity? They almost always don't, and the main company meddles too much, causing it to fail.

  18. Is it possible to use PMem 200 with 2nd Gen? Then you would at least be able to run at 2933 as designed.
