Key Lessons Learned
The addition of CXL memory is a game-changer here. Without CXL Type-3 devices, you are stuck making a trade-off. If you want more memory capacity, you can use higher-density (more expensive) DIMMs. Alternatively, you can sacrifice node density and memory speed and move to 24-DIMM-per-socket 2DPC servers. The final option is to move to in-line dual-CPU configurations, which are much harder to cool and which add a lot of power consumption and cost with the second CPU socket. Now, with the ASUS RS520QA-E13-RS8U, you can add memory capacity without sacrificing speed or density.

CXL memory operating at DDR5-4400 speeds and presented as a remote NUMA node is certainly not the highest-performing option. On the other hand, it strikes a great balance of performance, cost, and density.
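
For those wondering how software would actually use that capacity, here is a minimal sketch using libnuma on Linux. It assumes the CXL expander shows up as a CPU-less NUMA node; the node number 1 below is a placeholder, so check `numactl --hardware` on the actual system before using it.

```c
/* Minimal sketch: placing an allocation on a CXL-backed NUMA node with libnuma.
 * Assumes the CXL Type-3 expander appears as a CPU-less NUMA node; node 1 is a
 * placeholder value for this example. Build with: gcc cxl_alloc.c -lnuma */
#include <numa.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not available on this system\n");
        return 1;
    }

    int cxl_node = 1;            /* hypothetical CXL memory node */
    size_t size = 1UL << 30;     /* 1 GiB test buffer */

    /* Bind this allocation to the CXL node so latency-sensitive data can stay
     * on the local DDR5 node while capacity spills over to CXL. */
    void *buf = numa_alloc_onnode(size, cxl_node);
    if (!buf) {
        perror("numa_alloc_onnode");
        return 1;
    }

    memset(buf, 0, size);        /* fault the pages in on the CXL node */
    printf("Allocated %zu bytes on NUMA node %d\n", size, cxl_node);

    numa_free(buf, size);
    return 0;
}
```

The same placement can be done without code changes by running an unmodified application under `numactl`, which is how many memory-tiering experiments start.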

Utilizing CXL, we get more memory capacity on a single CPU while still reaping the 2U 4-node density benefits. When we reviewed 2U4N servers previously, many STH readers and OEMs said that a common reason to use two lower core count CPUs instead of one was simply to get more DIMM slots. This also sidesteps the modern challenge of cooling a second in-line CPU in a half-width 1U node.

To be clear, this is not the perfect solution for every application. At the same time, having eight CXL DIMMs of expandability per node, or adding 32 DIMMs to a 2U chassis that would normally hold 48 DIMMs, is significant for those who need more memory capacity.

If nothing else, this was a cool server because it shows something else. Many hyperscalers are not deploying CXL in PCIe cards or EDSFF form factors. Instead, they are using cards like these to provide better cooling to the modules. (See an example we showed recently.) This is one of the first non-hyperscale systems we have seen, outside of one scale-up system, that incorporates this type of CXL integration.
STH Server Spider: ASUS RS520QA-E13-RS8U
In the second half of 2018, we introduced the STH Server Spider as a quick reference to where a server system’s aptitude lies. Our goal is to give a quick visual depiction of the types of parameters that a server is targeted at.

What is interesting about this server is that a 2U 4-node AMD EPYC 9005 platform is notable in its own right. Adding the additional memory via CXL skews the memory density above the CPU density because it removes the need for a second socket per node just to add more memory. On the other hand, this is not a platform with maximum NVMe storage or I/O capacity.
Final Words
Personally, I have become a big fan of single-socket AMD EPYC platforms. When we deploy them, there is always a question: do we move to larger memory modules, do we move to 2DPC, or do we even need that many slots to begin with? The ASUS RS520QA-E13-RS8U offers something really different in the 2U 4-node form factor, which is neat. It is also one of the most unique servers with CXL Type-3 memory expansion devices that we have seen.

Hopefully, our STH community likes this look at a unique CXL server. It was a great opportunity to show off what one of these can do. When we saw it on a trip to Taipei earlier this year, I was very excited to bring it to you.

In the future, expect many cool CXL designs to start making their way into mainstream servers.
It’s nice, but I much prefer CXL memory expansion that is sharable between all the CPUs in a box, with the 32 DIMMs dynamically allocated between the four nodes, rather than the simpler fixed expansion here of eight DIMMs per node times four. For examples, see the STH articles “Lenovo Has a CXL Memory Monster with 128x 128GB DDR5 DIMMs” and “Inventec 96 DIMM CXL Expansion Box at OCP Summit 2024 for TBs of Memory”.