CXL Paradigm Shift ASUS RS520QA-E13-RS8U 2U 4-Node Server Review

Key Lessons Learned

The addition of CXL memory is a game-changer here. Without CXL Type-3 devices, you are stuck making a trade-off. If you want more memory capacity, you can use higher-density (more expensive) DIMMs. Alternatively, you can sacrifice node density and memory speed and move to 2DPC servers with 24 DIMMs per socket. The final option is to add a second CPU socket and move to dual-CPU configurations with in-line processors that are much harder to cool and add significant power consumption and cost. Now, with the ASUS RS520QA-E13-RS8U, you can add memory capacity without sacrificing speed or density.

ASUS RS520QA E13 RS8U Lspci Cxl

CXL memory operating at DDR5-4400 speeds and in a remote NUMA node is certainly not the highest-performing option. On the other hand, it strikes a great balance of performance, cost, and density.
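For those who want to see how this looks from the OS side, here is a minimal Python sketch (our illustration, not an ASUS or vendor tool) that walks the standard Linux sysfs NUMA entries and flags CPU-less nodes. Once CXL Type-3 memory is onlined as system RAM, it typically shows up as exactly that kind of memory-only remote NUMA node.

```python
#!/usr/bin/env python3
"""Minimal sketch: list NUMA nodes and flag CPU-less ones.

CXL Type-3 memory that has been onlined as system RAM typically
appears as a memory-only (CPU-less) NUMA node on Linux. This only
reads standard sysfs files; run it on the target system.
"""
from pathlib import Path

NODE_ROOT = Path("/sys/devices/system/node")

def node_mem_kib(node: Path) -> int:
    # A MemTotal line in .../nodeN/meminfo looks like:
    # "Node 2 MemTotal:  134217728 kB"
    for line in (node / "meminfo").read_text().splitlines():
        if "MemTotal" in line:
            return int(line.split()[3])
    return 0

for node in sorted(NODE_ROOT.glob("node[0-9]*"), key=lambda p: int(p.name[4:])):
    cpulist = (node / "cpulist").read_text().strip()
    mem_gib = node_mem_kib(node) / (1024 * 1024)
    kind = "memory-only (likely CXL)" if not cpulist else "CPU-attached"
    print(f"{node.name}: cpus=[{cpulist or 'none'}] mem={mem_gib:.1f} GiB  {kind}")
```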

ASUS RS520QA E13 RS8U Topology

Utilizing CXL, we get more memory capacity on a single CPU while still reaping the 2U 4-node density benefits. When we reviewed 2U4N servers previously, many STH readers and OEMs said that a common reason to use two lower core count CPUs instead of one was simply to get more DIMM slots. This approach also sidesteps the modern challenge of cooling a second in-line CPU in a half-width 1U node.
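As a hedged example of how one might actually consume that extra capacity, the sketch below simply wraps numactl (assuming it is installed) to launch a workload with its threads pinned to the local socket while its memory pages are interleaved across the local DDR5 node and the CXL memory node. The node IDs are placeholders; use whatever numactl --hardware or the sysfs sketch above reports on your system.

```python
#!/usr/bin/env python3
"""Sketch: run a workload with pages interleaved across local DDR5
and a CXL memory-only NUMA node via numactl (assumed installed).

LOCAL_NODE and CXL_NODE are example values only; check
`numactl --hardware` for the real node IDs on your system.
"""
import subprocess
import sys

LOCAL_NODE = 0  # CPU-attached DDR5 NUMA node (example)
CXL_NODE = 1    # memory-only CXL NUMA node (example)

def run_with_cxl_interleave(cmd: list[str]) -> int:
    numactl_cmd = [
        "numactl",
        f"--cpunodebind={LOCAL_NODE}",            # keep threads on the local socket
        f"--interleave={LOCAL_NODE},{CXL_NODE}",  # spread pages across DDR5 + CXL
        *cmd,
    ]
    return subprocess.call(numactl_cmd)

if __name__ == "__main__":
    # Example: python3 run_on_cxl.py ./my_memory_hungry_app --flags
    sys.exit(run_with_cxl_interleave(sys.argv[1:] or ["true"]))
```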

ASUS RS520QA E13 RS8U Heatsink Heatpipes

To be clear, this is not the perfect solution for every application. At the same time, having eight CXL DIMM slots of expandability per node, or 32 additional DIMMs in a 2U chassis that would otherwise top out at 48, is significant for those who need more memory capacity.
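To put rough numbers on that, here is a quick back-of-the-envelope calculation. The slot counts come from this system (12 native DIMM slots and 8 CXL DIMMs per node); the 64GB module size is purely an assumption for illustration.

```python
# Back-of-the-envelope per-node capacity math. DIMM size is an
# illustrative assumption, not a tested or validated configuration.
NATIVE_SLOTS_PER_NODE = 12   # 48 DIMMs per 2U chassis across 4 nodes
CXL_SLOTS_PER_NODE = 8       # CXL Type-3 expansion DIMMs per node
DIMM_GB = 64                 # assumed module size

native_gb = NATIVE_SLOTS_PER_NODE * DIMM_GB
cxl_gb = CXL_SLOTS_PER_NODE * DIMM_GB
print(f"Native: {native_gb} GB + CXL: {cxl_gb} GB = "
      f"{native_gb + cxl_gb} GB per node (+{cxl_gb / native_gb:.0%})")
```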

ASUS RS520QA E13 RS8U Rear CXL Modules 2

If nothing else, this was just a cool server because it shows something else. Many hyper-scalers are not deploying CXL in PCIe add-in cards or EDSFF form factors. Instead, they are using cards like these that provide better cooling to the modules. (See an example we showed recently.) Aside from one scale-up system, this is one of the first non-hyper-scale systems we have seen that incorporates this type of CXL integration.

STH Server Spider: ASUS RS520QA-E13-RS8U

In the second half of 2018, we introduced the STH Server Spider as a quick reference to where a server system’s aptitude lies. Our goal is to start giving a quick visual depiction of the types of parameters that a server is targeted at.

STH Server Spider ASUS RS520QA-E13-RS8U

A 2U 4-node AMD EPYC 9005 platform is interesting on its own. Adding the additional memory via CXL skews the memory density beyond the CPU density because it removes the need for a second socket per node just to add more memory. On the other hand, this is not a platform built for maximum NVMe storage or I/O capacity.

Final Words

Personally, I have become a big fan of single-socket AMD EPYC platforms. When we deploy them, there is always a question of whether we should move to larger memory modules, move to 2DPC, or whether we even need that many slots to begin with. The ASUS RS520QA-E13-RS8U offers something really different in the 2U 4-node form factor, which is neat. It is also one of the most unique CXL Type-3 memory expansion servers that we have seen.

ASUS RS520QA E13 RS8U Rear Angle

Hopefully, our STH community likes this look at a unique CXL server. It was a great opportunity to show off what one of these can do. After seeing it on a trip to Taipei earlier this year, I was very excited to be able to bring this to you.

Patrick With ASUS RS520QA E13 RS8U Running In Taipei

In the future, expect many cool CXL designs to start making their way into mainstream servers.

1 COMMENT

  1. It’s nice, but I much prefer CXL memory expansion that is sharable between all the CPUs in a box, rather than this much simpler expansion where each node gets its own fixed block of memory (8 DIMMs per node times 4) instead of the 32 DIMMs being splittable (dynamically allocated) between the 4 nodes. Examples in STH articles: “Lenovo Has a CXL Memory Monster with 128x 128GB DDR5 DIMMs” and “Inventec 96 DIMM CXL Expansion Box at OCP Summit 2024 for TBs of Memory”.
