At Hot Chips 29 (2017), AMD presented new details about its newest server chip. Two of them stood out as interesting to STH readers: the Infinity Fabric update and the Multi-Chip Module (MCM) cost savings. If you want to read more in depth, check out our AMD EPYC 7000 Series Architecture Overview for Non-CE or EE Majors.
AMD EPYC Infinity Fabric Update
During our testing, one area where we noticed a significant delta in AMD EPYC Infinity Fabric performance was on-package versus socket-to-socket bandwidth. See AMD EPYC Infinity Fabric Latency DDR4 2400 v 2666: A Snapshot. At Hot Chips, AMD revealed a key detail in terms of the die-to-package topology: there are four die-to-die Infinity Fabric links per die, but only three are used.
The key reason for this is to maintain a high-performance on-package link with minimal trace lengths. We were told in a private briefing that the above is an artist's rendition, but one based on AMD's actual die layout. One can see that only three Infinity Fabric links are connected on each die.
For those wondering about dual socket configurations, the G0-G3 I/O links are used for socket-to-socket communication.
Also interesting about that diagram is the DDR4 linkage. You will notice that the channels are represented as MA, MD, MC, MB instead of MA, MB, MC, MD working from the outside in. MA and MB are connected to one die while MC and MD are connected to another.
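To make the on-package topology concrete, here is a minimal Python sketch of the die-to-die links. It assumes four dies per package (the count is not stated above, and the die numbering 0-3 is purely illustrative) and that each die uses its three active links to reach each of the other dies directly. It is an illustration of the arithmetic, not AMD's actual routing.

```python
from itertools import combinations

NUM_DIES = 4            # assumption: four dies per EPYC 7000 package
LINKS_AVAILABLE = 4     # die-to-die links present on each die (per AMD)

def full_mesh(num_dies):
    """Return every unordered pair of dies, i.e. one direct link per pair."""
    return list(combinations(range(num_dies), 2))

def links_used_per_die(links, num_dies):
    """Count how many links terminate on each die."""
    counts = [0] * num_dies
    for a, b in links:
        counts[a] += 1
        counts[b] += 1
    return counts

links = full_mesh(NUM_DIES)
used = links_used_per_die(links, NUM_DIES)

for die, n in enumerate(used):
    print(f"die {die}: {n} of {LINKS_AVAILABLE} die-to-die links used")
print(f"total on-package links: {len(links)}")
```

Under those assumptions, a fully connected four-die package needs only three links per die, which is consistent with one of the four links on each die being left unconnected.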
AMD EPYC MCM v. Monolithic Cost Savings
Here is AMD’s comparison, using its modeled production costs, of going MCM versus a monolithic die at 32 cores. Key here is that adding transistors for interconnects adds cost, as do redundant features. For example, each die has a server controller hub, but only one is fully used.
In the end, to enter the market, the MCM approach makes sense. One of the major topics at the conference today was MCM, with Intel’s morning paper, “Heterogeneous Modular Platform,” itself advocating its EMIB interconnect for MCM designs. AMD claims that manufacturing an MCM 32-core package costs 59% as much as manufacturing a monolithic 32-core die. This figure includes the approximately 10% area overhead for MCM-related components.
At the end of the day, monolithic dies do provide high performance, but we are moving towards an MCM world if you believe just about every relevant presentation at Hot Chips 29.
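For intuition on why several smaller dies can cost less despite the extra area, here is a rough Python sketch using a simple Poisson yield model. This is not AMD's cost model: the monolithic die area and the defect density below are placeholder assumptions, and the sketch ignores packaging, test, and wafer-edge effects. Only the 10% area overhead comes from AMD's presentation.

```python
import math

def poisson_yield(area_mm2, defects_per_cm2):
    """Fraction of defect-free dies under a simple Poisson defect model."""
    return math.exp(-(area_mm2 / 100.0) * defects_per_cm2)

def cost_per_good_die(area_mm2, defects_per_cm2):
    """Relative silicon cost per good die: area consumed divided by yield."""
    return area_mm2 / poisson_yield(area_mm2, defects_per_cm2)

def mcm_cost_ratio(mono_area_mm2, n_dies, overhead, defects_per_cm2):
    """Silicon cost of n small dies relative to one monolithic die."""
    # Total MCM silicon is the monolithic area plus the MCM overhead,
    # split evenly across the dies.
    small_area = mono_area_mm2 * (1.0 + overhead) / n_dies
    mcm = n_dies * cost_per_good_die(small_area, defects_per_cm2)
    mono = cost_per_good_die(mono_area_mm2, defects_per_cm2)
    return mcm / mono

MONO_AREA_MM2 = 700.0   # assumption: hypothetical monolithic 32-core die
N_DIES = 4              # assumption: four dies per package
OVERHEAD = 0.10         # approximately 10% area overhead, per AMD
DEFECT_DENSITY = 0.1    # assumption: defects per cm^2, illustrative only

ratio = mcm_cost_ratio(MONO_AREA_MM2, N_DIES, OVERHEAD, DEFECT_DENSITY)
print(f"MCM silicon cost relative to monolithic: {ratio:.2f}x")
```

With a defect density of zero the MCM costs more, by exactly the area overhead. As defect density rises, the yield advantage of small dies outweighs that overhead, which is the trade-off behind AMD's claim. The exact ratio depends heavily on the assumed inputs.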