AMD EPYC 7002 Microarchitecture Updates
The new generation uses the AMD Zen 2 microarchitecture, which is similar in many ways to the Ryzen 3000 series desktop parts. Sharing a core design across desktop and server is part of how AMD is accelerating its design cycles. We are going to cover a few of the microarchitectural updates in this section.
First, AMD added a number of new architecture features in Zen 2. Two we did not see are AVX-512 and VNNI (Intel markets the latter as DL Boost). AMD is pursuing a “fast follower” strategy. Essentially, it is letting the software ecosystem adopt new instructions on Intel chips first. AMD then plans to introduce instructions as the software ecosystem integrates them. That way, AMD customers are paying for instructions that are widely used, rather than instructions that may only be used in the future.
AMD’s Zen 2 is a derivative of the original Zen, but AMD says it has attained around a 15% IPC increase. This is a big deal. A 15% increase is at the higher end of the generation-over-generation improvements we have become accustomed to over the last decade. That 15% has also come in conjunction with a doubling of CPU core counts. AMD is getting a lot of benefit from moving to 7nm.
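To put the two figures above together, here is a rough back-of-the-envelope sketch of how a 15% IPC gain and a doubling of cores could compound at the socket level. It assumes equal clock speeds and perfect scaling across cores, which real workloads rarely achieve, so treat the result as an upper bound rather than a measured number. The function name and the clock_ratio parameter are our own illustration, not anything from AMD.

```python
def relative_socket_throughput(ipc_gain, core_multiplier, clock_ratio=1.0):
    """Idealized throughput ratio of a new socket versus the old one.

    ipc_gain:        fractional IPC improvement (0.15 means 15%)
    core_multiplier: new core count divided by old core count
    clock_ratio:     new clock divided by old clock (assumed 1.0 here)
    """
    return (1.0 + ipc_gain) * core_multiplier * clock_ratio


# AMD's claimed ~15% IPC uplift combined with twice the cores.
zen2_vs_zen = relative_socket_throughput(ipc_gain=0.15, core_multiplier=2.0)
print(f"Idealized Zen 2 vs. Zen socket throughput: {zen2_vs_zen:.2f}x")
```

Under those assumptions the ceiling is about 2.3x per socket. Memory bandwidth, clocks, and software scaling will all pull real results below that.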
Zen 2 is deeper and wider on the integer compute side than the original Zen architecture, which helps integer performance. We are going to let you read the slide for more information here.
On the floating-point side, AMD has also doubled the execution capabilities. AMD still does not have AVX-512, but between adding more execution resources per core and doubling core counts, AMD is on a significant improvement path.
Keeping all of these cores fed is not a simple task. More execution units mean AMD needs to deliver instructions and data to the cores more effectively than in previous generations. One of the big features is a better branch predictor. AMD brought the TAGE branch predictor forward from the next-generation design, codenamed “Milan,” into this generation. Better branch prediction means fewer mispredicted branches, so the pipeline stays primed.
Additionally, caches are bigger with lower latency. This is something AMD could add with the shift to 7nm and the additional density it provides.
To reiterate, AMD’s higher-cache parts now have up to 256MB of L3 cache. Combined with 512KB of L2 cache per core, that means AMD is putting roughly 300MB of cache onto its chips. That is enormous compared to what Intel is putting on its chips.
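For those who want to check the ~300MB figure, here is a quick sketch of the arithmetic. It assumes a 64-core part (the top of the EPYC 7002 stack) and counts only L2 and L3, leaving out the much smaller L1 caches. The constant and function names are ours, for illustration only.

```python
L3_TOTAL_MB = 256      # up to 256MB of L3 on the higher-cache parts
L2_PER_CORE_KB = 512   # 512KB of L2 per core
CORES = 64             # assumption: top-bin 64-core part


def total_cache_mb(l3_total_mb, l2_per_core_kb, cores):
    """Sum L3 plus aggregate L2, in MB (L1 is ignored in this sketch)."""
    l2_total_mb = l2_per_core_kb * cores / 1024
    return l3_total_mb + l2_total_mb


print(f"L2 + L3 total: {total_cache_mb(L3_TOTAL_MB, L2_PER_CORE_KB, CORES):.0f}MB")
```

That works out to 32MB of L2 plus 256MB of L3, or 288MB, which is where the roughly 300MB figure comes from.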
This was only a brief microarchitectural overview, but the takeaway is that AMD is increasing IPC along with core counts.
Next, we are going to look at some of the topology impacts of this new design.