AMD EPYC 7F72 Benchmarks and Review 24 Cores

AMD EPYC 7F72 Cover

The AMD EPYC 7F72 is a 24-core frequency optimized part. Although AMD is marketing this as one of its “F” parts, there is a key difference: AMD is pricing this SKU at a relatively modest premium compared to the other “F” parts. In many ways, it seems more like the next SKU up the stack than a pure frequency optimized part. With a 10-15% clock speed boost and 50% more cache, there is a lot to like about AMD’s newest 24 core / 48 thread part. In our benchmarks and review piece, we are going to show what has changed and what the performance impacts are.

AMD EPYC 7F72 Overview

Key stats for the AMD EPYC 7F72: 24 cores / 48 threads with a 3.2GHz base clock and 3.7GHz turbo boost. There is 192MB of onboard L3 cache. The CPU features a 240W TDP. These are $2,450 list price parts.

Here is the lscpu output for the AMD EPYC 7F72:

AMD EPYC 7F72 Lscpu Output

Perhaps one of the biggest changes is the TDP, which increases from 180W on the AMD EPYC 7402 to 240W. That extra TDP headroom helps provide the budget for an additional 400MHz of base clock and 350MHz of turbo clock. Another way to think of that is that the base clock of the EPYC 7F72 is only 150MHz lower than the turbo clock of the EPYC 7402. What is more, although we still have 24 cores, we now get 192MB of L3 cache, which seems to indicate a change in the number of active chiplets on the part.
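As a quick sanity check on those numbers, here is a minimal Python sketch of the arithmetic. The EPYC 7402 clocks are not stated directly above; they are back-calculated from the 400MHz and 350MHz deltas, and the dictionary and function names are ours for illustration only.

```python
# Clock (MHz) and TDP (W) figures. The EPYC 7402 clocks are derived from the
# deltas quoted in the text (400MHz base, 350MHz turbo), not measured here.
EPYC_7F72 = {"base_mhz": 3200, "turbo_mhz": 3700, "tdp_w": 240}
EPYC_7402 = {"base_mhz": 3200 - 400, "turbo_mhz": 3700 - 350, "tdp_w": 180}


def percent_gain(new, old):
    """Return the relative increase of new over old, in percent."""
    return (new - old) / old * 100.0


base_gain = percent_gain(EPYC_7F72["base_mhz"], EPYC_7402["base_mhz"])
turbo_gain = percent_gain(EPYC_7F72["turbo_mhz"], EPYC_7402["turbo_mhz"])
tdp_gain = percent_gain(EPYC_7F72["tdp_w"], EPYC_7402["tdp_w"])

# Gap between the 7402's turbo clock and the 7F72's base clock.
turbo_to_base_gap_mhz = EPYC_7402["turbo_mhz"] - EPYC_7F72["base_mhz"]

print(f"Base clock gain:  {base_gain:.1f}%")   # 14.3%
print(f"Turbo clock gain: {turbo_gain:.1f}%")  # 10.4%
print(f"TDP increase:     {tdp_gain:.1f}%")    # 33.3%
print(f"7402 turbo minus 7F72 base: {turbo_to_base_gap_mhz}MHz")  # 150MHz
```

The roughly 14% base and 10% turbo gains line up with the 10-15% clock speed boost mentioned earlier, while the TDP rises by about a third.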

To illustrate the L3 cache change, here is the topology of the EPYC 7F72:

AMD EPYC 7F72 Topology

One can see that each core has two threads along with 32KB L1i and 32KB L1d caches and its own 512KB of L2 cache. The big change happens with the L3. Here we see 16MB of L3 cache shared across two cores. In the EPYC 7402(P) we had 16MB of L3 cache shared directly among three cores:

AMD EPYC 7402P In Gigabyte R272 Z72 NVMe Topology

What this seems to indicate is that while the AMD EPYC 7402(P) has three cores active per CCX, the EPYC 7F72 has two. With 16MB of L3 per CCX, that works out to 8MB per core instead of about 5.3MB, so each core gets around 50% more L3 cache within its CCX before having to go to the I/O die in AMD’s design.
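The cache math can be sketched in a few lines of Python. This assumes, as the topology output suggests, 16MB of L3 per CCX and an even split of the 24 cores across CCXs; the function name `l3_layout` is hypothetical.

```python
from fractions import Fraction

L3_PER_CCX_MB = 16   # L3 slice shared by the cores in one CCX
TOTAL_CORES = 24     # both the EPYC 7F72 and the EPYC 7402(P)


def l3_layout(active_cores_per_ccx):
    """Return (CCX count, total L3 in MB, L3 per core in MB)."""
    ccx_count = TOTAL_CORES // active_cores_per_ccx
    total_l3_mb = ccx_count * L3_PER_CCX_MB
    l3_per_core_mb = Fraction(L3_PER_CCX_MB, active_cores_per_ccx)
    return ccx_count, total_l3_mb, l3_per_core_mb


ccx_7f72, total_7f72, per_core_7f72 = l3_layout(2)  # two cores per CCX
ccx_7402, total_7402, per_core_7402 = l3_layout(3)  # three cores per CCX

# Relative L3-per-core advantage of the 7F72, in percent.
l3_gain_percent = float((per_core_7f72 / per_core_7402 - 1) * 100)

print(f"EPYC 7F72: {ccx_7f72} CCXs, {total_7f72}MB L3, {float(per_core_7f72):.2f}MB/core")
print(f"EPYC 7402: {ccx_7402} CCXs, {total_7402}MB L3, {float(per_core_7402):.2f}MB/core")
print(f"L3 per core gain: {l3_gain_percent:.0f}%")
```

With two active cores per CCX, the sketch arrives at 12 CCXs and the 192MB total quoted above, which is consistent with more active chiplets than on the EPYC 7402.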

Putting this part into some context, here is what the new 8-24 core, dual-socket capable SKU stack looks like from AMD.

AMD EPYC 7Fx2 Launch SKUs With Other 24C And Lower Chips

As you can see, these new 7Fx2 chips pair their higher clock speeds with strikingly higher TDPs and significantly higher prices per core than the alternative options. AMD is able to do this because it is offering significantly more performance per core. Here is AMD’s slide with the above subset of CPUs showing its competitive impact using dual-socket SPECrate2017_int_base, which is a widely used benchmark for data center purchasing.

AMD EPYC 7Fx2 Launch Slides Performance Per Dollar

Here AMD is aiming for leadership performance, as well as leadership performance per dollar. We are going to discuss this more in our market impact section, but let us be clear, AMD is now extracting a premium for this capability. Here are the new SKUs in the context of other AMD EPYC SKUs on a USD list price per core basis:

AMD EPYC 7Fx2 Launch SKUs Value Analysis Cost Per Core

As one can see, the 7F72 is in line with higher-core count parts, making it an easier alternative to stepping up the core count. If you compare this to the AMD EPYC 7F52 and EPYC 7F32, the price per core looks a lot more reasonable here.
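For readers who want to reproduce the metric, here is a minimal sketch of the list price per core calculation in Python. Only the EPYC 7F72 figures come from this review; the helper name `price_per_core` is ours, and prices for other SKUs would need to be filled in from AMD's list.

```python
def price_per_core(list_price_usd, cores):
    """Return the list price divided by the core count, in USD per core."""
    if cores <= 0:
        raise ValueError("cores must be a positive integer")
    return list_price_usd / cores


# EPYC 7F72: $2,450 list price and 24 cores, as quoted in the overview.
epyc_7f72_per_core = price_per_core(2450, 24)
print(f"EPYC 7F72: ${epyc_7f72_per_core:.2f} per core")  # $102.08 per core
```

Running the same helper against other SKUs in the stack is one way to rebuild the chart above.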

In our benchmarks, we are going to see what the premium looks like. First, we are going to take a look at the test configurations and how we received the chips.

9 COMMENTS

  1. I know STH doesn't normally write anything on stocks and quarterly reports. But I do wish STH would write a piece that suggests why EPYC is not doing as well as most of us expected it to. It has yet to break the 10% shipment barrier, all while Intel is posting record quarters YoY in DC and HPC. That is all at a time when Intel 14nm is operating at full capacity and is one node behind.

    Most have been suggesting these things take time, but it has already been a year of Zen 2 and it shows no signs of improvement.

  2. Can we have the single threaded results for the UnixBench Dhrystone 2 and Whetstone Benchmarks on a separate axis or chart?

    For a part that is specifically targeting better single threaded performance, it would be nice to have a little more focus on that aspect. While a lot of problem domains enjoy multi-threaded environments, there are still others which focus on raw single threaded performance. A comparison here to Intel’s similar offerings would be very beneficial.

    A great article as usual and I very much enjoy reading STH content, hopefully the feedback will be helpful. Thanks

  3. @Stephen,
    Given that this and the rest of the CPUs are intended for use in server system, why would single threaded performance matter?
    Servers are purchased/justified on the basis that they provide resources for a number of tasks, so they are never running “one thing”. Further, single threaded processes would not be representative of real usage, since each CPU has its own inherent trade-offs: higher base clocks vs. more cores/threads vs. TDP.

  4. Again an irrelevant review of a core frequency optimized part.
    All your benchmarks show that more cores are better than fewer but frequency optimized cores.
    I sound like a broken record player repeating the same request over and over….

    Please add variations of the benchmarks where there are 4 – 8 threads active!
    That will show what these parts are made for and will also show the value of turbo modes, high TDP.
    It may also show the only remaining performance reason to pick Intel in 2020.

  5. @BinkyTo,
    Latency sensitive applications care about single-threaded performance. We run multiple processes per server, so there is of course a trade-off between single-threaded performance and the number of cores available that we have to make. However, doubling the number of cores while taking 20% off the clock speed is going to make us go slower, not faster.

    Details about turbo modes, all-core and subset core would also be very interesting and information is often difficult to find.

  6. This processor is for use with Oracle or SQL Server, too. THAT is what datacenter servers do. In those scenarios, doubling cores while dropping GHz could very well be a WIN.
