A Journey to Next-Gen Arm Neoverse N1 and E1 Cores


State of the Arm Neoverse N1 and Testbed

What you are seeing here is actual Arm Neoverse N1 silicon. This is in the Arm Neoverse SoC Dawn Ares Platform. Much of what Arm discussed was done in the form of RTL implementations and models, but there are Arm Neoverse N1 cores in the wild.

Arm Neoverse N1 SoC Dawn Ares Platform 7nm

We are going to discuss the platform, then talk about some of the performance figures we got for the Arm Neoverse N1.

Arm Neoverse N1 Dawn Ares Test Platform

The Arm Neoverse N1 test platform is a relatively compact kit. Its purpose is not to let sites like STH run benchmarks (although we want to). Instead, it is a platform for those looking to build CPUs atop Neoverse N1 or do other ecosystem enablement. For example, if Microsoft wanted to build a 64-core Neoverse N1 chip, this is the platform it might use for testing before getting its silicon back.

Arm Neoverse N1 System Development Platform

The actual Arm Neoverse N1 SoC has two MP2 N1 CPUs with 1MB L2 cache per core. There is also an 8MB system level cache and two DDR4-3200 memory controllers. Looking at this SDP, one can see that Arm is trying to give users all of the parts necessary to simulate the functions of a larger system.
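To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It reads "MP2" as two cores per cluster, and it assumes each memory controller drives one standard 64-bit DDR4 channel, which the slides do not state. The result is a theoretical peak, not a measured number.

```python
# Rough totals for the N1 SDP SoC, using the figures quoted in the article.
CLUSTERS = 2                      # "two MP2 N1 CPUs"
CORES_PER_CLUSTER = 2             # reading "MP2" as a two-core cluster
L2_MB_PER_CORE = 1                # 1MB L2 per core
SLC_MB = 8                        # 8MB system level cache
MEMORY_CONTROLLERS = 2            # two DDR4-3200 controllers
DDR4_MT_PER_SEC = 3200            # DDR4-3200 = 3200 mega-transfers per second
CHANNEL_WIDTH_BYTES = 8           # ASSUMPTION: one 64-bit channel per controller


def total_cores():
    return CLUSTERS * CORES_PER_CLUSTER


def total_l2_mb():
    return total_cores() * L2_MB_PER_CORE


def peak_memory_bandwidth_gb_per_s():
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes)."""
    mb_per_s = MEMORY_CONTROLLERS * DDR4_MT_PER_SEC * CHANNEL_WIDTH_BYTES
    return mb_per_s / 1000


if __name__ == "__main__":
    print(f"Cores:               {total_cores()}")
    print(f"Total L2:            {total_l2_mb()} MB")
    print(f"System level cache:  {SLC_MB} MB")
    print(f"Peak DRAM bandwidth: {peak_memory_bandwidth_gb_per_s():.1f} GB/s")
```

Under those assumptions, this is a four-core part with about 51.2 GB/s of theoretical memory bandwidth, which fits its role as a development vehicle rather than a deployment target.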

Arm Neoverse N1 System Development Platform 2

Here is a shot from above. One can see the main Arm Neoverse N1 SoC. One can also see a Xilinx FPGA. On the x86 side, we are accustomed to having I/O directly on the SoC or northbridge. On Arm development platforms and even RISC-V platforms, we are seeing that offloaded to FPGAs.

Arm Neoverse N1 Dawn Ares Platform Development

Here is the official labeled overview. One will notice that the mATX form factor motherboard is relatively standard for an SDP. We have certainly seen some exotic SDPs in the past. One great example was when STH got an exclusive during the Intel Xeon D launch via a Beverly Cove SDP.

Arm Neoverse N1 System Development Platform Overview

There are three PCIe 3.0 slots: a x16, a x8, and a x1 electrical. Beyond that, there is a fourth slot, labeled on both the diagram and the slot itself as PCIe Gen 4.0 CCIX. We thought this was very interesting because putting the CCIX/PCIe Gen 4.0 slot on the far edge, furthest from the SoC, means that the fastest link on the board also has the longest traces.
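For a sense of scale, the sketch below estimates one-direction bandwidth for each slot from the standard PCIe per-lane rates (8 GT/s for Gen3 and 16 GT/s for Gen4, both with 128b/130b encoding). The width of the CCIX slot is not given in the text, so the x16 used here is purely an assumption, and the figures ignore protocol overhead.

```python
# Approximate PCIe slot bandwidth from per-lane signaling rates.
GT_PER_SEC = {3: 8.0, 4: 16.0}    # giga-transfers per second per lane
ENCODING_EFFICIENCY = 128 / 130   # 128b/130b line encoding (Gen3 and Gen4)


def slot_bandwidth_gb_per_s(gen, lanes):
    """One-direction bandwidth in GB/s, before packet and protocol overhead."""
    usable_gbits = GT_PER_SEC[gen] * ENCODING_EFFICIENCY * lanes
    return usable_gbits / 8       # bits to bytes


# (label, PCIe generation, electrical lanes)
SLOTS = [
    ("PCIe 3.0 x16", 3, 16),
    ("PCIe 3.0 x8", 3, 8),
    ("PCIe 3.0 x1", 3, 1),
    ("PCIe 4.0 CCIX (x16 is an ASSUMPTION)", 4, 16),
]

if __name__ == "__main__":
    for label, gen, lanes in SLOTS:
        print(f"{label:40s} ~{slot_bandwidth_gb_per_s(gen, lanes):.2f} GB/s")
```

Lane for lane, Gen4 doubles the Gen3 signaling rate, which is why trace length to that far-edge slot stood out to us.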

Arm Neoverse N1 Dawn Ares Platform PCIe

We covered that Xilinx has CCIX-enabled FPGAs, and we have seen the Huawei Kunpeng 920 64-Core Arm Server CPU launch with CCIX and PCIe Gen4. Now there is an Arm Neoverse development platform with it enabled. That is clearly a big point Arm is pushing with this SDP, as CCIX opens a world of opportunity for Arm Neoverse.

Arm Neoverse N1 System Development Platform Use Cases

Since this is STH, even the rear I/O panel gets a glory shot for the SDP.

Arm Neoverse N1 Dawn Ares Platform Rear IO

This is not meant to be a production board. Instead, it is designed for Arm’s primary customers and some ecosystem enablement. When we were at Tech Day 2019, we were told that low single-digit dozens of these boards had been produced. Frankly, if you are an end-user, a low core count solution with two DDR4 DIMMs is not what you want to deploy. If you work at Ampere, your engineers are likely to fight over these platforms.

Arm Neoverse N1 System Development Platform Features

Along with the hardware platform, Arm is delivering a software stack to get users up and running.

Arm Neoverse N1 System Development Platform Software Stack

Again, Arm has a series of software development tools to help vendors and those in the ecosystem use the SDP.

Arm Neoverse N1 System Development Platform Software Tools

Now that we have shown architectural details and physical silicon, we wanted to discuss performance.

Arm Neoverse N1 Performance

Personally, I think that giving the Arm Neoverse N1 performance talk in Q1 2019 is perhaps one of the hardest jobs. For some perspective here, Arm makes the cores and some surrounding IP. Arm does not license an entire chip. To make a complete chip, one needs IP blocks from other vendors. Once the IP from Arm, third-party vendors, and the chip’s designer is integrated, one gets a solution ready to go to the foundry. Once chips are produced, one can benchmark actual chips. We are early in the Neoverse N1 lifecycle, so having performance numbers to stand by is quite an accomplishment.

Arm Neoverse Tech Day 2019 Software Driven HW Design

Arm says that it is doing a lot of hardware and software development. If you remember the STH Software Maturity Model, you know that one can see significant gains just through optimizing code.

Arm believes that it is going to see a massive integer and floating point performance uplift with the Neoverse N1 over the Cortex-A72, based on performance estimates from RTL and results from silicon.

Arm Neoverse Tech Day 2019 SPEC CPU Gains N1 Over Cortex A72

One of the key themes is that Arm gets benefits from both hardware and software. We have consistently seen new toolchains and kernels help Arm server performance, so this makes sense. Also, NVIDIA has been heavily promoting GPUs as seeing massive gains by counting both software and hardware gains over time. Arm is using that same methodology.
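As a small illustration of why counting both kinds of gains makes headline numbers grow quickly, here is a minimal Python sketch. It assumes hardware and software speedups are independent and compose multiplicatively, which is our simplification and not something Arm stated. The input values are hypothetical placeholders, not figures from Arm's slides.

```python
# Illustration only: how separate hardware and software speedups can combine.
# ASSUMPTION: the two gains are independent and multiply together.

def combined_speedup(hardware_gain, software_gain):
    """Overall speedup when a hardware gain and a software gain stack."""
    return hardware_gain * software_gain


# Hypothetical placeholder values, NOT taken from Arm's presentation.
HYPOTHETICAL_HW_GAIN = 1.5   # e.g. a new core versus the previous core
HYPOTHETICAL_SW_GAIN = 2.0   # e.g. a newer compiler, runtime, and kernel

if __name__ == "__main__":
    total = combined_speedup(HYPOTHETICAL_HW_GAIN, HYPOTHETICAL_SW_GAIN)
    print(f"Combined speedup: {total:.1f}x")
```

The practical takeaway is that when reading a combined figure, it is worth asking how much of it would also apply to the older core running the newer software.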

Arm Neoverse Tech Day 2019 N1 HW And SW Gains

Other examples the company showed are with virtualization, showing faster KVM restore times. Here, the time scale is notably absent.

Arm Neoverse Tech Day 2019 VM Example

Nginx is perhaps the world’s most prominent web server these days, and STH has used it for years. Here, Arm is showing a 2x-2.5x performance gain for Neoverse N1.

Arm Neoverse Tech Day 2019 Nginx Performance

Likewise, Arm showed some example speedups on Java operations for Neoverse N1 versus Arm Cortex-A72.

Arm Neoverse Tech Day 2019 Java Performance

Arm did a solid job showing anticipated speedups. We think some of these figures will shift once chip designers add their own IP to the mix, but it seems like the changes made in Arm Neoverse N1 should offer significant performance improvements. For those who want to see Arm vs. Intel Xeon and AMD EPYC, check out our Cavium ThunderX2 Review and Benchmarks a Real Arm Server Option. While ThunderX2 is not using Neoverse N1, that is all of the publicly available silicon at the time of this writing. Early Cascade Lake shipments have been underway for some time, but the formal launch is still forthcoming.

The Arm Neoverse N1 was perhaps the star of the show, but during the Arm Neoverse Tech Day 2019, the company also showed off its Neoverse E1 architecture for lower power edge applications.

7 COMMENTS

  1. This was a great long read.

    STH is now like a mix of the technical side of Anandtech, the business side of TNP, and adding in its own mix of hands-on experience working with this hw. I can’t wait for your N1 review

  2. Amazing article!
    Arm is set to dominate the EDGE, I don’t really see how Intel hopes to gain any market share with the power draw of the x86 ecosystem. Given how much money they can put into R&D, we should expect to see something from them … and the Big.Little using Atom little cores doesn’t sound like the right approach

  3. Risky89 – Arm Neoverse N1 CPUs will be coming out in a few quarters. The development board with the Neoverse N1 CPU is a low production unit that is primarily going to companies that are building chips.
