Dual Intel Xeon E5-2697 V4 Benchmarks – #3 in the lineup

Intel A2U44X25NVMEDK installation - internal

Today we have benchmarks of dual Intel Xeon E5-2697 V4 processors. Each Intel Xeon E5-2697 V4 has 18 cores, 36 threads and 45MB of L3 cache. With a 2.3GHz base clock, these chips are very similar to the E5-2699 V3 processors, which were top of the line in the last generation. We have already published Intel Xeon E5-2699 V4 Benchmarks and Intel Xeon E5-2698 V4 Benchmarks, so now it is time to take a look at the third highest model. Of note, Intel also has an E5-2697A V4, which is a 16 core / 32 thread part. Despite the similar naming, we are benchmarking the 18 core parts in a dual socket configuration.

Test Configuration

Our test platform was a standard EATX motherboard upgraded for Xeon E5 V4 support via a simple BIOS upgrade. This is one of the NVMe servers we use in the Fremont colocation, which we took offline and upgraded to the V4 parts.

As another note, we tried picking some interesting comparisons out of our data set and did have legacy E5-2697 V2 and V3 information for most of the benchmarks we run.

Intel Xeon E5-2697 V4 Benchmarks

For our testing we are using Linux-Bench scripts, which help us see cross platform “least common denominator” results. We are using gcc due to its ubiquity as a default compiler. One can see details of each benchmark here. We are likely going to update Linux-Bench in the near future with a few new tests as well as a simpler to use, faster revision, but for now we are using our old Ubuntu 14.04.3 LTS version.

Python Linux 4.4.2 Kernel Compile Benchmark

This is one of the most requested benchmarks for STH over the past few years. We (finally) have a Linux kernel compile benchmark script that is consistent. Expect to see this functionality migrate into Linux-Bench soon (we are just awaiting the parser work on it). The task is simple: we take a standard configuration file and the Linux 4.4.2 kernel from kernel.org, then run make with every thread in the system. We express results in compiles per hour to make them easier to read.
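As a rough illustration of how a compiles per hour figure can be derived, here is a minimal Python sketch. It is not the STH benchmark script: the function names are hypothetical, and the use of `make -jN` with N equal to the logical CPU count is an assumption about what "every thread in the system" means.

```python
import os
import subprocess
import time


def compiles_per_hour(elapsed_seconds):
    """Convert the wall-clock time of one compile into compiles per hour."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return 3600.0 / elapsed_seconds


def make_command(threads=None):
    """Build a `make -jN` command; N defaults to every logical CPU (assumption)."""
    if threads is None:
        threads = os.cpu_count() or 1
    return ["make", f"-j{threads}"]


def time_kernel_compile(source_dir, threads=None):
    """Run one build in an already-configured kernel tree and return seconds."""
    start = time.perf_counter()
    subprocess.run(make_command(threads), cwd=source_dir, check=True)
    return time.perf_counter() - start
```

Under this conversion, a compile that finishes in 90 seconds works out to 40 compiles per hour. Reporting the rate rather than the time means higher bars are better, which matches the other charts.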

Intel Xeon E5-2600 V4 Initial Linux Kernel Compile Benchmarks

As you can see, the dual Xeon E5-2697 V4 system is the fourth fastest on our charts and is likely similar to what we would see if we were able to re-test the E5-2699 V3 on our new benchmark. Unfortunately, we did not have a set of the older CPUs available for re-testing the Linux kernel compile benchmark.

c-ray 1.1 Performance

We have been using c-ray for our performance testing for years now. It is a ray tracing benchmark that is extremely popular to show differences in processors under multi-threaded workloads.

Intel Xeon E5-2697 V4 c-ray Benchmark

The Intel Xeon E5-2697 V4 parts are fast enough that we see similar performance to the E5-2699 V3 parts, which were also 18 core / 36 thread parts.

7-zip Compression Performance

7-zip is a widely used compression/decompression program that works cross platform. We started using the program during our early days with Windows testing. It is now part of Linux-Bench.

Intel Xeon E5-2697 V4 7-zip Benchmark

Compression is a major operation we see in today’s workloads and is also highly threaded. We omitted the Cavium ThunderX 48 core result, as we explained in our 96 core Cavium ThunderX benchmark piece. The Intel Xeon E5-2697 V4 does show a continued trend towards IPC improvements in these benchmarks.

NAMD Performance

NAMD is a molecular modeling benchmark developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. More information on the benchmark can be found here.

Intel Xeon E5-2697 V4 NAMD Benchmark

Somewhat surprisingly, we find the Intel Xeon E5-2697 V4 performs better than several of the older configurations we have tested. Performance was very good on this complex benchmark.

Sysbench CPU test

Sysbench is another one of those widely used Linux benchmarks. We specifically are using the CPU test, not the OLTP test that we use for some storage testing.

Intel Xeon E5-2697 V4 Sysbench CPU Benchmark

We sorted this chart on the multi-threaded results. In practice, that means sorting by the blue bars, which represent single threaded performance, would change the ranking. The single threaded results fall in a fairly tight range because these are all Intel CPUs, mostly ranging between the Sandy Bridge and Broadwell architectures.

OpenSSL Performance

OpenSSL is widely used to secure communications between servers. It is an important library in many server stacks. We first look at our sign tests:

Intel Xeon E5-2697 V4 OpenSSL Sign Benchmark

Moving to the verify results:

Intel Xeon E5-2697 V4 OpenSSL Verify Benchmark

At this point we see a fourth place finish for the E5-2697 V4. However, we do notice this is an area where the V4 architecture is strong, and there is a relatively measured gap between the E5-2697 V4 and the higher dollar E5-2699 V4 parts.

UnixBench Dhrystone 2 and Whetstone Benchmarks

Of course, these chips are not meant for heavy compute but we pick out the UnixBench 5.1.3 Dhrystone 2 and Whetstone results to show some of the raw performance they are capable of. UnixBench is widely used so it is a good comparison point.

Here are the single threaded workloads:

Intel Xeon E5-2697 V4 UnixBench dhrystone 2 single thread
Intel Xeon E5-2697 V4 UnixBench whetstone single thread

As you can see, the results are relatively tight on our single threaded benchmarks likely due to the

Now the E5 V4’s sweet spot, the multi-threaded workloads:

Intel Xeon E5-2697 V4 UnixBench dhrystone 2 multi thread
Intel Xeon E5-2697 V4 UnixBench whetstone multi thread

In multi-threaded tasks, the Intel Xeon E5-2697 V4 performs near the top of the pack. There is very little reason to upgrade from Intel Xeon E5-2600 V3 to V4 parts. However, one can see massive gains over the E5-2670 V1 parts, which were mainstream at the time, to the point that the E5-2697 V4 provides close to 3x the floating point performance. A 3x performance gain starts to make for a compelling upgrade.

Conclusion

With the new E5 V4 CPUs, we can see substantial performance gains with the higher end parts. Starting with the E5-2697 V4, we can see modest performance gains over the similar core count V3 parts (e.g. the E5-2699 V3); however, the E5-2697 V4 parts are going to be much more accessible in terms of price and availability. Intel has the luxury of being able to turn off 6 of the 24 cores on the HCC (high core count) die, or 25% of what the silicon has. This means the E5-2697 V4 is going to be a high yield part for Intel. With the E5-2699 V3, we saw fewer chips on the market due to yield constraints. We expect the E5-2697 V4 to become widely adopted this generation.

You can find more STH Xeon E5 V4 coverage here:

Subscribe to STH to get the latest benchmarks and platform reviews as they are published. We have a huge backlog of content coming.

1 COMMENT

  1. The E5-2697 v4 per core performance improvement over a v1 is not breathtaking. It is sometimes faster per core and is sometimes slower per core.

    2697v4 /2670v1
    36cores/16cores=2.25

    The best comparison is OpenSSL
    2697v4 / 2670v1
    4500 / 1600 = 2.81

    A 25% improvement per core (over the 2.25 ratio)

    7-zip is about 4% slower
    142000 / 66000 = 2.15

    Linux compile scaled with the cores, 2.25
    2697v4/2670v1
    18/8= 2.25

    Above benchmark numbers may be off a bit, they were eyeballed from the article charts.

    It would be good to see operations / watt.
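The per-core arithmetic in the comment above can be reproduced with a short Python sketch. The scores are the commenter's eyeballed chart readings, not measured values, and the function name `per_core_gain` is hypothetical.

```python
def per_core_gain(new_score, old_score, new_cores, old_cores):
    """Return the fractional per-core change after normalizing by core count.

    A result of 0.25 means 25% more work per core; a negative result
    means less work per core.
    """
    throughput_ratio = new_score / old_score
    core_ratio = new_cores / old_cores
    return throughput_ratio / core_ratio - 1.0


# Dual E5-2697 V4 (2 x 18 cores) versus dual E5-2670 V1 (2 x 8 cores).
NEW_CORES, OLD_CORES = 36, 16

# Scores below are the eyeballed values quoted in the comment.
openssl_gain = per_core_gain(4500, 1600, NEW_CORES, OLD_CORES)
sevenzip_gain = per_core_gain(142000, 66000, NEW_CORES, OLD_CORES)

print(f"core ratio: {NEW_CORES / OLD_CORES:.2f}")
print(f"OpenSSL per-core change: {openssl_gain:+.1%}")
print(f"7-zip per-core change: {sevenzip_gain:+.1%}")
```

Dividing the throughput ratio by the core ratio separates gains that come from more cores from gains that come from each core doing more work, which is the distinction the comment is drawing.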
