Perhaps one of the more interesting CPUs in the embedded server space today is the Intel Xeon E3-1585 V5. A few months ago we benchmarked the Intel Xeon E3-1515M V5. That chip was a low power version of the Intel Xeon E3-1500 V5 platform. The Intel Xeon E3-1585 V5 is a result of a modest power bump, to a 65W TDP. That also means that base clocks are greatly increased. Although it is a 4 core / 8 thread CPU, the base clock is 3.5GHz. That clock speed means that it is able to punch above its weight when it comes to performance. This chip was a pleasant surprise in terms of CPU performance. What separates the E3-1500 from the E3-1200 series is the GPU. The E3-1500 series uses Iris Pro graphics. When combined with technologies like GVT-g, this provides a powerful virtual desktop solution.
We are going to be focusing today’s review on the CPU performance, but this is certainly a chip where the key feature is the GPU side and the (extremely) attractive licensing for using Intel as a VDI GPU provider. Unlike NVIDIA GRID, Intel does not have an additional software license for virtual desktops.
Key stats for the Intel Xeon E3-1585 V5: 4 cores / 8 threads, 3.5GHz base and 3.9GHz turbo with 8MB L3 cache. The CPU features a 65W TDP. The major feature is the Intel Iris Pro Graphics P580 with 128MB eDRAM onboard. Here is the ARK page with the feature set.
Here is our basic test configuration for this single socket Xeon E3-1585 V5 system:
- Barebones System: Supermicro SuperServer 5019S-TN4
- Motherboard: Supermicro MBD-X11SSV-M4F
- CPU: Intel Xeon E3-1585 V5
- RAM: 16GB (2x 8GB) DDR4 ECC unbuffered SODIMMs
- SSD: Intel DC S3710 400GB
- SATADOM: Supermicro 32GB SATADOM
We are testing this CPU as part of a barebones system. The Intel Xeon E3-1500 V5 series is a BGA platform so it is not socketed and therefore will come affixed to a motherboard.
Here is the lscpu output of the chip:
Single Intel Xeon E3-1585 V5 Benchmarks
For this exercise, we are using our legacy Linux-Bench scripts which help us see cross platform “least common denominator” results. We do have a full set of expanded benchmarks from our next-gen test suite (Linux-Bench2) so expect to see those results sprinkled in as we get a larger comparison data set built. These results have also been previewed elsewhere, such as the Intel Atom C3955 benchmarks we did so you can see larger data sets in our other reviews. Here we are going to focus on generational performance.
Python Linux 4.4.2 Kernel Compile Benchmark
This is one of the most requested benchmarks for STH over the past few years. The task is simple: we have a standard configuration file and the Linux 4.4.2 kernel from kernel.org, and we make the standard auto-generated configuration utilizing every thread in the system. We express results in compiles per hour to make them easier to read.
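To make the unit concrete, here is a minimal sketch of how one timed build converts to compiles per hour. The function name and the example timing are illustrative assumptions, not part of the Linux-Bench scripts and not a measured result.

```python
SECONDS_PER_HOUR = 3600.0


def compiles_per_hour(elapsed_seconds: float) -> float:
    """Convert the wall-clock time of one kernel build into compiles per hour."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return SECONDS_PER_HOUR / elapsed_seconds


# Hypothetical timing, for illustration only: a build that takes 120 seconds.
print(compiles_per_hour(120.0))
```

A higher figure is better here, which is why the unit tends to read more naturally in a chart than raw build times.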
Here we can see the high clock speed rocketing the Xeon E3-1585 V5 past the new Intel Xeon Silver 4112 which is the Xeon Silver 4 core / 8 thread part.
c-ray 1.1 Performance
We have been using c-ray for our performance testing for years now. It is a ray tracing benchmark that is extremely popular to show differences in processors under multi-threaded workloads.
Again, we are seeing excellent performance due to the clock speeds. In the low power single socket segment, there are a number of chips that achieve performance through many slow cores or fewer fast cores. The Intel Xeon E3-1585 V5 is in the second camp.
7-zip Compression Performance
7-zip is a widely used compression/decompression program that works cross platform. We started using the program during our early days with Windows testing. It is now part of Linux-Bench.
Here we see solid performance. We sorted based on decompression speed. Had we sorted on compression speed, the Xeon E3-1585 V5 would have moved up a spot.
NAMD Performance
NAMD is a molecular modeling benchmark developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. More information on the benchmark can be found here. We are going to augment this with GROMACS in the next-generation Linux-Bench in the near future. With GROMACS we have been working hard to support both Intel’s Skylake AVX-512 and the AVX2-capable AMD Zen architecture. Here are the comparison results for the legacy data set:
Again we see the raw performance of faster cores help the Skylake-H part achieve stellar results. If you are wondering how this chip is so close to the 8 core Intel Xeon Silver 4108, recall that the Silver 4108 has a base all core clock of only 1.8GHz.
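A quick back-of-the-envelope calculation shows why the two chips land close together. The sketch below multiplies core count by base clock, using only the figures quoted in this review. It is a crude proxy that assumes equal work per clock and ignores turbo behavior, cache, and memory differences, so treat it as an illustration and not a performance model.

```python
def aggregate_base_ghz(cores: int, base_ghz: float) -> float:
    """Cores multiplied by base clock: a crude proxy for all-core throughput."""
    return cores * base_ghz


# Core counts and base clocks as quoted in this review.
e3_1585_v5 = aggregate_base_ghz(cores=4, base_ghz=3.5)
silver_4108 = aggregate_base_ghz(cores=8, base_ghz=1.8)

print(f"Xeon E3-1585 V5:  {e3_1585_v5:.1f} core-GHz")
print(f"Xeon Silver 4108: {silver_4108:.1f} core-GHz")
print(f"Ratio: {e3_1585_v5 / silver_4108:.2f}")
```

On this rough measure, four cores at 3.5GHz come out within a few percent of eight cores at 1.8GHz.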
Sysbench CPU test
Sysbench is another one of those widely used Linux benchmarks. We specifically are using the CPU test, not the OLTP test that we use for some storage testing.
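For readers who want to try something similar, here is a hedged sketch of how a CPU-only sysbench run could be assembled. It assumes the sysbench 1.0 command-line syntax, and the prime limit is an illustrative value. The exact parameters used by Linux-Bench are not stated here, so none of this should be read as the script's actual configuration.

```python
import os
import shlex


def sysbench_cpu_command(threads: int, max_prime: int = 20000) -> list[str]:
    """Build the argv for a sysbench 1.0 style CPU test (assumed syntax)."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return [
        "sysbench",
        "cpu",
        f"--threads={threads}",
        f"--cpu-max-prime={max_prime}",  # illustrative value, not from Linux-Bench
        "run",
    ]


# Print a single threaded and an all-thread command for this machine.
all_threads = os.cpu_count() or 1
print(shlex.join(sysbench_cpu_command(threads=1)))
print(shlex.join(sysbench_cpu_command(threads=all_threads)))
```

The sketch only prints the commands; running them requires sysbench to be installed.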
When working in virtual desktops, single threaded performance is often of supreme importance. Here you can see the increased clock speeds help the Skylake-H offerings.
OpenSSL Performance
OpenSSL is widely used to secure communications between servers. It is an important component in many server stacks. We first look at our sign tests:
Here are the verify results:
Again we see solid performance from this 65W TDP part.
UnixBench Dhrystone 2 and Whetstone Benchmarks
One of our longest-running tests is the venerable UnixBench 5.1.3 Dhrystone 2 and Whetstone results. They are certainly aging, but we constantly get requests for them, and many angry notes when we leave them out. UnixBench is widely used so we are including it in this data set. Here are the Dhrystone 2 results:
Here are the Whetstone results:
Single threaded results remain solid with a 3.9GHz boost clock that many other offerings simply cannot match.
GROMACS AVX2/AVX-512 Enabled
We have a small GROMACS 57K atom simulation we previewed in the first AMD EPYC 7601 Linux benchmarks piece. In Linux-Bench2 we are using an updated set that scales to 4-socket configurations with enough atoms to fill all cores. For now, we have a small dataset using the “57K Atom” model from the original work, which remains valid, although we are going to transition to a larger set soon. This test will use the AVX-512 and AVX2 extensions if available.
Here we see excellent performance as the Intel Xeon E3-1585 V5 is making up for its lack of raw core count.
Moving forward, we have tweaked our GROMACS “small” test case to something that can better take advantage of more cores. The original “medium” test case we were profiling was based on PRACE Test A but we are going to be deviating from that due to run lengths we want to achieve to get steady-state performance. Here is an example of how the updated benchmark will look:
We are still working on gathering more comparison data. However, this should give an idea of how a problem that can take advantage of AVX-512 and multiple cores scales. We have been working on these test cases and testing various architectures for months now, so expect to see the small case in more CPU reviews soon.
The Intel Xeon E3-1585 V5 CPU performance is great. We are going to have more on the platform soon, but we wanted to start with sharing some of the data we have been seeing from this machine.
In terms of raw CPU power, the high base clocks and Turbo Boost mean that we are getting great performance, beyond the Intel Xeon Silver 4112 and often the Intel Xeon Silver 4108. Where this chip excels is single threaded performance, which is meant to provide one or many virtual desktop users with an excellent experience.
We also want to briefly mention that the GPU has other benefits. One of them is that if your application can utilize the Quick Sync video transcoding engine, there is additional hardware acceleration present. Stay tuned for more on the platform coming soon on STH.