This week Amazon announced a new AWS EC2 c4 instance type. These instances are backed by new Intel Xeon E5-2666 V3 processors, a custom part Intel is spinning for Amazon. We do know it has a 2.9GHz base clock and a maximum turbo frequency of 3.5GHz. Linux-Bench detected this processor in every one of our test instances. Let’s take a look at the performance.
Last year we used Linux-Bench to show the burstable performance limits of the AWS EC2 t2 instance types. Since then the suite has been adopted by Tom’s Hardware and Anandtech. Luckily, because the benchmark is scripted, results from different runs can be reasonably compared.
For this test we used the Ubuntu 14.04 LTS HVM image, which is a fairly standard AWS image. Ubuntu 14.04 LTS is the current Linux-Bench standard OS. The AWS image also works out of the box with Linux-Bench, which is a plus. We set up all of our instances in the US East (Northern Virginia) location. We set up instances of the following types:
Each instance ran the test one time. We used a different instance for each test to mimic what we do between STH benchmark sessions, where we restart the configuration from scratch for each run. We also had configurations of dual Xeon E5-2690 V1, dual Xeon E5-2690 V2, dual Xeon E5-2690 V3 and dual Xeon E5-2699 V3 systems for comparison.
The AWS EC2 c4 Instances
If one wants to learn more about the AWS EC2 instances, this is the go-to resource. Just as a quick comparison of the compute side scaling, here is how the instances reported their compute resources:
As one can see, the c4.8xlarge reports as crossing two sockets and 36 cores. The other instances are all single socket.
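For readers who want to check what their own system reports, here is a minimal sketch of counting sockets and logical CPUs from `/proc/cpuinfo`-style text. It is not how Linux-Bench gathers this information; the function name `summarize_cpuinfo` is hypothetical, and it assumes the usual Linux `processor`, `physical id` and `model name` fields are present. Some virtualized guests omit `physical id`, in which case the socket count comes back as 0.

```python
def summarize_cpuinfo(text):
    """Summarize /proc/cpuinfo-style text.

    Returns the number of distinct sockets ("physical id" values),
    the number of logical CPUs ("processor" entries) and the sorted
    list of distinct model names. Hypothetical helper for illustration.
    """
    sockets = set()
    models = set()
    logical_cpus = 0
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between CPU entries
        key = key.strip()
        value = value.strip()
        if key == "processor":
            logical_cpus += 1
        elif key == "physical id":
            sockets.add(value)
        elif key == "model name":
            models.add(value)
    return {
        "sockets": len(sockets),
        "logical_cpus": logical_cpus,
        "models": sorted(models),
    }
```

On a Linux system, one could call it as `summarize_cpuinfo(open("/proc/cpuinfo").read())` and compare the result against what the instances above reported.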
AWS EC2 c4 Benchmark Results
For these tests we are using the standard Linux-Bench test suite. If one has an existing server, one can run the suite using an Ubuntu 14.04 LTS LiveCD and issuing three commands by following the simple Linux-Bench how-to. That means if you want to compare these results to what you already have deployed, the path is about as simple as it gets. So as not to re-create massive amounts of text, one can read about the benchmarks here.
c-ray 1.1 is a ray tracing benchmark. We are only presenting the medium and hard results as our easy test is completed in <1 second for several of the bare metal and c4.8xlarge configurations. First the medium test:
c-ray scales well with threads, so we see the dual Intel Xeon E5-2690 V1 configuration beat every c4 instance except the c4.8xlarge. This is more or less to be expected, but it does provide a good comparison point versus a typical legacy bare metal server.
On the harder test we see a similar pattern. Just as a point of comparison, a low power Intel Atom C2750 will score around 308 seconds on the hard test putting it a bit slower than the c4.8xlarge but considerably faster than the c4.large.
HardInfo is the “default” Ubuntu benchmark and its cryptohash cryptography test is a mainstay of the benchmark.
Here we can see the advantage Haswell-EP cores have over previous generation Sandy Bridge-EP cores. We would normally expect the c4.8xlarge to be ahead of the c4.4xlarge by a solid margin but for some reason the results for both instances diverged. We will likely investigate this one further as we are able to collect more data.
OpenSSL is virtually everywhere these days, making it a highly useful benchmark.
On the sign side, we can see the c4.xlarge and c4.large in their own performance tier. The c4.8xlarge is roughly equivalent to a high-end Ivy Bridge-EP system.
On the Verify side we see a very similar picture.
NAMD is one of the script’s highly parallel benchmarks.
One can certainly see the pattern here. The AWS EC2 c4 instances do cover a wide range of performance capabilities.
Compression with 7-zip
Compression is a key task of server CPUs to minimize storage space and transmission sizes.
On both the compression and decompression sides we can see similar patterns emerge. The E5-2699 V3 is an expensive high-end processor in dual socket mode and is able to show its brawn here. The scaling among c4 instances is relatively linear.
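One way to put a number on "relatively linear" is to divide the measured speedup by the ideal speedup implied by the vCPU count. The sketch below assumes a higher-is-better score such as 7-zip MIPS; the function name `scaling_efficiency` and the figures in the usage comment are hypothetical and are not results from this article.

```python
def scaling_efficiency(base_vcpus, base_score, vcpus, score):
    """Return measured speedup divided by ideal speedup.

    Assumes higher scores are better. A value near 1.0 means the
    larger instance scales almost linearly with its vCPU count
    relative to the baseline instance.
    """
    if base_vcpus <= 0 or vcpus <= 0 or base_score <= 0:
        raise ValueError("vCPU counts and baseline score must be positive")
    ideal_speedup = vcpus / base_vcpus
    measured_speedup = score / base_score
    return measured_speedup / ideal_speedup


# Hypothetical example: a 2 vCPU baseline at 5000 MIPS versus an
# 8 vCPU instance at 19000 MIPS.
example = scaling_efficiency(2, 5000, 8, 19000)
```

With these made-up inputs the larger instance delivers 3.8x the throughput on 4x the vCPUs, so the efficiency works out to roughly 0.95.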
Sysbench CPU Benchmark
Sysbench is another favorite Linux benchmark. We use the CPU test to look at both single and multi-threaded performance.
For single threaded applications, this chart shows relatively consistent performance across instances and with our bare metal configurations. That is more of a function of using single Xeon E5 cores in each case. On the multi-threaded side, we can start to see the same trends we saw in other benchmarks.
Overall, the fact that Amazon is using a new Haswell-EP part is great. That is evidenced by the single-threaded consistency we saw. The c4.8xlarge is not inexpensive though. For the instance with 60GB of RAM one will pay around $1336/month. On the other end of the spectrum, the c4.large, which is significantly slower than a September 2013 Intel Atom C2750, is only $83/month. Amazon does have discounts for reserved instances and the like, which can help from a price perspective. It will come down to the individual application to see if the price/performance ratio is right. One can simply fire up Linux-Bench to see how current systems compare to these figures.
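As a rough illustration of the price/performance question, the sketch below uses only the two monthly prices quoted above. The helper names `price_ratio` and `dollars_per_unit` are hypothetical, and any benchmark score passed in should come from one's own Linux-Bench run; none is supplied here.

```python
# Approximate on-demand monthly prices quoted in this article (USD).
MONTHLY_PRICE_USD = {
    "c4.8xlarge": 1336,
    "c4.large": 83,
}


def price_ratio(instance_a, instance_b):
    """How many times more instance_a costs per month than instance_b."""
    return MONTHLY_PRICE_USD[instance_a] / MONTHLY_PRICE_USD[instance_b]


def dollars_per_unit(monthly_price, score):
    """Monthly cost per unit of a higher-is-better benchmark score."""
    if score <= 0:
        raise ValueError("score must be positive")
    return monthly_price / score


# The c4.8xlarge costs about 16.1 times as much as the c4.large.
ratio = round(price_ratio("c4.8xlarge", "c4.large"), 1)
```

If a workload runs more than about 16 times faster on the c4.8xlarge than on the c4.large, the larger instance is the cheaper one per unit of work at these prices; otherwise the smaller instance is.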