Recently we have been testing the new Intel Atom C2750 platform under a number of scenarios. We posted some initial figures here on the main site and have had a small Avoton benchmarking thread running in the forums for a few weeks. One thing is certain: Intel took a revolutionary step in low-power performance here. Today we have results using our full Linux test suite.
As a quick overview, the Intel Atom C2750 is an 8-core chip with 4MB of L2 cache. The Avoton-based SoC has a TDP of 20W, but that figure does not include such things as memory or a BMC, which are required for remote management and video output. Unlike previous generation Atom chips, there is no Hyper-Threading on the Avoton/ Rangeley platform. Furthermore, we now get an out-of-order execution pipeline. Another major development is that these chips support 32GB of 1600MHz DDR3 through a dual channel controller. This means that the chips get the same 25.6GB/s theoretical memory bandwidth as the Intel Xeon E3-1200 V2 and V3 series processors. Between the Atom S1260 generation and the C2750, Intel transitioned to the modern 22nm tri-gate manufacturing process. Clock speeds were bumped to 2.4GHz with some chips having “Turbo” boost up to 2.6GHz.
The net of this is that we are going to see VERY large performance gains, certainly not the evolutionary 5-15% gains we see in the Xeon E3 space.
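For readers who want to check the 25.6GB/s figure quoted above, here is a minimal sketch of the arithmetic in Python. It assumes the standard 64-bit data bus per DDR3 channel and treats "1600MHz" as 1600 million transfers per second (DDR3-1600). The function name and parameters are our own for illustration and do not come from any Intel tool.

```python
def peak_bandwidth_gb_s(mega_transfers_per_sec, channels, bus_width_bits=64):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10^9 bytes).

    Assumption: each DDR3 channel has a 64-bit (8-byte) data bus.
    """
    bytes_per_transfer = bus_width_bits // 8
    bytes_per_sec = mega_transfers_per_sec * 1e6 * bytes_per_transfer * channels
    return bytes_per_sec / 1e9


# Dual channel DDR3-1600, as on the Atom C2750 and the Xeon E3-1200 V2/V3
print(f"{peak_bandwidth_gb_s(1600, channels=2):.1f} GB/s")
```

This is a theoretical ceiling only. Measured results, such as the STREAM numbers later in this piece, will come in lower.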
Intel Atom C2750 Test Configuration
Due to the SoC design, the Intel Atom C2750 platform is very compact and requires very little to get up and running. We have three Avoton test platforms that we have been hammering for the past few weeks in different configurations. For the benchmarks today we are using one of our Supermicro A1SAi-2750F platforms. The two that we have running have been rock solid.
- Motherboard/ CPU: Supermicro A1SAi-2750F with Intel C2750 Avoton 8C/ 8T SoC
- SSD: Intel S3700 100GB SSD
- Memory: 32GB (4x 8GB) 1600MHz 1.35V Kingston KVR16LSE11/8 DDR3 SODIMMs
- OS: Ubuntu 13.10 daily (20 September 2013)
- Power supply: 1U 200W 80+ Gold PWS-203-1H
We have a second Supermicro A1SAi-2750F setup with a different memory configuration and also an ASRock C2750D4I (pre-production) sample that we have validated numbers with. This setup is the same one that we utilized for our Intel Atom C2750 Avoton power consumption piece recently.
Intel Atom C2750 Benchmarks
[pullquote_right]Bottom line: the Intel Atom C2750 is a huge step forward for Intel and the low end web hosting market.[/pullquote_right]Today we are using our standard Linux benchmarking suite to get an idea about Intel Atom C2750 performance. We start with a clean installation, apply updates, then install the required packages and run the benchmarks. As a direct result of user feedback, we recently had the benchmarks all put into a single script that you can use to benchmark your own systems. See Introducing the STHbench.sh Server Benchmarking Script on the forums. That guide has three simple commands that you can use with a stock installation to run our test suite and install/ compile all necessary files. No configuration is needed. There is also a development version of the script, which currently runs on CentOS and Mint as well and expands the suite to include sysbench and redis-benchmark results.
The net goal is that we want others to be able to reproduce benchmarks and compare directly to their systems. Since we do not have access to every possible configuration, we would appreciate feedback in that thread, which can be as simple as posting the log files from a run. Help is always appreciated!
One other note: unlike most of our other reviews, we are keeping the comparison set relatively small in these benchmark tables. The simple reason is that this is still a “low end” server platform and small text is hard to read. Feel free to compare to other reviews such as the Intel Xeon E5-2697 V2 benchmark piece to get a comparison to other processors.
hardinfo is a well known Linux benchmark that has been around for years. It tests a number of CPU performance aspects.
This is certainly an interesting set of results. The above is sorted on the Blowfish encryption test. We see a few key themes that will be reinforced in other benchmarks. First, we see a major jump in performance from the Intel Atom S1260 to the Intel Atom C2750 generation. In the Blowfish benchmark, the C2750 is competitive with the Intel Xeon E3-1230 V3 and some rather large Amazon EC2 cloud instances. The other key performance aspect we see is that in single-threaded benchmarks, the C2750 is still significantly ahead of the older generation, but does fall off from the Intel Xeon E3-1200 series.
UnixBench 5.1.3 Performance
UnixBench may be a de facto standard for Linux benchmarking these days. There are two main versions: one that tests single CPU performance and one that tests multiple CPU performance. UnixBench segments these results. We run both sets of CPU tests. Here are the single-threaded results:
There is a major myth that the Intel Atom C2750 is many times slower in single-threaded performance than the Intel Xeon E3-1200 series and AMD Opteron series chips. This is true to some extent. What we are seeing is that the jump from the S1260 to the E3-1230 V2 is over 10x in performance, while the jump from the C2750 to the E3-1230 V3 is more like 2x. While the 2x single-threaded performance may seem insurmountable, it should be remembered that platform power is less than 2x and the multi-threaded performance picture is also important to look at.
A number of our cloud instances and lower-power server platforms are single-core only, so they do not have multi-threaded results posted. We again see the Intel Xeon E3-1200 V3 chips best the Intel Atom C2750 by a significant margin, from about 1.5x to just under 2x. On the other hand, we see the Intel Atom C2750 actually beat the single Intel Xeon L5520 processor in the multi-threaded benchmark. That is an interesting comparison because the L5520 has much closer nominal clock speeds to the 2.4GHz Intel Atom C2750, and it shows that the eight physical cores of the Avoton/ Rangeley era stand up well to older-generation 4 core/ 8 thread architectures.
Another great comparison point is that in UnixBench multi-threaded tests the Intel Atom C2750 is more than 20x faster than the Intel Atom S1260. This is not a simple product evolution.
c-ray 1.1 Performance
c-ray is a very interesting ray tracing benchmark. It provides both consistent results and some clear separation. Ray tracing is generally a great multi-threaded CPU benchmark. For this test we use both a simple 7500×3500 render and a more complex 1920×1200 render. Here are the results:
The c-ray 1.1 rendering benchmarks are a personal favorite with multi-core processors. The benchmarks scale very well. We use both a simple render and a more complex one to help compare different architectures. In the next year we will likely have to start adding an even more complex render, given that a dual Intel Xeon E5-2697 V2 system will finish the simple render in just over 800ms and the complex render in under 15 seconds.
We again see the trend with the higher clock speeds of the Haswell-based Intel Xeon E3 chips being a major advantage. Among the other processors on the list, the Intel Atom C2750 holds up very well because this benchmark scales so well across threads. We see the Intel Xeon E3-1200 series chips showing more than a 2x advantage over the C2750. On the other hand, the Intel Atom C2750 completes the render in 86 seconds (1 minute 26 seconds) while the Atom S1260 completes the same render in 1297 seconds (21 minutes 37 seconds).
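To put those two render times in perspective, here is a quick sketch of the arithmetic in Python. The two times are the ones quoted above; the helper names are hypothetical and only there for illustration.

```python
def speedup(baseline_seconds, new_seconds):
    """How many times faster the new result is than the baseline."""
    return baseline_seconds / new_seconds


def as_min_sec(total_seconds):
    """Format a whole number of seconds as minutes and seconds."""
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes} min {seconds} s"


# c-ray render times quoted in the text, in seconds
render_times = {"Atom S1260": 1297, "Atom C2750": 86}

for chip, seconds in render_times.items():
    print(f"{chip}: {seconds} s ({as_min_sec(seconds)})")

ratio = speedup(render_times["Atom S1260"], render_times["Atom C2750"])
print(f"C2750 speedup over S1260: {ratio:.1f}x")
```

By this measure the C2750 finishes the render roughly 15 times faster than the S1260.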
Crafty Chess Performance
Crafty is a well known chess benchmark. It is also one where we saw issues last time with the Phoronix Test Suite and running on ARM CPUs. Here are the Crafty Chess results from simply running “crafty bench”:
Since this is a single-threaded test, we see the Intel Atom C2750 fall behind many of the other solutions. The Intel Haswell architecture performs very well in these tests. We also see a relative win for the Intel Atom S1260 with the C2750 only about 370% faster in this benchmark. Still, the C2750 sets a completely new bar for performance per watt.
Phoronix Test Suite Performance
We are using four tests from the Phoronix Test Suite: pts/stream, pts/compress-7zip, pts/openssl and pts/pybench.
- STREAM by John D. McCalpin, Ph.D. is a very well known memory benchmark.
- 7-zip compression benchmarks were a mainstay in our Windows suite so we are including it again on the Linux side as a compression benchmark.
- The pts/openssl benchmark is very dependent on the CPU architecture being used.
- Python is a widely used scripting language and pyBench is a nice single-threaded Python benchmark.
Here are the results of the Phoronix Test Suite benchmarks:
This is another part of the benchmark suite that will likely be re-done in the 2014 version. These results are sorted by the 7-Zip benchmark since compression is a normal web server task. Since we have already seen quite a few instances with the same themes, the only pointer here is that the Intel Atom C2750 performance gain over the S1260 generation is again over 10x.
As we saw with the power consumption testing, even with twice as many Gigabit Ethernet ports running and 4x the RAM, the C2750 platform shows only about a 2x power consumption increase under full platform load (including network ports) compared to the S1260, yet very similar idle power consumption figures.
The new Intel Atom C2750 is an amazing performer, especially given its low power consumption characteristics. A new architecture, more cores, more cache, higher clock speeds, AES-NI acceleration and additional memory bandwidth have made an amazing impact on performance. This is a fully revolutionary pass at increasing Atom series performance over older generations. Perhaps the most interesting aspect is that systems with local storage and multiple Gigabit Ethernet ports can easily be built on the Intel Atom C2750, so many nodes can run in a small footprint. The Supermicro A1SAi-2750F and added Kingston SODIMMs fit in a 1U mITX form factor. That means that with shared cooling and a 2U chassis, one could conceivably create a full compute cluster at less than the 1A (at 110V or 120V) per 1U ratio seen as a key metric for datacenter deployments. At some point, a company will build a 4-5 node cluster of these including a switch and local storage that can fit in a 2U and 2A profile. That type of machine can quickly be deployed as edge POPs or dedicated server clusters inexpensively and with some local redundancy.
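As a rough sanity check on that density idea, the sketch below works out how much power each node could draw if a 2U chassis is held to 1A per 1U. It assumes the budget is split evenly across nodes and ignores the switch, shared storage and power supply losses, so real per-node headroom would be lower. The function name and the even split are our assumptions for illustration.

```python
def watts_per_node(amps_per_u, rack_units, volts, nodes):
    """Even share of the chassis power budget per node, in watts.

    Assumption: the whole budget goes to the nodes; a switch, shared
    storage and PSU losses would reduce this in a real system.
    """
    chassis_budget_watts = amps_per_u * rack_units * volts
    return chassis_budget_watts / nodes


# 1A per 1U in a 2U chassis, at the two voltages mentioned in the text
for volts in (110, 120):
    for nodes in (4, 5):
        share = watts_per_node(1.0, 2, volts, nodes)
        print(f"{volts}V, {nodes} nodes: {share:.0f}W per node")
```

With a 20W TDP SoC, even the tightest case here (five nodes at 110V) leaves room for memory, drives and networking on each node.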
Bottom line: the Intel Atom C2750 is a huge step forward for Intel and the low end web hosting market.