Supermicro Hyper-Speed Server Benchmarks

Posted January 2, 2013 by Patrick Kennedy in Servers
Supermicro Hyper-Speed 6027AX-TRF Front

The Supermicro Hyper-Speed server sent to STH for review has been featured several times recently, mostly with a focus on features. For those not familiar with the Supermicro Hyper-Speed server line, it allows selected pre-configured components to run faster than their native specifications. For example, the Intel Xeon E5-2687W is the fastest Xeon E5-2600 series CPU at the time of this testing, and the Supermicro Hyper-Speed line can run these processors 6% faster than other systems. Likewise, while DDR3-1600 may be fast, the Hyper-Speed line runs memory at over 1900MHz, faster than even the DDR3-1866 spec provides for. The key here is that Supermicro picks parts for the Hyper-Speed line to ensure stability, validating that the systems will work reliably in data center environments.

Test Configuration

Supermicro sent the following test configuration for our testing. This represents one common configuration for a compute node. One other popular configuration is using dual Intel Xeon E5-2637 CPUs (4C/8T) for applications where one needs high clock speed and lower core counts due to per-core license costs.

For years, performance differences in servers with the same memory, storage, and CPUs were less than 1%. The Supermicro Hyper-Speed line changes this, producing tangible performance boosts for those looking for every last bit of reliable performance.

Overall, this is the fastest dual socket configuration available with today’s processors. Currently the Intel Xeon E5-2600 series tops out with the Intel Xeon E5-2687W at a 3.1GHz base clock and 8 cores / 16 threads per CPU. We used the settings found in the Hyper-Speed BIOS piece to take the chips to a 106MHz BCLK, an increase of 6% over stock.

Supermicro Hyper-Speed Benchmarks

For this piece, we use a few benchmarks to see the impact of Supermicro Hyper-Speed overclocking. The unit was tested at stock clocks (100MHz BCLK and DDR3-1600) as well as at 106MHz BCLK and 1978MHz DDR3 speeds. Since the Hyper-Speed line is meant for applications such as High-Frequency Trading and HPC markets, we also ran some of the benchmarks on other reference architectures.
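As a rough illustration of what the BCLK change means in clock terms, the sketch below scales the stock 3.1GHz base clock and a memory ratio by the 106MHz BCLK. The DDR3-1866 memory ratio is an assumption on our part (the article only states the resulting 1978MHz figure), and `scaled_clock` is a hypothetical helper, not a Supermicro tool.

```python
def scaled_clock(stock_mhz: float, bclk_mhz: float, stock_bclk_mhz: float = 100.0) -> float:
    """Scale a clock that is derived from BCLK by the ratio of new to stock BCLK."""
    return stock_mhz * bclk_mhz / stock_bclk_mhz


CPU_BASE_MHZ = 3100.0          # Xeon E5-2687W base clock, from the article
HYPER_SPEED_BCLK_MHZ = 106.0   # BCLK used for the Hyper-Speed runs
ASSUMED_MEMORY_RATIO_MHZ = 1866.0  # assumption: DDR3-1866 ratio, not stated in the article

cpu_mhz = scaled_clock(CPU_BASE_MHZ, HYPER_SPEED_BCLK_MHZ)
memory_mhz = scaled_clock(ASSUMED_MEMORY_RATIO_MHZ, HYPER_SPEED_BCLK_MHZ)

print(f"CPU base clock at 106MHz BCLK: {cpu_mhz:.0f} MHz")
print(f"Memory speed at 106MHz BCLK:   {memory_mhz:.0f} MHz")
```

Under that assumption the memory figure lands on the 1978MHz quoted above, which suggests the memory speed rises with BCLK in the same proportion as the CPU clock.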

Folding@Home – GROMACS Protein Folding

GROMACS is a widely used molecular dynamics package. Folding@Home is a fairly mature GROMACS application that can run under different operating systems and is easily obtainable. Speed is often measured in Time Per Frame or TPF. A TPF measurement is the amount of time it takes to complete 1% of the work assigned to the node. Therefore we look for the lowest TPF possible. The results here may vary slightly compared to what one sees on a given work unit. We are using the same work unit across configurations to minimize variances. The test does scale well to 64 logical CPUs and beyond.
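To make the TPF definition concrete, here is a minimal sketch that converts a TPF into total work unit time and compares two configurations. The TPF values are placeholders chosen for easy arithmetic, not measurements from this review, and the function names are our own.

```python
def work_unit_seconds(tpf_seconds: float) -> float:
    """One frame is 1% of a work unit, so a full unit takes 100 frames."""
    return tpf_seconds * 100


def tpf_reduction_percent(stock_tpf: float, faster_tpf: float) -> float:
    """Percentage by which the faster configuration lowers TPF versus stock."""
    return (stock_tpf - faster_tpf) / stock_tpf * 100


# Placeholder TPF values in seconds (hypothetical, NOT results from this article).
stock_tpf = 100.0
faster_tpf = 95.0

print(f"Stock work unit time:  {work_unit_seconds(stock_tpf):.0f} s")
print(f"Faster work unit time: {work_unit_seconds(faster_tpf):.0f} s")
print(f"TPF reduction: {tpf_reduction_percent(stock_tpf, faster_tpf):.1f}%")
```

Because a frame is a fixed 1% slice, a small per-frame saving is multiplied by 100 over the whole work unit.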

Supermicro Hyper-Speed GROMACS Folding


As one can see, Supermicro’s Hyper-Speed implementation does show a marked improvement over the stock speed dual Intel Xeon E5-2687W configuration. The test scales well with core count and clock speed. Memory speed also has some impact on the results, so this is a solid showing from the Hyper-Speed line. As an interesting note, Folding@Home uses a logarithmic scale to value contributions, so faster return times equal more “points” that donors can earn. Those serious about helping the Stanford project therefore see a large benefit even from results that are a single second faster.

Crafty Chess

Chess benchmarks are used as a stand-in for complex simulations that calculate probabilities. The Crafty Chess benchmark has become one of several benchmarks that currently define the space.

Supermicro Hyper-Speed Crafty Chess


Here we see that the Hyper-Speed system does show a consistent performance gain over the stock clocked part.

Stream

Stream is a benchmark that needs virtually no introduction. It is considered by many to be the de facto memory performance benchmark. Authored by John D. McCalpin, Ph.D., it can be found at http://www.cs.virginia.edu/stream/ and is very easy to use.

Supermicro Hyper-Speed Stream


At first, I was in total disbelief after running the four tests ten times on each configuration, throwing out the min/max and averaging the rest. The Supermicro Hyper-Speed server has a huge advantage in this test due to a few factors, not the least of which is the massively improved memory clock rate. Stock Xeon E5-2600 series CPUs run DDR3 memory at 1600MHz if the memory is capable. The Supermicro Hyper-Speed platform runs the memory at 1978MHz. The result shows in the above graph.
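The averaging approach described here (ten runs, drop the minimum and maximum, average the remainder) can be sketched as follows. The sample values are placeholders rather than Stream results from this review, and `trimmed_mean` is simply our name for the helper.

```python
def trimmed_mean(samples: list[float]) -> float:
    """Average the samples after discarding one minimum and one maximum value."""
    if len(samples) < 3:
        raise ValueError("need at least three samples to drop min and max")
    ordered = sorted(samples)
    kept = ordered[1:-1]  # drop the single lowest and single highest run
    return sum(kept) / len(kept)


# Ten placeholder run results (hypothetical values, NOT measurements from the article).
runs = [10.0, 12.0, 11.0, 11.0, 13.0, 11.0, 12.0, 11.0, 12.0, 20.0]
print(f"Trimmed mean of {len(runs)} runs: {trimmed_mean(runs):.3f}")
```

Dropping the extremes keeps one unusually slow or fast run from skewing the reported figure, at the cost of using eight of the ten samples.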

Linpack 11.0.1

Linpack is probably the most well known HPC benchmark. It is used to rate the Top500 supercomputers in the world.

Supermicro Hyper-Speed Linpack


I will note that these are not the bleeding-edge tuned performance numbers that most vendors use. They are more representative of an out-of-the-box OS and Linpack compilation. One can see some significant improvement, which is important for those looking for maximum performance.

STH JMeter Transaction Benchmark

High-frequency trading algorithms are fairly difficult to get hold of since they represent very valuable IP. What we were able to do is run a JMeter-based test against a local Silicon Valley start-up’s solution. We tend not to use proprietary benchmarks like this, but the simple explanation for what is going on is that the setup has a source server holding a 220GB data set in RAM plus a smaller ~30GB local data set, and the algorithm attempts to predict and match requests based on these data sources. Unfortunately, the test chokes on 1GbE and 10GbE, so we had to use on-hand Mellanox ConnectX-2 MHQH19B-XTR 40Gbps QDR InfiniBand cards to string together a high-performance 40Gbps network to run the tests. As their offering matures, hopefully we can feature their public solution later in 2013. Instead of using raw numbers, we use the Dual Intel Xeon X5670 as a baseline, then look at transactions serviced per second compared to that baseline.
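The baseline normalization can be sketched as below: every system's transactions per second is divided by the baseline system's figure, so the baseline reads 1.0. The system labels other than the Dual Intel Xeon X5670 baseline and all throughput numbers are placeholders, not data from this test.

```python
def relative_to_baseline(tps_by_system: dict[str, float], baseline: str) -> dict[str, float]:
    """Express each system's transactions per second as a multiple of the baseline."""
    base_tps = tps_by_system[baseline]
    if base_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    return {name: tps / base_tps for name, tps in tps_by_system.items()}


# Placeholder throughput figures (hypothetical, NOT results from the article).
raw_tps = {
    "Dual Xeon X5670 (baseline)": 1000.0,
    "Example system A": 1500.0,
    "Example system B": 1650.0,
}

for name, ratio in relative_to_baseline(raw_tps, "Dual Xeon X5670 (baseline)").items():
    print(f"{name}: {ratio:.2f}x")
```

Reporting ratios rather than raw counts keeps the proprietary workload's absolute numbers private while still showing how the systems compare.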

Supermicro Hyper-Speed JMeter Transactions Per Second Comparison

This is an interesting result. It does fall in line with what we would expect given the performance figures mentioned above. For those wondering, this was similar to what we saw on these machines with our new JMeter-based Apache/PHP transaction benchmark, but we wanted to verify using something closer to the proprietary solutions used in the field.

Closing Thoughts

The performance speaks for itself: it is simply faster than one can get in competing systems. For years, performance differences in servers with the same memory, storage, and CPUs were less than 1%. The Supermicro Hyper-Speed line changes this, producing tangible performance boosts for those looking for every last bit of reliable performance. Although this piece did not delve into power consumption, heat and stability, the Folding@Home tests were done three times each at stock speeds, 104 BCLK, 105 BCLK and 106 BCLK (the numbers used above). Voltage never needed to be increased for the multi-day benchmarking sessions that relentlessly pegged the twin CPUs at 100%.


About the Author

Patrick Kennedy

Patrick has been running ServeTheHome since 2009 and covers a wide variety of home and small business IT topics. For his day job, Patrick is a management consultant focused on the technology industry and has worked with numerous large hardware and storage vendors in Silicon Valley. The goal of STH is simply to help users find some information about basic server building blocks. If you have any helpful information please feel free to post on the forums.

7 Comments


  1.  
    Elliswoth, J

    Fast! Going to do more of your traditional Windows benchmarks also?




  2.  
    dba

    I am most impressed by the STREAM results. 96GB/S is extremely good for a two-CPU setup. By the way, are you providing STREAM Copy or Triad results?




  3.  
    Teddy

    I can see why this is great for those that need the performance.




  4.  
    Laugh|nGMan

    My UP X9SRL-F E5-1620 + WIN2012 + 4×16 Gb DDR3 1600 ECC REG KVR16R11D4/16 managed in STREAM only ~35,6 – 38,3 Gb/s [http://i.imgur.com/ZPjo4.jpg]. SiSoftware Sandra Business 2013 similar results ~ 37 Gb/s [http://i.imgur.com/yNEUF.jpg], so i little bit pissed, expected to hit half of what hp DP e5-2600 xeons shows… ~ 87,7 Gb/s… according to HP documents.




  5.  
    typefirst

    Two processors should give double right? laughingman you are probe about right




  6.  
    zakari

    Thanks for the great article.

    Are you by any chance planning to take some ‘actual’ measurements of power consumption (from the wall socket) ?

    I cant really imagine how the same CPU running faster cannot consume a little more.




