Intel Xeon Phi x200 (Knights Landing) – x86 Code Compatibility – It boots Windows!

by Patrick Kennedy, December 9, 2016

We have had the opportunity to start testing another Intel Xeon Phi x200 Knights Landing system, the Supermicro SYS-5038K-I-ES1. The system is meant to be a desktop development platform for Knights Landing processors that is priced under a departmental $5000 procurement threshold. By default, it comes equipped with a single-CPU motherboard, a 64-core Intel Xeon Phi 7210 processor liquid cooled by CoolIT Systems, and 6x 16GB DDR4 RDIMMs.

When we first booted the system, we used an Intel DC S3700 400GB SSD holding a Windows Server 2012 R2 installation that had not been updated in over a year. We watched the system boot via its iKVM functionality, and it worked without issue. We then tried running a program (Cinebench R15) just to see if it would work. Indeed, Windows saw a 256-thread processor and ran the ray tracing application out-of-the-box.
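For readers wondering where the 256 threads come from: the Xeon Phi 7210 has 64 cores, and KNL runs four hardware threads per core (that per-core figure is our addition here, not something stated above). Below is a minimal Python sketch of the arithmetic, plus an informational check of what the operating system reports. The helper name `expected_logical_cpus` is hypothetical, and `os.cpu_count()` may not match the hardware total on every OS and Python version, so treat the comparison as a hint rather than a verdict.

```python
import os

CORES = 64            # Xeon Phi 7210 core count, as stated in the article
THREADS_PER_CORE = 4  # assumption: KNL's four hardware threads per core


def expected_logical_cpus(cores: int, threads_per_core: int) -> int:
    """Return the number of logical CPUs an OS should enumerate."""
    return cores * threads_per_core


if __name__ == "__main__":
    expected = expected_logical_cpus(CORES, THREADS_PER_CORE)
    visible = os.cpu_count()  # can be None if the count is undetermined
    print(f"expected logical CPUs: {expected}")
    print(f"logical CPUs reported by this machine: {visible}")
```

Run on any machine, the second line simply reports that machine's own count; only on a 7210 with all threads enabled would the two numbers be expected to agree.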

Supermicro SYS 5038K I ES1 Front Three Quarters

We did not do any performance optimizations for this run, and there are many available for the KNL system. The run did not utilize any of the KNL features that give the platform its true power, such as AVX-512. That testing is coming. We did, however, validate one of the major selling points of KNL: code made to run on other Intel x86 architectures will work without modification on the new Intel Xeon Phi x200 (KNL) generation. The impact of this is enormous and is a key reason we saw so many KNL supercomputer system wins at SC16 this year. Applications can access a highly parallelized architecture without needing a co-processor. The Intel Xeon Phi x200 has direct access to the hex-channel system RAM as well as 16GB of high-bandwidth, low-latency MCDRAM, without the performance penalty of traversing the PCIe bus.
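As a rough way to confirm that a system exposes AVX-512 before optimizing for it, the sketch below lists AVX-512 feature flags from Linux `/proc/cpuinfo`-style text. This is an illustration under stated assumptions: it is Linux-only (not the Windows install used above), the function name `avx512_flags` is hypothetical, and the sample flag line is made up for demonstration rather than captured from our test system.

```python
def avx512_flags(cpuinfo_text: str) -> list:
    """Return the sorted AVX-512 feature flags found in cpuinfo-style text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # Collect every flag whose name starts with "avx512".
            found.update(f for f in value.split() if f.startswith("avx512"))
    return sorted(found)


# Made-up sample line; KNL is documented as supporting the F, CD, ER and PF subsets.
SAMPLE = "processor\t: 0\nflags\t\t: fpu sse2 avx2 avx512f avx512pf avx512er avx512cd\n"

if __name__ == "__main__":
    print("sample:", avx512_flags(SAMPLE))
    try:
        with open("/proc/cpuinfo") as fh:  # exists on Linux only
            print("this machine:", avx512_flags(fh.read()))
    except FileNotFoundError:
        print("no /proc/cpuinfo here; skipping the live check")
```

An empty list means no AVX-512 flags were advertised, which is what most older x86 machines would show.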

Check out the video for the results and another view of the system.

We have many more useful benchmarks, workloads and information queued up for the KNL systems. For now, we wanted to show that this system is capable of running legacy x86 code out-of-the-box. This is a question we have received from readers more than 100 times over the past year, so we hope this video is a definitive answer.

Stay tuned for more to come on Knights Landing. In the meantime, you can check out our starter CUDA machine learning / AI / deep learning workstation guide. We have a lot of machine learning content hitting STH over the next few months.

About The Author
Patrick Kennedy
Patrick has been running ServeTheHome since 2009 and covers a wide variety of home and small business IT topics. For his day job, Patrick is a management consultant focused on the technology industry and has worked with numerous large hardware and storage vendors in Silicon Valley. The goal of STH is simply to help users find some information about basic server building blocks. If you have any helpful information please feel free to post on the forums.
5 Comments
  • Colin Stuart
    December 9, 2016 at 1:07 pm

    wow, very cool! Look forward to more results

  • jsp
    December 10, 2016 at 7:34 am

    Outstanding. Cannot wait to see the results.

  • Jesper
    December 11, 2016 at 11:05 am

    Would be really cool if you guys tested various non-HPC related tasks too, well, I mean, stuff it wasn’t really designed to do, like how it would perform on multithreaded image processing like http://www.graphicsmagick.org

    MySQL or MemSQL performance?

    nginx (load balancing and/or webserver) and/or haproxy

    Encoding, h264 and/or h265 etc etc

    Yes, I know it wasn’t meant to do these tasks, but still, it would be cool to see if it performed well in any of them. Well, come to think of it, image processing should be pretty suitable for the Phi?

  • Jordan Viray
    December 12, 2016 at 6:43 pm

    Is the MCDRAM low latency compared to the DDR4? That’s a benchmark metric I’d like to see. It sits very close but given the higher bandwidth, it might not have better latency and wouldn’t be suited for use as a cache in many applications.
