Inspur Systems NF5468M5 Review 4U 8x GPU Server

March 5, 2019

Inspur Systems NF5468M5 Test Configuration

Our Inspur Systems NF5468M5 test configuration was robust and similar to what we expect to see at many hyperscalers and cloud service providers using this type of system:

System: Inspur Systems NF5468M5
CPU: 2x Intel Xeon Gold 6130
GPU: 8x NVIDIA Tesla V100 32GB PCIe
RAM: 768GB (24x 32GB) DDR4-2666 at 2666MHz
OS SSD: Intel DC S3520 240GB
NVMe SSD: 4x Intel DC P4600 3.2TB
RAID SATA SSD: 8x Intel DC S4500 960GB
RAID Controller: Broadcom (LSI) 9460-8i
1GbE NIC: Intel i350-AM4
10GbE NIC: Intel X722 Onboard Dual SFP+ 10GbE
25GbE NIC: Mellanox ConnectX-4 Lx dual-port 25GbE
100GbE NIC: Mellanox ConnectX-5 VPI dual-port 100Gbps EDR InfiniBand and 100GbE

We swapped 25GbE and 100GbE NICs as well as some of the CPUs in the system to customize the solution a bit for our lab’s needs. In a system like this, the GPU costs tend to dominate the overall configuration.

Overall, with fast networking, plenty of memory, and plenty of local NVMe storage, these are high-end systems. They are also a step beyond the previous-generation Intel Xeon E5 V4 based systems because they use additional PCIe lanes for NVMe storage. This is an area where the newer Intel Xeon Scalable platform scales better than its predecessor.
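As a rough illustration of the lane budget behind that point, the arithmetic can be sketched in Python. The per-socket lane counts (40 for Xeon E5 V4, 48 for first-generation Xeon Scalable) and the x4 link per NVMe drive are general platform figures we are assuming here, not measurements from this review, and the sketch makes no claim about how Inspur actually routes lanes inside the NF5468M5.

```python
# Rough PCIe 3.0 lane budget for a dual-socket server.
# Assumptions (not from the review): 40 lanes per Xeon E5 V4 socket,
# 48 lanes per Xeon Scalable socket, and one x4 link per NVMe SSD.
LANES_PER_SOCKET = {"Xeon E5 V4": 40, "Xeon Scalable": 48}
SOCKETS = 2
NVME_DRIVES = 4       # 4x Intel DC P4600 in the test configuration
LANES_PER_NVME = 4    # assumed x4 link per drive


def lane_budget(platform: str) -> dict:
    """Return total CPU lanes and what remains after the NVMe drives."""
    total = LANES_PER_SOCKET[platform] * SOCKETS
    nvme = NVME_DRIVES * LANES_PER_NVME
    return {"total": total, "nvme": nvme, "remaining": total - nvme}


for name in LANES_PER_SOCKET:
    print(name, lane_budget(name))
```

Under these assumptions, the four NVMe drives take the same 16 lanes on either platform, but the newer platform leaves more lanes for GPUs, PCIe switches, and NICs.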

Inspur NF5468M5 Topology

Topology in modern servers is a big deal. It dictates performance, especially when I/O is taxed by heavily NUMA-sensitive workloads. Here is the overall system topology of the Inspur NF5468M5:

Adding in the GPUs, and excluding any Mellanox NICs capable of GPUDirect RDMA, here is the system topology from nvidia-smi:

Inspur NF5468M5 NVIDIA Topology Without NICs
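For readers who want to produce the same kind of matrix on their own hardware, here is a minimal Python sketch. It shells out to `nvidia-smi topo -m`, the topology matrix command of the NVIDIA driver utility. The helper name `gpu_topology_matrix` is our own, and the function simply returns None when the tool is missing or fails.

```python
import shutil
import subprocess
from typing import Optional


def gpu_topology_matrix() -> Optional[str]:
    """Return the text of `nvidia-smi topo -m`, or None if unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None  # NVIDIA driver utilities are not installed
    try:
        result = subprocess.run(
            [exe, "topo", "-m"],
            capture_output=True,
            text=True,
            check=False,
            timeout=30,
        )
    except (OSError, subprocess.SubprocessError):
        return None  # could not launch the tool, or it timed out
    if result.returncode != 0:
        return None  # e.g. no GPU present or driver not loaded
    return result.stdout


if __name__ == "__main__":
    matrix = gpu_topology_matrix()
    print(matrix if matrix is not None else "nvidia-smi not available")
```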

This is a dual root system. You can see that GPU0 – GPU3 are on one PCIe switch off of the first CPU. GPU4 – GPU7 connect to a different PCIe switch on the second CPU.
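The dual root layout described above can be modeled in a few lines of Python. This is only a sketch of the grouping stated in the text; the switch names and the path descriptions are illustrative labels of our own, not output from nvidia-smi.

```python
# Dual-root layout as described in the text: GPU0-GPU3 sit behind one
# PCIe switch on the first CPU, GPU4-GPU7 behind a second switch on the
# second CPU. Switch and CPU names are illustrative labels only.
GPU_TO_SWITCH = {gpu: ("switch0" if gpu < 4 else "switch1") for gpu in range(8)}
SWITCH_TO_CPU = {"switch0": "CPU0", "switch1": "CPU1"}


def cpu_of(gpu: int) -> str:
    """Return the CPU whose PCIe root the given GPU hangs off."""
    return SWITCH_TO_CPU[GPU_TO_SWITCH[gpu]]


def gpu_path(a: int, b: int) -> str:
    """Describe the path that traffic between GPU a and GPU b takes."""
    if a == b:
        return "same device"
    if GPU_TO_SWITCH[a] == GPU_TO_SWITCH[b]:
        return "same PCIe switch"
    return f"across {cpu_of(a)} and {cpu_of(b)} via UPI"


print(gpu_path(0, 3))
print(gpu_path(0, 4))
```

In this model, any pair within GPU0 to GPU3 or within GPU4 to GPU7 stays on one switch, while any pair that spans the two groups has to cross between the CPUs.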

These systems are designed to be used not just as single servers, but also in clusters. Mellanox has invested heavily in RDMA for both InfiniBand and Ethernet, which helps immensely in GPU-to-GPU communication. Here, we can see a common topology for this class of system.

Inspur NF5468M5 8x Tesla V100 Plus Two Mellanox ConnectX-5 Topology

One PCIe x16 slot for networking is attached to each PCIe switch. That allows GPUs to communicate over the network directly, without having to go through the CPU and UPI bus. As you scale out with PCIe-based GPU deep learning systems, this is a popular system topology.
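To make the one-NIC-per-switch idea concrete, here is a small Python sketch that pairs each GPU with the NIC on its own PCIe switch. The device names `nic0` and `nic1` are hypothetical placeholders, and the mapping only encodes the layout described in the text, not anything queried from the system.

```python
# One x16 NIC slot hangs off each PCIe switch, so every GPU has a
# "local" NIC. The names "nic0" and "nic1" are hypothetical placeholders.
SWITCH_OF_GPU = {gpu: gpu // 4 for gpu in range(8)}  # GPU0-3 -> 0, GPU4-7 -> 1
SWITCH_OF_NIC = {"nic0": 0, "nic1": 1}


def local_nic(gpu: int) -> str:
    """Pick the NIC that shares a PCIe switch with the given GPU."""
    for nic, switch in SWITCH_OF_NIC.items():
        if switch == SWITCH_OF_GPU[gpu]:
            return nic
    raise LookupError(f"no NIC on the same switch as GPU{gpu}")


for gpu in range(8):
    print(f"GPU{gpu} -> {local_nic(gpu)}")
```

Sending each GPU's network traffic through its local NIC keeps that traffic on one PCIe switch, which is the property that lets it avoid the CPU and UPI path.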

Next, we are going to take a look at the Inspur Systems NF5468M5 management interface before moving on to our performance section.

9 COMMENTS

Chet Reed March 5, 2019 At 4:43 pm

Ya’ll are doing some amazing reviews. Let us know when the server is translated on par with Dell.

Juno Shi March 5, 2019 At 8:50 pm

How wonderful this product review is! So practical and justice!

Tomas R March 5, 2019 At 10:38 pm

Amazing. For us to consider Inspur in Europe English translation needs to be perfect since we have people from 11 different first languages in IT. Our corporate standard since we are international is English. Since English isn’t my first language I know why so early of that looks a little off. They need to hire you or someone to do that final read and editing and we would be able to consider them.

The system looks great. Do more of these reviews

Misha Engel March 6, 2019 At 5:51 am

Thanks for the review, would love to see a comparison with MI60 in a similar setup.

Rod Howard March 6, 2019 At 6:15 am

Great review! This looks like better hardware than the Supermicro GPU servers we use.

Matthias Wolf March 6, 2019 At 7:43 am

Can we see a review of the Asus ESC8000 as well? I have not found any other gpu compute designer that offers the choice in bios between single and dual root such as Asus does.

Patrick Kennedy March 6, 2019 At 9:00 am

Hi Matthias – we have two ASUS platforms in the lab that are being reviewed, but not the ASUS ESC8000. I will ask.

Misha Engel March 9, 2019 At 5:25 am

How is the performance affected by CVE‑2019‑5665 through CVE‑2019‑5671 and CVE‑2018‑6260?

Jerry June 3, 2019 At 11:08 am

P2P bandwidth testing result is incorrect, above result should be from NVLINK P100 GPU server not PCIE V100.


REVIEW OVERVIEW
Design & Aesthetics 9.3
Performance 9.7
Feature Set 9.6
Value 9.5
SUMMARY Our Inspur Systems NF5468M5 review shows how this 4U 8x NVIDIA Tesla V100 32GB server compares to other offerings on the market and performs
OVERALL SCORE 9.5
