Hyper-V Replication with Mellanox 40GbE: Greatly improved

October 29, 2015

Mellanox ConnectX-3 VPI Impact on Wordpress during Hyper-V Replication

Recently we were in the process of decommissioning a few Hyper-V nodes we previously used to host STH VMs. While that setup was running, we experienced significant intermittent latency on the CentOS VMs, and the root cause was not readily apparent from MariaDB, nginx or the rest of the web stack. We ended up changing our replication network from a dedicated 1GbE NIC to a Mellanox ConnectX-3 VPI NIC in 40GbE mode. That change had a noticeable impact on the site’s performance, significantly reducing service degradation during replication periods.

Test setup

Each node was set up identically:

CPU: Dual Intel Xeon E5-2670 (V1)
RAM: 128GB DDR3 RDIMM (8x 16GB)
SSDs: Intel 710 (OS) and Pliant / SanDisk LB406m
HDs: Western Digital Red 4TB
1GbE Networking: Intel i350 based
Other Networking: Mellanox ConnectX-3 VPI onboard (FDR IB/ 40GbE)
OS: Microsoft Hyper-V Server 2012 R2
Guest VMs: Ubuntu 13.10 64-bit – approximately 20 web hosting related VMs and related database VMs.

Overall the hypervisors were rarely over 30-50% CPU load due to extensive caching.

Supermicro SYS-6027TR-D71FRF Motherboard Tray — Supermicro 2U Twin with onboard Mellanox

The impact of changing networks

Since our setup had each Hyper-V node replicating to the alternate node every 10 minutes, there was significant replication traffic. We were using one 1GbE link for both replication and access to backup storage, leaving the other 1GbE link for non-storage tasks (e.g. serving web pages).

We first changed the Mellanox ConnectX-3 VPI cards to Ethernet mode. As our guide on how to change Mellanox VPI cards to Ethernet mode in Windows shows, this is extremely easy to do:

Mellanox ConnectX-3 VPI Change Ports from IB to Ethernet

The next step was connecting the two nodes with a DAC (direct-attach copper cable) and configuring the network settings on the new link.

After some time, our average performance improved significantly and we saw significantly lower maximum latency. We compared the time stamps of the main STH WordPress site’s response times against the start point of replication jobs. We then shifted the timings by a few seconds to overlay the two jobs in a scatter plot showing the per-second average and maximum response times during the replication on that particular VM. To get a similarly sized replication, we used a checkpoint of the VM to let replication run once on 1GbE before the switchover and once on 40GbE after it. This is not as scientific as reproducing in a lab environment; however, it is real production data. Here is what the data looked like when we did the analytics on the log files:
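The per-second aggregation described above can be sketched roughly as follows. This is a minimal illustration, not the script we actually used: it assumes the log has already been parsed into (timestamp, response time) pairs in seconds, and the function name `per_second_stats` and the sample values are hypothetical.

```python
import math
from collections import defaultdict


def per_second_stats(requests, replication_start):
    """Group response times into one-second buckets relative to replication start.

    requests: iterable of (timestamp_seconds, response_time_seconds) pairs.
    Returns {offset_seconds: (average_response, max_response)}.
    """
    buckets = defaultdict(list)
    for timestamp, response_time in requests:
        # Whole seconds elapsed since the replication job started.
        offset = math.floor(timestamp - replication_start)
        if offset >= 0:  # ignore requests logged before the job began
            buckets[offset].append(response_time)
    return {
        offset: (sum(times) / len(times), max(times))
        for offset, times in sorted(buckets.items())
    }


# Made-up sample values for illustration only, not our production data.
run_1gbe = [(99.5, 0.10), (100.2, 0.30), (100.7, 0.50), (101.1, 2.00)]
stats_1gbe = per_second_stats(run_1gbe, replication_start=100.0)
for offset, (average, maximum) in stats_1gbe.items():
    print(f"t+{offset}s avg={average:.2f}s max={maximum:.2f}s")
```

Running the same function over the 1GbE and 40GbE logs, each with its own replication start time, yields two series keyed by the same offsets, which is what makes an overlay on one plot possible.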

The time series plots are in seconds, and we purposefully aligned the start of the latency spikes to make the data easy to interpret. The major impact here is clear: the maximum response times of our requests were much lower and there was a shorter period of disruption during the replication. The period of service degradation did not decrease by 97%+ as one might expect from a 40x faster pipe, so other components certainly play a part here, but the change did reduce the periods of degradation significantly.
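For reference, the 97%+ figure comes from simple arithmetic: going from 1GbE to 40GbE is a 40-fold increase in link rate, so a job limited purely by the link would take 1/40 of the time. The sketch below works that out and adds a simple Amdahl-style estimate for the case where only part of the degradation window is network-bound. The function names are hypothetical and the 0.5 fraction is an illustrative assumption, not something we measured.

```python
def ideal_reduction(old_gbps, new_gbps):
    """Fractional time saved if the job were limited purely by link speed."""
    return 1.0 - old_gbps / new_gbps


def expected_reduction(network_bound_fraction, old_gbps, new_gbps):
    """Time saved when only part of the window is network-bound.

    network_bound_fraction is an assumed value between 0 and 1; the rest of
    the window (storage, CPU, etc.) is treated as unaffected by the faster link.
    """
    return network_bound_fraction * ideal_reduction(old_gbps, new_gbps)


# 1GbE -> 40GbE, fully network-bound: the best case.
print(f"{ideal_reduction(1, 40):.1%}")

# Same upgrade, assuming (hypothetically) half the window is network-bound.
print(f"{expected_reduction(0.5, 1, 40):.1%}")
```

The gap between the two results is one way to think about why a much faster link does not shrink the degradation window proportionally.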

Since then we have upgraded our back-end networks in our primary hosting datacenters to 10GbE and our new Sunnyvale test lab to 10GbE and 40GbE. We also changed to Linux, with faster local storage, processors and memory, and made several other upgrades to the application layer. This was done in early 2014, and much (everything) has changed since.