A New PCIe Switch Option: The Microsemi Switchtec PCIe Gen3 Switch

Microsemi Switchtec PCIe switch architecture

After Flash Memory Summit 2016 we published a piece about how the PLX acquisition impacted NVMe adoption. (See Business side of PLX acquisition: Impediment to NVMe everywhere.) Shortly thereafter we were contacted by Microsemi, a company specifically targeting NVMe storage platforms with a new line of PCIe switch products. Server companies are looking at alternatives to bring larger NVMe array solutions to the marketplace. While the dual port NVMe market is still in its infancy, we expect that HA Active-Active NVMe storage solutions will drive demand for larger NVMe arrays. Here is some information we gathered on the Microsemi Switchtec PCIe switch line that is hitting the market.

Background: How the Market is Shifting Away from SATA

At Flash Memory Summit 2015 we heard that the enterprise SSD market was shifting from SATA SSDs to SAS and PCIe. The reason is simple: large arrays are all SAS, and when it comes down to BOM cost, SAS is only marginally more expensive than SATA. SAS3 is also a significantly higher performing interface than SATA III and includes features for dual port drives that SATA does not have. Microsemi cited an IDC report on the market share of PCIe, SAS and SATA enterprise SSDs to show where the market is moving.

Microsemi SAS PCIe SATA Forecast IDC

The bottom line here is that SAS and PCIe are set to turn SATA into a minority interface by 2018. While SAS will grow into the traditional SAS architecture storage solutions, we expect PCIe (and NVMe) to dominate arrays in 2017. We expect investment in Intel’s Purley platform to generate new designs for dual port NVMe drives. We already see some drives coming to market such as the Intel DC D3000 series. We have also seen systems from ODMs such as AIC, QCT, Tyan, Supermicro and others that support Active-Active configurations using dual port NVMe drives. Although vendors cite PLX switch chip pricing as the key inhibitor to adoption, we are seeing platforms become available to support the market shift away from SATA.

Background 2: The PCIe Lane Problem

PCIe lanes in current generation Intel platforms are a constraint for larger NVMe arrays. An Intel Xeon D processor (used in lower-end arrays) has 32 PCIe lanes. An Intel Xeon E5-1600 V4/ E5-2600 V4/ E5-4600 V4 chip has 40 lanes per processor, so 1, 2 and 4 CPU systems have at most 40, 80, and 160 PCIe 3.0 lanes natively. Adding one or two NVMe drives is not an issue, but larger arrays, such as 24x PCIe x4 drives in a system, present a PCIe lane shortage. 24 drives using 4 lanes each require 96 lanes of PCIe 3.0. Furthermore, when one has 24x NVMe drives they are likely to want higher-speed networking, which requires additional PCIe lanes. For example, the 40GbE adapters we use in our lab are all PCIe 3.0 x8 devices. 100GbE will require a PCIe 3.0 x16 host interface.
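The lane arithmetic above can be sketched in a few lines of Python. This is a rough illustration only: the lane counts come from the figures in this article, the function names are hypothetical, and the link-width check uses the raw PCIe 3.0 line rate (8 GT/s per lane with 128b/130b encoding) without accounting for protocol overhead or any other devices in the system.

```python
# Lane counts are the figures quoted in the article; names are illustrative.
NATIVE_LANES_PER_CPU = 40   # Xeon E5 V4, PCIe 3.0 lanes per socket
DRIVE_COUNT = 24
LANES_PER_DRIVE = 4
NIC_LANES = 16              # one 100GbE adapter on a PCIe 3.0 x16 link

# PCIe 3.0 raw rate per lane in Gbit/s: 8 GT/s with 128b/130b encoding.
PCIE3_GBPS_PER_LANE = 8.0 * 128 / 130


def lanes_required(drives, lanes_per_drive, nic_lanes):
    """Total lanes needed to attach every drive and the NIC directly."""
    return drives * lanes_per_drive + nic_lanes


def min_link_width(ethernet_gbps):
    """Smallest standard PCIe 3.0 link width whose raw rate covers the port."""
    for width in (1, 2, 4, 8, 16):
        if width * PCIE3_GBPS_PER_LANE >= ethernet_gbps:
            return width
    raise ValueError("no single PCIe 3.0 link is wide enough")


demand = lanes_required(DRIVE_COUNT, LANES_PER_DRIVE, NIC_LANES)
for sockets in (1, 2, 4):
    supply = sockets * NATIVE_LANES_PER_CPU
    shortfall = max(0, demand - supply)
    print(f"{sockets} CPU: {supply} native lanes, "
          f"{demand} needed, shortfall {shortfall}")

print("40GbE needs at least x%d" % min_link_width(40))
print("100GbE needs at least x%d" % min_link_width(100))
```

With these numbers the demand is 112 lanes, which only a four-socket system could cover natively, and even then only if nothing else in the chassis needs PCIe.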

Supermicro 4028GR-TR PCIe Backplane

The bottom line is that a typical 24-bay single-port storage server will likely need over 100 PCIe 3.0 lanes even with a single 100GbE network adapter. Beyond this, PCIe hosts can only handle so much bifurcation. That is one reason why we do not see Intel Xeon D-1500 platforms with 16x PCIe 3.0 x2 slots or 8x PCIe 3.0 x4 slots. You may also see the term Non-Transparent Bridge (NTB) thrown around. Active-Active storage servers require a high bandwidth, low latency data path between nodes to communicate, and NTB, which we first saw on PCIe devices over a decade ago, is that path.

To circumvent the shortage of PCIe lanes and service NTB needs, server vendors typically use PCIe switches. That allows higher levels of PCIe bifurcation. At a high level, this operates much like having a 40GbE uplink port or two on a 24 port 10GbE switch. You can connect more devices to fewer upstream ports and oversubscribe the uplinks to better utilize them. PLX has largely been the only game in town for server vendors, but Microsemi is hoping to change this with its new Switchtec products.
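To make the Ethernet analogy concrete, here is a small Python sketch that computes an oversubscription ratio for both cases. The 16-drive, x16-uplink PCIe layout is a hypothetical example chosen for illustration, not a configuration Microsemi or any vendor described to us, and the function name is ours.

```python
def oversubscription_ratio(downstream_capacity, upstream_capacity):
    """Ratio of device-facing capacity to host-facing capacity.

    Units only need to match on both sides: PCIe lanes of the same
    generation, or Gbit/s for Ethernet ports.
    """
    if upstream_capacity <= 0:
        raise ValueError("a switch needs some upstream capacity")
    return downstream_capacity / upstream_capacity


# Ethernet case from the text: 24 x 10GbE ports behind 2 x 40GbE uplinks.
ethernet_ratio = oversubscription_ratio(24 * 10, 2 * 40)

# Hypothetical PCIe case: 16 NVMe drives at x4 each behind one x16 uplink.
pcie_ratio = oversubscription_ratio(16 * 4, 16)

print(f"Ethernet switch: {ethernet_ratio:.1f}:1")
print(f"PCIe switch:     {pcie_ratio:.1f}:1")
```

A ratio above 1:1 only hurts when every downstream device is busy at once, which is why oversubscribing the uplinks tends to improve their utilization in practice.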

The Microsemi Switchtec PCIe Switch

To address the needs of the growing PCIe/ NVMe storage market, Microsemi has a new Switchtec product on the market. Based on a 28nm process, the Microsemi Switchtec PFX and PSX families of PCIe switches are targeted at storage applications. We are excited to see these products on the market ahead of next year’s major hardware design cycle. The company offered STH a briefing with their product team, and we came away with a few key features that we thought we would share. Perhaps the biggest impact is going to be the market impact of having a choice in suppliers, but we saw several key features of the Microsemi product that we believe will offer points of differentiation.

Microsemi Switchtec Highlights

Key Feature #1: Microsemi Switchtec PSX Programmable SoC

The Microsemi Switchtec PSX is particularly interesting as Microsemi is giving server vendors access to the switch’s SoC. The company says that there are applications for this SoC access such as catching errors at the disk shelf level (e.g. if an NVMe device fails) that could otherwise blue screen a system. For storage vendors, this access should be appealing. This API level access to the SoC is the major difference between the PSX and PFX series of PCIe switches. We see the feature as a major differentiation point for vendors who want to build disk shelves with more advanced functionality.

Microsemi Switchtec PFX PSX Family

Key Feature #2: Bifurcation

Microsemi Switchtec can handle PCIe bifurcation down to x2. What that practically means is that on a 96-lane PCIe switch you can have 48 x2 devices. That is unlikely to happen in practice since some of those lanes will go to larger uplinks. At the same time, with the expected explosion in dual port NVMe SSD offerings on the horizon, this allows Microsemi to use fewer PCIe switch chips than common current-generation PLX designs. Fewer switch chips mean lower power consumption, lower complexity and better performance.
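As a quick way to reason about the trade-off between uplink width and device count, the Python sketch below counts how many devices of a given link width fit on a switch once the uplink lanes are set aside. It assumes a 96-lane part as in the example above; the function name and the uplink sizes are illustrative, and it ignores any per-chip limit on the number of ports, which we do not have figures for.

```python
SWITCH_LANES = 96  # lane count used in the example above


def max_devices(total_lanes, uplink_lanes, device_width):
    """Devices of one link width that fit after reserving uplink lanes."""
    if device_width not in (2, 4, 8, 16):
        raise ValueError("device width must be x2, x4, x8 or x16")
    remaining = total_lanes - uplink_lanes
    if remaining < 0:
        raise ValueError("uplinks use more lanes than the switch has")
    return remaining // device_width


# No uplink at all reproduces the 48 x2 devices figure.
print(max_devices(SWITCH_LANES, 0, 2))

# Illustrative uplink choices: one x16, or two x16 for two hosts.
for uplink in (16, 32):
    for width in (2, 4):
        count = max_devices(SWITCH_LANES, uplink, width)
        print(f"x{uplink} uplink, x{width} devices: {count}")
```

The x2 rows are the ones that matter for dual port NVMe drives, where each port of a x4 drive connects as a x2 link.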

Microsemi Switchtec Fewer Switches

Key Feature #3: Choice

This one is not on Microsemi’s marketing slides but the impact on the market will be clear: ODMs/ OEMs will have a choice going forward on PCIe switch chips. The fact that there is another option in the market will be good. Microsemi told us the latency of their switch chips is within 10ns of the PLX designs, which the company maintains is negligible, especially if you can use fewer switch chips in your design. We have already heard from ODMs/ OEMs that they are excited about there being a choice in the near future.

Microsemi Switchtec PCIe switch architecture

Final Words

Time will tell if Microsemi is successful with their new 28nm PCIe switch chip, but it seems like they have a solid feature set. They are also entering a market where ODMs/ OEMs are clearly fatigued by pricing practices and a general lack of options. To users, this is technology we expect to be largely transparent. We have noted in our editorial calendar that in 2017 we will try to get our hands on both PLX and Microsemi designs and run benchmarks on both platforms. We may also set up a side-by-side comparison in our DemoEval service if the opportunity presents itself. There is much work for us to do in the future.

1 COMMENT

  1. Had to snicker a bit at the mention of blue screens… It’s not like a lot of people run their storage platforms on Windows (I hope).
