One of the major challenges with co-packaged optics (CPO) switches is reliability. If a pluggable optic in a switch is unreliable, it can be replaced in seconds. On the other hand, if a switch with co-packaged optics is deemed unreliable, it can require pulling the entire switch and doing internal work on it. Broadcom shared results from a hyperscale study on the Tomahawk 5 CPO solution, TH5-Bailly, to show that CPO is now reliable enough to be deployed.
Broadcom CPO Reliability Testing with TH5-Bailly
In an AI training cluster, links staying up means AI accelerators can coordinate longer without interruption. That directly translates to improved utilization, with expensive AI clusters spending more time doing useful work. Part of this is the MTBF, with CPO showing a 2.5M hour MTBF.
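To put a 2.5M hour MTBF in more familiar terms, here is a minimal sketch that converts it to an annualized failure rate. It assumes a constant failure rate (the standard exponential model), which is our assumption and not necessarily how Broadcom derived its figures. The function name is ours for illustration.

```python
import math

HOURS_PER_YEAR = 8760  # 365 days * 24 hours

def annualized_failure_rate(mtbf_hours: float) -> float:
    """Probability of at least one failure within a year.

    Assumes a constant failure rate (exponential model). This is an
    illustrative assumption, not Broadcom's stated methodology.
    """
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)

# 2.5M hour MTBF is the figure quoted for CPO above.
afr = annualized_failure_rate(2_500_000)
print(f"Annualized failure rate: {afr:.4%}")
```

Under that assumption, the result works out to roughly 0.35% per year.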

Here is a look at the test setup.

As part of this, the Broadcom Fiber Connector, or BFC, is one of the key enablers. Just qualifying that connector and showing that it can survive being deployed reliably is a big part of the journey. That even includes things like dust, since dust getting into these connectors can cause blockages in the optical links.

The annual link failure rate is better with CPO, but the other interesting point here is that under lab stress the dual 400G FR4 optics have significantly higher failure rates than the CPO.

Broadcom says that no link flaps were observed in the first 1 million CPO device hours.

The power from the CPO solution is lower than pluggable modules as well.

That may not seem like a big deal, but when each optical module can use 12W+ and there are large numbers of them being used on both sides of links in clusters of hundreds of thousands of AI accelerators, the power savings can be enormous.
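For a sense of scale, here is a rough back-of-the-envelope sketch. Only the 12W per module figure comes from the discussion above. The link count and the savings fraction are hypothetical placeholders we picked for illustration, not Broadcom numbers.

```python
def optics_power_kw(num_links: int,
                    watts_per_module: float = 12.0,
                    modules_per_link: int = 2) -> float:
    """Total power for pluggable optics, in kW.

    Each link needs a module on both sides, hence modules_per_link=2.
    The 12 W default is the per-module figure mentioned in the article.
    """
    return num_links * modules_per_link * watts_per_module / 1000.0

# HYPOTHETICAL inputs: swap in real values for an actual cluster.
num_links = 100_000        # assumed number of optical links
savings_fraction = 0.30    # assumed CPO saving vs. pluggables (placeholder)

baseline_kw = optics_power_kw(num_links)
saved_kw = baseline_kw * savings_fraction

print(f"Pluggable optics power: {baseline_kw:,.0f} kW")
print(f"Saved at {savings_fraction:.0%} (assumed): {saved_kw:,.0f} kW")
```

With these placeholder inputs, the pluggable optics alone draw about 2.4MW, so even a modest percentage saving per port adds up to hundreds of kilowatts.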
Final Words
We are probably a generation (448G SerDes) or so from folks pushing CPO as almost mandatory. Given the power savings, and if the reliability can prove to be higher, it is likely we are going to see CPO AI clusters in the next few quarters. This is not neat just because it is a new CPO solution. Instead, one of the major sticking points for CPO adoption has been reliability and serviceability. Just being able to show that a CPO solution is not just equal to, but more reliable than, pluggable optics is a big win. There is still a lot more that goes into these systems, but CPO, not just for switches but for entire AI clusters, is going to be a key enabler for the huge clusters that are being designed today.




My mom always told me to get more fibre, and I’ve been following this advice ever since!
On a serious note, that’s a very impressive setup! It’s always cool to read about the high-end stuff here, even though I am even more grateful for the articles focusing on the lower-budget stuff, such as the excellent 2.5 Gbps (and more and more 10G as well) switch review series. Keep up the good work!
I wonder how they’re defining “link failure” for the dual pluggable setups to find that a second optic increases MTBF, but by less than 2x.
A NIC with redundant connections through its two ports shouldn’t be too disrupted if one optic goes down. So I’d think of the reliability of dual QSFP like the reliability of a RAID1 array: it’s not defined by how often some component fails, but how often multiple components fail and make the system stop working before you can do repairs to get back to full health. Assuming you fix individual failed parts relatively quickly the overall system’s uptime can end up quite good.
That’s if you’re only relying on the two links for redundancy. Alternatively, if a node *does* need both optics working for you to call it ‘up’ (say, your use case needs the combined bandwidth of the two optics), I’d expect dual optics to decrease reported MTBF, not increase it at all.