Concluding Hot Chips 32 was perhaps the most profound talk. The Lightmatter Mars chip aims to do AI inferencing computation by modulating the wavelength of light. This optical computation means that the calculations can be done fast with the majority of power being consumed outside of the optical array performing the calculation.
To be clear, this is probably not the technology that will be in your data center in 2020/2021, but it may be the most profound challenge to an industry norm presented at the show, similar to Carbon Nanotube NRAM Exudes Excellence in Persistent Memory at Hot Chips 30 or Cerebras Wafer Scale Engine AI chip is Largest Ever at Hot Chips 31.
Lightmatter Mars for AI Inferencing
Taking a step back, the question is why one would use silicon photonics instead of traditional transistors for AI inferencing. The basic idea here is that light travels faster, scales better, and uses less power. The challenge, of course, is that the industry has decades of experience doing computation with electrical transistors.
Perhaps the biggest savings derives from transport. Since an optical chip is modulating light rather than switching a charge gate on and off, and since the loss over chip-scale distances is nearly zero, the computation portion of the chip can be made very efficient. Indeed, the majority of the power is used to convert digital signals to light, then to read the output and convert it back at the other end. A way to think about it is if you connected two switch ports using QSFP28 lasers, and the cable connecting the switches was somehow performing calculations, then the biggest power cost would be the QSFP28 lasers.
These systems observe phase shift through interference. Effectively, by observing how light changes as it passes through the structure, one can read out the results of the calculation.
There are a number of ways to operate phase shifters. Thermal phase shifters are usually slow, in the kHz range. P/N junctions are large but commonly used in high-end optics. A nano optical electro mechanical system (NOEMS) uses a small amount of charge to move the waveguide. This is what Lightmatter uses. It provides low loss, and the static power requirements are nearly zero. Capacitance is very small. NOEMS work at speeds in the hundreds of MHz, and the Mars chip operates at 1GHz.
Lightmatter uses directional couplers, which means that one gets a 2×2 matrix multiplied by a two-element vector. Effectively, Lightmatter is using more than a simple MZI (Mach-Zehnder interferometer).
These small units are built into larger arrays. The slide showed just a small setup, but arrays can be built with thousands of MZIs. The company said that it believes these are being manufactured in a reliable manner.
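To make the "2×2 matrix times a vector" idea concrete, here is a minimal sketch of a generic textbook MZI: two 50:50 directional couplers with a phase shifter between them and another on the input. This is an illustrative model only, not Lightmatter's actual device; the function names and the choice of coupler matrix are assumptions made for this example.

```python
import cmath
import math

def matmul2(a, b):
    """Multiply two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def coupler():
    """Ideal 50:50 directional coupler (assumed lossless)."""
    s = 1 / math.sqrt(2)
    return [[s, 1j * s], [1j * s, s]]

def phase(theta):
    """Phase shifter acting on the top waveguide only."""
    return [[cmath.exp(1j * theta), 0], [0, 1]]

def mzi(theta, phi):
    """Textbook MZI: input phase phi, coupler, internal phase theta, coupler."""
    return matmul2(coupler(), matmul2(phase(theta), matmul2(coupler(), phase(phi))))

def apply(m, v):
    """Apply a 2x2 matrix to a two-element vector of optical amplitudes."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def power(v):
    """Optical power at each output is the squared magnitude of the amplitude."""
    return [abs(x) ** 2 for x in v]

if __name__ == "__main__":
    light_in = [1, 0]  # all light enters the top waveguide
    for theta in (0.0, math.pi / 2, math.pi):
        print(theta, power(apply(mzi(theta, 0.0), light_in)))
```

In this model, changing the internal phase steers light between the two outputs while the total power stays constant, which is the sense in which the phase shifter settings act as the matrix entries.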
DACs on one side of the square array encode data as light. The light is then detected and converted back to digital by ADCs on the other side. With 64 DACs and 64 ADCs, one gets 64 × 64 = 4096 MAC operations. The power scales with the square root of the area because most of the power goes to the conversion on either side. This is different from classic chips.
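The scaling argument can be checked with a little arithmetic. The sketch below assumes a square n × n array with n input DACs and n output ADCs and ignores the weight DACs; the function name and returned fields are made up for illustration.

```python
def array_stats(n):
    """Toy model of an n x n photonic MAC array with converters on two edges."""
    macs = n * n          # one multiply-accumulate per array element
    converters = 2 * n    # n input DACs plus n output ADCs (assumption)
    return {
        "macs": macs,
        "converters": converters,
        "converters_per_mac": converters / macs,
    }

if __name__ == "__main__":
    for n in (64, 128, 256):
        print(n, array_stats(n))
```

Doubling the edge length quadruples the MAC count but only doubles the converter count, so the conversion cost per MAC falls as the array grows.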
The design can multiplex into different wavelengths of light which increases the ability to perform more computation per cycle.
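One way to picture the wavelength multiplexing is as several input vectors, each carried on its own wavelength, passing through the same weight matrix in a single pass. The sketch below is only a numerical analogy under that assumption; the names `matvec` and `wdm_pass` and the wavelength labels are hypothetical.

```python
def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def wdm_pass(matrix, vectors_by_wavelength):
    """Apply one shared matrix to every wavelength's vector in a single pass."""
    return {wl: matvec(matrix, v) for wl, v in vectors_by_wavelength.items()}

if __name__ == "__main__":
    weights = [[1, 2], [3, 4]]
    inputs = {"lambda_1": [1, 0], "lambda_2": [0, 1]}
    print(wdm_pass(weights, inputs))
```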
The Lightmatter Mars Photonics Core operates at 1GHz which is mostly driven by how fast it can modulate the NOEMS charge.
Not everything makes sense to do with light. Photonics is the core, and this is the SoC around it. It is small but has 30MB of SRAM for the cache. That is not enough memory to run huge models, but it is enough for smaller models. The company said during the presentation that it is looking at larger memory. It said 4-stack HBM3 will not provide the bandwidth it needs.
The SoC holds weights next to the weight DACs to minimize data movement.
The photonics array acts as the ALU. Some other tasks do not go to the Photonics MAC array and are handled by the digital side instead.
Large batch sizes mean less data conversion. As a result, the chip becomes more efficient.
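A toy model shows why batching helps. It assumes the weights are converted once per batch (n × n DAC conversions) and that each input vector then costs n DAC plus n ADC conversions. These assumptions and the function name are illustrative, not figures from the presentation.

```python
def conversions_per_mac(n, batch):
    """Data conversions per MAC for an n x n array processing `batch` vectors."""
    weight_conversions = n * n       # weights loaded once per batch (assumption)
    io_conversions = batch * 2 * n   # n DAC + n ADC conversions per input vector
    macs = batch * n * n
    return (weight_conversions + io_conversions) / macs

if __name__ == "__main__":
    for batch in (1, 8, 64):
        print(batch, conversions_per_mac(64, batch))
```

Under these assumptions, the one-time weight conversion is amortized across the batch, so the conversion count per MAC drops quickly as the batch size grows.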
Most of the power is used by the digital side. The 3W TDP includes the digital as well as laser power. This is an all-in 3W TDP.
Using 3D integration, there is less than 1mm of routing. That saves even more power because data does not have to traverse a huge distance to travel between chips.
Lightmatter is building hooks to integrate its chips into popular AI software frameworks.
This is a picture of the Mars development board.
Right now, the Mars chips are back and in the lab. Let us be clear: if they work, and they have a path to scale, then Lightmatter is going to get purchased for a lot of money.
Even though this was the last presentation of Hot Chips 32, it was perhaps the most profound. While other companies were making alternatives to NVIDIA GPUs for AI chips, Lightmatter is making something uniquely different.
That’s incredible. I especially like this quote:
“A way to think about it is if you connected two switch ports using QSFP28 lasers, and the cable connecting the switches was somehow performing calculations, then the biggest power cost would be the QSFP28 lasers.”
I have to re-read this bit again: “and the cable connecting the switches was somehow performing calculations”. Incredible.
The slide on parallel processing is equally impressive. “Single instruction multiple data” type parallel processing, but, the “instruction” is basically a series of prisms and mirrors? Such that simultaneous differing wavelengths are mangled by the instructions in the way intended, allowing it to operate on all of these wavelengths at the same time. And somehow perform computations with this. Wow.
Must be a weird sight … having the big heatsinks placed on the ADC/DACs instead of on the brain.
Fantastic! It’s like something out of Sci-Fi stories of AI brains based on light, but actually, in many ways, it is Neural nets coming full circle. The first Perceptrons were analog Neural Nets, and remarkably capable for their time:
Hmm.. The energy and performance numbers presented were worse than what can be done with conventional digital electronics. Why is this interesting at all? Why are people spending time on this?
People said the same thing about the first SSDs and look where we are now. You can’t just look at the raw numbers of an immature technology compared to a very mature technology and conclude the entire venture is worthless. I think the idea here could be interesting and it might pan out to be a disruptor in the long term. You have to look at it like https://thinkingscifi.files.wordpress.com/2016/12/s-curve.jpg
It’s kind of silly seeing an optical strand coupled on the side of the package when they should be integrated beside it. Might still be cheaper to do and align rather than going full PCB integrated, though years ago when I was with Compaq/HP, TI was working on a coupling that worked that way.
A year later they have a 4U server and some benchmarks comparing to the NVidia DGX-A100: https://lightmatter.co/products/envise/