AI Needs Enormous Computing Power. Could Light-Based Chips Help? | Quanta Magazine

Introduction

Moore’s law is already pretty fast. It holds that computer chips pack in twice as many transistors every two years or so, producing major jumps in speed and efficiency. But the computing demands of the deep learning era are growing even faster than that — at a pace that is likely not sustainable. The International Energy Agency predicts that artificial intelligence will consume 10 times as much power in 2026 as it did in 2023, and that data centers in that year will use as much energy as Japan. “The amount of [computing power] that AI needs doubles every three months,” said Nick Harris, founder and CEO of the computing-hardware company Lightmatter — far faster than Moore’s law predicts. “It’s going to break companies and economies.”
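The gap between those two doubling rates compounds quickly. As a rough sketch, assume both figures hold as steady doubling periods: 24 months for Moore's law and three months for Harris's estimate of AI demand. The function name `growth_factor` is illustrative, and neither rate is guaranteed to stay constant.

```python
def growth_factor(months, doubling_period_months):
    """How many times over a quantity grows if it doubles every `doubling_period_months`."""
    return 2 ** (months / doubling_period_months)


# Compare growth over one two-year span (assumed steady doubling in both cases).
transistor_growth = growth_factor(24, 24)  # Moore's law: one doubling
ai_demand_growth = growth_factor(24, 3)    # Harris's figure: eight doublings

print(transistor_growth)  # 2.0
print(ai_demand_growth)   # 256.0
```

Under these assumptions, demand would grow 256-fold over a period in which transistor counts merely double.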

One of the most promising ways forward involves processing information not with trusty electrons, which have dominated computing for over 50 years, but instead using the flow of photons, minuscule packets of light. Recent results suggest that, for certain computational tasks fundamental to modern artificial intelligence, light-based “optical computers” may offer an advantage.

The development of optical computing is “paving the way for breakthroughs in fields that demand high-speed and high-efficiency processing, such as artificial intelligence,” said the University of Cambridge physicist Natalia Berloff.

Optimal Optical

In theory, light provides tantalizing potential benefits. For one, optical signals can carry more information than electrical ones — they have more bandwidth. Optical frequencies are also much higher than electrical ones, so optical systems can run more computing steps in less time and with less latency.

And then there’s the efficiency problem. Electronic chips are relatively wasteful, with environmental and economic costs to match, and they run so hot that only a tiny fraction of their transistors (the tiny switches at the heart of all computers) can be active at any moment. Optical computers could, in theory, run with more operations taking place simultaneously, churning through more data while using less energy. “If we could harness” these advantages, said Gordon Wetzstein, an electrical engineer at Stanford University, “this would open a lot of new possibilities.”

Seeing the potential advantages, researchers have long tried to use light for AI, a field with heavy computational needs. In the 1980s and 1990s, for instance, researchers used optical systems to build some of the earliest neural networks. Demetri Psaltis and two colleagues at the California Institute of Technology created a clever facial recognition system using one of these early optical neural networks (ONNs). They stored images of a subject — one of the researchers, in fact — as holograms in a photorefractive crystal. The researchers used the holograms to train an ONN, which could then recognize new images of the researcher and distinguish him from his colleagues.

But light also has shortcomings. Crucially, photons generally don’t interact with each other, so it’s hard for one input signal to control another signal, which is the essence of what ordinary transistors do. Transistors also work exceptionally well. They’re now laid down on coin-size chips by the billion, the products of decades of incremental improvements.

In recent years, however, researchers have found a killer app for optical computing: matrix multiplication.

Some Light Math

The process of multiplying matrices, or arrays of numbers, undergirds a lot of heavy-duty computing. In neural networks, specifically, matrix multiplication is a fundamental step both in how networks are trained on old data and in how new data is processed in trained networks. And light just might be a better medium for matrix multiplication than electricity.

This approach to AI computation exploded in 2017, when a group led by Dirk Englund and Marin Soljačić of the Massachusetts Institute of Technology described how to make an optical neural network built on a silicon chip. The researchers encoded the various quantities they wanted to multiply into beams of light, then sent the beams through a series of components that altered the beam’s phase — the way its light waves oscillated — with each phase alteration representing a multiplication step. By repeatedly splitting the beams, changing their phase, and recombining them, they could make the light effectively carry out matrix multiplication. At the end of the chip, the researchers placed photo detectors that measured the light beams and revealed the result.
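The split, shift and recombine idea can be illustrated with a toy model. This is a minimal sketch, assuming an idealized, lossless two-beam interferometer with 50:50 splitters and a single adjustable phase shifter. It is not the layout of the MIT chip, which the article does not detail, and all function names are hypothetical. Each beam is represented by a complex number (its amplitude and phase).

```python
import cmath
import math


def beam_splitter(a, b):
    """Ideal lossless 50:50 splitter mixing two complex field amplitudes."""
    s = 1 / math.sqrt(2)
    return s * (a + 1j * b), s * (1j * a + b)


def phase_shift(a, theta):
    """Delay one beam by a phase of theta radians."""
    return a * cmath.exp(1j * theta)


def interferometer(a, b, theta):
    """Split the beams, shift the phase of the upper arm, then recombine."""
    a, b = beam_splitter(a, b)
    a = phase_shift(a, theta)
    return beam_splitter(a, b)


def transfer_matrix(theta):
    """The 2x2 matrix the device applies, found by probing it with unit inputs."""
    col0 = interferometer(1, 0, theta)
    col1 = interferometer(0, 1, theta)
    return [[col0[0], col1[0]], [col0[1], col1[1]]]


def detect(a):
    """Photodetector: reports intensity, the squared magnitude of the field."""
    return abs(a) ** 2


if __name__ == "__main__":
    theta = math.pi / 3
    out_a, out_b = interferometer(1, 0, theta)
    # Changing theta changes how the light divides between the two outputs.
    print(detect(out_a), detect(out_b))
```

Because every element is linear, sending a pair of input beams through the device gives the same result as multiplying the input vector by `transfer_matrix(theta)`. In this toy model, light entering the upper port alone leaves with intensities sin²(θ/2) and cos²(θ/2), so adjusting the phase sets the matrix entries. Larger matrices would require meshes of many such units, which this sketch does not attempt.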

The researchers taught their experimental device to recognize spoken vowels, a common benchmark task for neural networks. With the advantages of light, it could do so faster and more efficiently than an electronic device. Other researchers had known that light had the potential to be good for matrix multiplication; the 2017 paper showed how to put it into practice.

The study “catalyzed massive, renewed interest in ONNs,” said Peter McMahon, a photonics expert at Cornell University. “That one has been super influential.”

Bright Ideas

Since that 2017 paper, the field has seen steady improvement, as various researchers have come up with new kinds of optical computers. Englund and several collaborators recently unveiled a new optical network they call HITOP, which combines multiple advances. Most importantly, it aims to scale up the computation throughput with time, space and wavelength. Zaijun Chen, a former MIT postdoc now based at the University of Southern California, said this helps HITOP overcome one of the drawbacks of optical neural networks: It takes significant energy to transfer data from electronic components into optical ones, and vice versa. But by packing the information into three dimensions of light, Chen said, it shoves more data through the ONN faster and spreads the energy cost over many calculations. This drives down the cost per calculation. The researchers reported that HITOP could run machine learning models 25,000 times larger than previous chip-based ONNs.
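The amortization argument can be made concrete with a little arithmetic. The sketch below is a simplified model, not HITOP's actual energy budget: the function name, the arbitrary energy units and the reuse counts are all assumptions chosen for illustration. It treats each electronic-to-optical conversion as a fixed cost shared by every calculation that uses the converted data.

```python
def energy_per_operation(conversion_energy, ops_per_conversion, optical_energy_per_op=0.0):
    """Average energy per calculation when one conversion is shared by many calculations.

    conversion_energy: fixed cost of moving one piece of data between the
        electronic and optical domains (arbitrary units, hypothetical).
    ops_per_conversion: how many calculations reuse that converted data.
    optical_energy_per_op: any remaining per-calculation cost in the optics.
    """
    if ops_per_conversion <= 0:
        raise ValueError("ops_per_conversion must be positive")
    return conversion_energy / ops_per_conversion + optical_energy_per_op


if __name__ == "__main__":
    # Illustrative values only: the more calculations share one conversion,
    # the smaller each calculation's share of the fixed cost.
    for reuse in (1, 10, 100, 1000):
        print(reuse, energy_per_operation(1.0, reuse))
```

In this model the conversion cost per calculation falls in proportion to how many calculations share it, which is the effect Chen describes when more data is packed into each pass through the optics.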

To be clear, the system is still far from matching its electronic predecessors; HITOP performs about 1 trillion operations per second, whereas sophisticated Nvidia chips can chug through 300 times as much data, said Chen, who hopes to scale up the technology to make it more competitive. But the optical chip’s efficiency is compelling. “The game here is that we lowered the energy cost 1,000 times,” Chen said.

Other groups have created optical computers with different advantages. Last year, a team at the University of Pennsylvania described a new kind of ONN that offers unusual flexibility. This chip-based system shines a laser onto part of the semiconductor that makes up the electronic chip, which changes the semiconductor’s optical properties. The laser effectively maps the route for the optical signal to take, and hence the calculation it performs. As a result, the researchers can easily reconfigure what the system does. That stands in stark contrast to most other chip-based systems, optical and electronic, where the route is laid down carefully in the fabrication plant and is very hard to change.

“What we have here is something incredibly simple,” said Tianwei Wu, the study’s lead author. “We can reprogram it, changing the laser patterns on the fly.” The researchers used the system to design a neural network that successfully discriminated vowel sounds. Most photonic systems need to be trained before they’re built, since training necessarily involves reconfiguring connections. But since this system is easily reconfigured, the researchers trained the model after it was installed on the semiconductor. They now plan to increase the size of the chip and encode more information in different colors of light, which should increase the amount of data it can handle.

It’s progress that even Psaltis, who built the facial recognition system in the ’90s, finds impressive. “Our wildest dreams of 40 years ago were very modest compared to what has actually transpired.”

First Rays of Light

While optical computing has advanced quickly over the past several years, it’s still far from displacing the electronic chips that run neural networks outside of labs. Papers announce photonic systems that work better than electronic ones, but they generally run small models using old network designs and small workloads. And many of the reported figures about photonic supremacy don’t tell the whole story, said Bhavin Shastri of Queen’s University in Ontario. “It’s very hard to do an apples-to-apples comparison with electronics,” he said. “For instance, when they use lasers, they don’t really talk about the energy to power the lasers.”

Lab systems need to be scaled up before they can show competitive advantages. “How big do you have to make it to get a win?” McMahon asked. The answer: exceptionally big. That’s why no one can match a chip made by Nvidia, whose chips power many of the most advanced AI systems today. There is a huge list of engineering puzzles to figure out along the way — issues that the electronics side has solved over decades. “Electronics is starting with a big advantage,” said McMahon.

Some researchers think ONN-based AI systems will first find success in specialized applications where they provide unique advantages. Shastri said one promising use is in counteracting interference between different wireless transmissions, such as 5G cellular towers and the radar altimeters that help planes navigate. Early this year, Shastri and several colleagues created an ONN that can sort out different transmissions and pick out a signal of interest in real time and with a processing delay of under 15 picoseconds (15 trillionths of a second) — less than one-thousandth of the time an electronic system would take, while using less than 1/70 of the power.

But McMahon said the grand vision — an optical neural network that can surpass electronic systems for general use — remains worth pursuing. Last year his group ran simulations showing that, within a decade, a sufficiently large optical system could make some AI models more than 1,000 times as efficient as future electronic systems. “Lots of companies are now trying hard to get a 1.5-times benefit. A thousand-times benefit, that would be amazing,” he said. “This is maybe a 10-year project — if it succeeds.”
