Logo Zephyrnet

Discovering novel algorithms with AlphaTensor

Data:

First extension of AlphaZero to mathematics unlocks new possibilities for research

Algorithms have helped mathematicians perform fundamental operations for thousands of years. The ancient Egyptians created an algorithm to multiply two numbers without requiring a multiplication table, and Greek mathematician Euclid described an algorithm to compute the greatest common divisor, which is still in use today. 

During the Islamic Golden Age, Persian mathematician Muhammad ibn Musa al-Khwarizmi designed new algorithms to solve linear and quadratic equations. In fact, al-Khwarizmi’s name, translated into Latin as Algoritmi, led to the term algorithm. But, despite the familiarity with algorithms today – used throughout society from classroom algebra to cutting edge scientific research – the process of discovering new algorithms is incredibly difficult, and an example of the amazing reasoning abilities of the human mind. 

În nostru hârtie, published today in Nature, introducem AlphaTensor, the first artificial intelligence (AI) system for discovering novel, efficient, and provably correct algorithms for fundamental tasks such as matrix multiplication. This sheds light on a 50-year-old open question in mathematics about finding the fastest way to multiply two matrices.

This paper is a stepping stone in DeepMind’s mission to advance science and unlock the most fundamental problems using AI. Our system, AlphaTensor, builds upon AlphaZero, an agent that has shown superhuman performance on board games, like chess, Go and shogi, and this work shows the journey of AlphaZero from playing games to tackling unsolved mathematical problems for the first time. 

Înmulțirea matricelor

Matrix multiplication is one of the simplest operations in algebra, commonly taught in high school maths classes. But outside the classroom, this humble mathematical operation has enormous influence in the contemporary digital world and is ubiquitous in modern computing. 

Example of the process of multiplying two 3×3 matrices.

This operation is used for processing images on smartphones, recognising speech commands, generating graphics for computer games, running simulations to predict the weather, compressing data and videos for sharing on the internet, and so much more. Companies around the world spend large amounts of time and money developing computing hardware to efficiently multiply matrices. So, even minor improvements to the efficiency of matrix multiplication can have a widespread impact.

For centuries, mathematicians believed that the standard matrix multiplication algorithm was the best one could achieve in terms of efficiency. But in 1969, German mathematician Volken Strassen shocked the mathematical community by showing that better algorithms do exist.

Standard algorithm compared to Strassen’s algorithm, which uses one less scalar multiplication (7 instead of 8) for multiplying 2×2 matrices. Multiplications matter much more than additions for overall efficiency.

Through studying very small matrices (size 2×2), he discovered an ingenious way of combining the entries of the matrices to yield a faster algorithm. Despite decades of research following Strassen’s breakthrough, larger versions of this problem have remained unsolved – to the extent that it’s not known how efficiently it’s possible to multiply two matrices that are as small as 3×3. 

In our paper, we explored how modern AI techniques could advance the automatic discovery of new matrix multiplication algorithms. Building on the progress of human intuition, AlphaTensor discovered algorithms that are more efficient than the state of the art for many matrix sizes. Our AI-designed algorithms outperform human-designed ones, which is a major step forward in the field of algorithmic discovery. 

Procesul și progresul automatizării descoperirii algoritmice

First, we converted the problem of finding efficient algorithms for matrix multiplication into a single-player game. In this game, the board is a three-dimensional tensor (array of numbers), capturing how far from correct the current algorithm is. Through a set of allowed moves, corresponding to algorithm instructions, the player attempts to modify the tensor and zero out its entries. When the player manages to do so, this results in a provably correct matrix multiplication algorithm for any pair of matrices, and its efficiency is captured by the number of steps taken to zero out the tensor.

This game is incredibly challenging – the number of possible algorithms to consider is much greater than the number of atoms in the universe, even for small cases of matrix multiplication. Compared to the game of Go, which remained a challenge for AI for decades, the number of possible moves at each step of our game is 30 orders of magnitude larger (above 1033 for one of the settings we consider).

În esență, pentru a juca bine acest joc, trebuie să identifici cele mai mici ace dintr-un car de fân gigantic de posibilități. Pentru a face față provocărilor acestui domeniu, care se îndepărtează semnificativ de jocurile tradiționale, am dezvoltat mai multe componente cruciale, inclusiv o nouă arhitectură de rețea neuronală care încorporează părtiniri inductive specifice problemei, o procedură pentru a genera date sintetice utile și o rețetă pentru a valorifica simetriile problemă.

We then trained an AlphaTensor agent using reinforcement learning to play the game, starting without any knowledge about existing matrix multiplication algorithms. Through learning, AlphaTensor gradually improves over time, re-discovering historical fast matrix multiplication algorithms such as Strassen’s, eventually surpassing the realm of human intuition and discovering algorithms faster than previously known.

Single-player game played by AlphaTensor, where the goal is to find a correct matrix multiplication algorithm. The state of the game is a cubic array of numbers (shown as grey for 0, blue for 1, and green for -1), representing the remaining work to be done.

For example, if the traditional algorithm taught in school multiplies a 4×5 by 5×5 matrix using 100 multiplications, and this number was reduced to 80 with human ingenuity, AlphaTensor has found algorithms that do the same operation using just 76 multiplications. 

Algorithm discovered by AlphaTensor using 76 multiplications, an improvement over state-of-the-art algorithms.

Beyond this example, AlphaTensor’s algorithm improves on Strassen’s two-level algorithm in a finite field for the first time since its discovery 50 years ago. These algorithms for multiplying small matrices can be used as primitives to multiply much larger matrices of arbitrary size. 

Moreover, AlphaTensor also discovers a diverse set of algorithms with state-of-the-art complexity – up to thousands of matrix multiplication algorithms for each size, showing that the space of matrix multiplication algorithms is richer than previously thought. 

Algorithms in this rich space have different mathematical and practical properties. Leveraging this diversity, we adapted AlphaTensor to specifically find algorithms that are fast on a given hardware, such as Nvidia V100 GPU, and Google TPU v2. These algorithms multiply large matrices 10-20% faster than the commonly used algorithms on the same hardware, which showcases AlphaTensor’s flexibility in optimising arbitrary objectives.

AlphaTensor with an objective corresponding to the runtime of the algorithm. When a correct matrix multiplication algorithm is discovered, it’s benchmarked on the target hardware, which is then fed back to AlphaTensor, in order to learn more efficient algorithms on the target hardware.

Explorarea impactului asupra cercetărilor și aplicațiilor viitoare

From a mathematical standpoint, our results can guide further research in complexity theory, which aims to determine the fastest algorithms for solving computational problems. By exploring the space of possible algorithms in a more effective way than previous approaches, AlphaTensor helps advance our understanding of the richness of matrix multiplication algorithms. Understanding this space may unlock new results for helping determine the asymptotic complexity of matrix multiplication, one of the most fundamental open problems in computer science

Deoarece multiplicarea matricei este o componentă de bază în multe sarcini de calcul, care cuprinde grafica pe computer, comunicațiile digitale, formarea rețelelor neuronale și calculul științific, algoritmii descoperiți de AlphaTensor ar putea face calculele în aceste domenii semnificativ mai eficiente. Flexibilitatea AlphaTensor de a lua în considerare orice fel de obiectiv ar putea, de asemenea, să stimuleze noi aplicații pentru proiectarea algoritmilor care optimizează metrici, cum ar fi utilizarea energiei și stabilitatea numerică, ajutând la prevenirea erorilor mici de rotunjire din bulgăre de zăpadă pe măsură ce algoritmul funcționează.

While we focused here on the particular problem of matrix multiplication, we hope that our paper will inspire others in using AI to guide algorithmic discovery for other fundamental computational tasks. Our research also shows that AlphaZero is a powerful algorithm that can be extended well beyond the domain of traditional games to help solve open problems in mathematics. Building upon our research, we hope to spur on a greater body of work – applying AI to help society solve some of the most important challenges in mathematics and across the sciences.

spot_img

Ultimele informații

spot_img

Chat cu noi

Bună! Cu ce ​​​​vă pot ajuta?