AlphaDev

Developer(s)	DeepMind
Type	Reinforcement learning

AlphaDev is an artificial intelligence system developed by Google DeepMind to discover enhanced computer science algorithms using reinforcement learning. AlphaDev is based on AlphaZero, a system that mastered the games of chess, shogi and Go through self-play. AlphaDev applies the same approach to finding faster algorithms for fundamental tasks such as sorting and hashing.^[1]^[2]^[3]

Development

On June 7, 2023, Google DeepMind published a paper in Nature introducing AlphaDev, which discovered new algorithms that outperformed the state-of-the-art methods for small sort algorithms.^[1] For example, AlphaDev found a faster assembly language sequence for sorting 5-element sequences.^[4] In-depth analysis of the discovered algorithms revealed two unique sequences of assembly instructions, called the AlphaDev swap and copy moves, each of which avoids a single assembly instruction every time it is applied.^[1]^[3] For variable sort algorithms, AlphaDev discovered fundamentally different algorithm structures. For example, for VarSort4 (sort up to 4 elements) AlphaDev discovered an algorithm 29 assembly instructions shorter than the human benchmark.^[1] AlphaDev also improved the speed of hashing algorithms by up to 30% in certain cases.^[2]
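One way to read the swap move, assuming a three-element sort in which an earlier comparison has already guaranteed B ≤ C, is that a three-way minimum can be replaced by a two-way minimum. The sketch below only checks that identity exhaustively on small integers. The function names are hypothetical, and this is an illustration of the idea rather than AlphaDev's assembly.

```python
from itertools import product


def min3(a, b, c):
    """Three-way minimum, as a general step would compute it."""
    return min(a, min(b, c))


def min2_given_sorted_tail(a, b, c):
    """Two-way minimum; only valid under the precondition b <= c."""
    return min(a, b)


def identity_holds(values):
    """Check min(a, b, c) == min(a, b) for every triple with b <= c."""
    return all(
        min3(a, b, c) == min2_given_sorted_tail(a, b, c)
        for a, b, c in product(values, repeat=3)
        if b <= c
    )


print(identity_holds(range(-3, 4)))
```

The shortcut is not valid without the precondition: for the triple (5, 3, 2) the two functions disagree, which is why the saving depends on where the step sits in the surrounding program.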

In January 2022, Google DeepMind submitted its new sorting algorithms to the organization that manages C++, one of the most popular programming languages in the world, and after independent vetting, AlphaDev's algorithms were added to the LLVM libc++ standard library.^[5] This was the first change to the C++ Standard Library sorting algorithms in more than a decade and the first update to involve an algorithm discovered using AI.^[5] In January 2023, DeepMind also added its hashing algorithm for inputs from 9 to 16 bytes to Abseil, an open-source collection of prewritten C++ algorithms that can be used by anyone coding in C++.^[6]^[5] Google estimates that these two algorithms are used trillions of times every day.^[7]

Design

AlphaDev is built on top of AlphaZero, the reinforcement-learning model that DeepMind trained to master games such as Go and chess.^[5] The company's breakthrough was to treat the problem of finding a faster algorithm as a game and then train its AI to win it.^[2] AlphaDev plays a single-player game in which the objective is to iteratively build an assembly language algorithm that is both fast and correct.^[1] AlphaDev uses a neural network to guide its search for optimal moves, and learns from its own experience and synthetic demonstrations.^[1]

AlphaDev showcases the potential of AI to advance the foundations of computing and optimize code for different criteria. Google DeepMind hopes that AlphaDev will inspire further research on using AI to discover new algorithms and improve existing ones.^[2]

Algorithm

The primary learning algorithm in AlphaDev is an extension of AlphaZero.

Encoding assembly programming into a game

To use AlphaZero on assembly programming, the authors created a Transformer-based vector representation of assembly programs designed to capture their underlying structure.^[1] This finite representation allows a neural network to play assembly programming like a game with finitely many possible moves (like Go).

The representation uses the following components:

A Transformer network, which encodes the assembly program: opcodes are converted to one-hot encodings and concatenated to form the raw input sequence.
A multilayer perceptron network, which encodes the "CPU state", that is, the state of each register and memory location for a given set of inputs.
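The one-hot step for the Transformer input can be sketched as follows. The opcode and register vocabularies are hypothetical toy choices, and the layout (one concatenated vector per instruction, covering the opcode and two operands) is an assumption made for illustration, not the encoding used by the authors.

```python
# Hypothetical toy vocabularies; a real instruction set is much larger.
OPCODES = ["mov", "cmp", "cmovl", "cmovg"]
REGISTERS = ["P", "Q", "R", "S"]


def one_hot(item, vocabulary):
    """Return a list with a single 1 at the index of item."""
    vec = [0] * len(vocabulary)
    vec[vocabulary.index(item)] = 1
    return vec


def encode_instruction(opcode, dst, src):
    """Concatenate the one-hot codes of the opcode and both operands."""
    return (
        one_hot(opcode, OPCODES)
        + one_hot(dst, REGISTERS)
        + one_hot(src, REGISTERS)
    )


def encode_program(program):
    """Encode a program as a sequence of per-instruction vectors."""
    return [encode_instruction(*instruction) for instruction in program]


program = [("mov", "S", "P"), ("cmp", "P", "Q"), ("cmovg", "P", "Q")]
tokens = encode_program(program)
```

Each entry of `tokens` is a fixed-width vector, so a program of any length becomes a sequence that a sequence model can consume.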

Playing the game

The game state is the assembly program generated up to a given point.

The game move is an extra instruction appended to the current assembly program.

The game's reward is a function of the assembly program's correctness and latency. To reduce cost, AlphaDev measures actual latency on fewer than 0.002% of generated programs and does not evaluate latency during the search process. Instead, it uses two functions that estimate correctness and latency, trained via supervised learning on the real measured correctness and latency values.
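The state, move and reward described above can be sketched with a toy register machine. Everything here is an assumption made for illustration: the four-instruction set, the register names, the use of program length as a stand-in for latency, and the `length_penalty` weight. AlphaDev itself uses learned estimates rather than this hand-written reward.

```python
from itertools import permutations


def run(program, regs):
    """Execute a toy program on a dict of registers and return the result."""
    regs = dict(regs)
    flag = 0  # sign of the last comparison: -1, 0 or 1
    for op, a, b in program:
        if op == "mov":
            regs[a] = regs[b]
        elif op == "cmp":
            flag = (regs[a] > regs[b]) - (regs[a] < regs[b])
        elif op == "cmovg" and flag > 0:
            regs[a] = regs[b]
        elif op == "cmovl" and flag < 0:
            regs[a] = regs[b]
    return regs


def correctness(program):
    """Fraction of two-element inputs that end up sorted in P, Q."""
    cases = list(permutations([1, 2]))
    passed = 0
    for p, q in cases:
        out = run(program, {"P": p, "Q": q, "S": 0})
        passed += [out["P"], out["Q"]] == sorted([p, q])
    return passed / len(cases)


def reward(program, length_penalty=0.01):
    """Correctness minus a length penalty used as a latency proxy."""
    return correctness(program) - length_penalty * len(program)


def step(state, move):
    """A move appends one instruction to the current program."""
    return state + (move,)


state = ()
for move in [("mov", "S", "P"), ("cmp", "P", "Q"),
             ("cmovg", "P", "Q"), ("cmovg", "Q", "S")]:
    state = step(state, move)
```

In this toy setting the empty program already sorts one of the two test inputs, so the reward only rises once the appended instructions make both cases pass.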

Results

Hashing

AlphaDev developed a hashing algorithm for inputs from 9 to 16 bytes, which was added to Abseil, an open-source collection of prewritten C++ algorithms.^[8]

LLVM standard sorting library

AlphaDev discovered new sorting algorithms, which led to up to 70% improvements in the LLVM libc++ sorting library for shorter sequences and about 1.7% improvements for sequences exceeding 250,000 elements. These improvements apply to the uint32, uint64 and float data types for ARMv8, Intel Skylake and AMD Zen 2 CPU architectures. AlphaDev's branchless conditional assembly and new swap move contributed to these performance improvements. The discovered algorithms were reverse-engineered from low-level assembly to C++, and have officially been included in the libc++ standard sorting library.^[6]
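The structure of a branchless fixed-size sort can be sketched as a sequence of compare-exchange steps that is the same for every input. This is a standard three-element sorting network written in Python for readability, not the code merged into libc++. In Python, `min` and `max` still branch internally, so only the fixed order of operations carries over from the compiled version.

```python
from itertools import permutations


def compare_exchange(values, i, j):
    """Put the smaller of values[i], values[j] at i and the larger at j."""
    lo, hi = min(values[i], values[j]), max(values[i], values[j])
    values[i], values[j] = lo, hi


def sort3(a, b, c):
    """Sort three values with a fixed sequence of three compare-exchanges."""
    v = [a, b, c]
    compare_exchange(v, 1, 2)  # now v[1] <= v[2]
    compare_exchange(v, 0, 2)  # now v[2] is the maximum
    compare_exchange(v, 0, 1)  # now v[0] <= v[1]
    return tuple(v)


results = [sort3(*p) for p in permutations((1, 2, 3))]
```

Because the same three steps run regardless of the data, there is no data-dependent control flow for a processor to mispredict once each step is compiled to conditional moves.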

Improved deserialization in protobuf

AlphaDev learned an optimized VarInt deserialization function in protobuf,^[9] outperforming the human benchmark for single-valued inputs by approximately three times in terms of speed. AlphaDev also discovered a new VarInt assignment move, which combines two operations into a single instruction to save latency.
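For context, VarInt deserialization reads a variable-length integer in which each byte carries seven payload bits, least significant group first, and the top bit signals that another byte follows. The following is a plain reference-style sketch of that decoding in Python. It is not the optimized function AlphaDev produced, and the name `decode_varint` is hypothetical.

```python
def decode_varint(data, pos=0):
    """Decode one varint from data starting at pos.

    Returns (value, next_pos). Raises ValueError on truncated or
    over-long input (more than 10 bytes).
    """
    result = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated varint")
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits are payload
        if not byte & 0x80:               # top bit clear: last byte
            return result, pos
        shift += 7
        if shift >= 70:
            raise ValueError("varint too long")


value, next_pos = decode_varint(b"\x96\x01")
```

The two bytes `0x96 0x01` decode to 150, since the first byte contributes 22 and the second contributes 1 shifted left by seven bits.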

Comparison with logical AI approach

AlphaDev's performance was compared with stochastic superoptimization,^[10] a logical AI approach. This baseline, referred to as AlphaDev-S, was run with at least the same amount of resources and wall-clock time as AlphaDev. The results showed that AlphaDev-S requires a prohibitive amount of time to optimize directly for latency, because latency needs to be computed after every mutation. AlphaDev-S therefore optimizes for a latency proxy, specifically algorithm length, and at the end of training all correct programs it has generated are searched through.
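The mutate-and-score loop behind this kind of stochastic search can be sketched on a toy problem. The example below runs a Metropolis-style search over sequences of compare-exchange pairs for three elements, with a cost that combines failed test cases and program length as the proxy. The cost weights, mutation set, `beta` and step count are all assumptions chosen for illustration and do not reproduce the AlphaDev-S setup.

```python
import math
import random
from itertools import permutations

PAIRS = [(0, 1), (0, 2), (1, 2)]
CASES = list(permutations(range(3)))


def apply(program, values):
    """Run a sequence of compare-exchange pairs on a copy of values."""
    v = list(values)
    for i, j in program:
        if v[i] > v[j]:
            v[i], v[j] = v[j], v[i]
    return v


def cost(program):
    """Heavily penalise wrong outputs, then prefer shorter programs."""
    errors = sum(apply(program, c) != sorted(c) for c in CASES)
    return 10 * errors + len(program)


def mutate(program, rng):
    """Insert, delete or replace one randomly chosen step."""
    p = list(program)
    choice = rng.choice(["insert", "delete", "replace"])
    if choice == "insert" or not p:
        p.insert(rng.randrange(len(p) + 1), rng.choice(PAIRS))
    elif choice == "delete":
        del p[rng.randrange(len(p))]
    else:
        p[rng.randrange(len(p))] = rng.choice(PAIRS)
    return p


def search(steps=2000, beta=1.0, seed=0):
    """Accept improvements always and regressions with small probability."""
    rng = random.Random(seed)
    current, best = [], []
    for _ in range(steps):
        candidate = mutate(current, rng)
        delta = cost(candidate) - cost(current)
        if delta <= 0 or rng.random() < math.exp(-beta * delta):
            current = candidate
            if cost(current) < cost(best):
                best = current
    return best


best = search()
```

Every mutation triggers a call to `cost`, which is cheap here but would be expensive if it required a real latency measurement. That is the reason a proxy such as length is attractive in this style of search.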

References

[1] Mankowitz, Daniel J.; Michi, Andrea; Zhernov, Anton; Gelmi, Marco; Selvi, Marco; Paduraru, Cosmin; Leurent, Edouard; Iqbal, Shariq; Lespiau, Jean-Baptiste; Ahern, Alex; Koppe, Thomas; Millikin, Kevin; Gaffney, Stephen; Elster, Sophie; Broshear, Jackson; Gamble, Chris; Milan, Kieran; Tung, Robert; Hwang, Minjae; Cemgil, Taylan; Barekatain, Mohammadamin; Li, Yujia; Mandhane, Amol; Hubert, Thomas; Schrittwieser, Julian; Hassabis, Demis; Kohli, Pushmeet; Riedmiller, Martin; Vinyals, Oriol; Silver, David (2023). "Faster sorting algorithms discovered using deep reinforcement learning". Nature. 618 (7964): 257–263. Bibcode:2023Natur.618..257M. doi:10.1038/s41586-023-06004-9. PMC 10247365. PMID 37286649.
[2] "AlphaDev discovers faster sorting algorithms". Blog. Google DeepMind. June 7, 2023. Archived from the original on 2023-06-20. Retrieved 2023-06-20.
[3] Tunney, Justine (2023-06-20). "Understanding DeepMind's Sorting Algorithm". justine.lol. Archived from the original on 2023-06-18. Retrieved 2023-06-20.
[4] GitHub - AlphaDev, DeepMind, 2023-06-21, retrieved 2023-06-21
[5] Heaven, Will Douglas (June 7, 2023). "Google DeepMind's game-playing AI just found another way to make code faster". MIT Technology Review. Archived from the original on 2023-06-14. Retrieved 2023-06-20.
[6] "D118029 Introduce branchless sorting functions for sort3, sort4 and sort5". reviews.llvm.org. Retrieved 2023-06-21.
[7] Sparkes, Matthew (7 June 2023). "DeepMind AI's new way to sort objects could speed up global computing". New Scientist. Retrieved 2024-06-20.
[8] "Replace absl::Hash for inputs from 9 to 16 bytes according to AlphaZero findings by Abseil Team · abseil/abseil-cpp@74eee2a". GitHub. Retrieved 2023-06-24.
[9] "VarInt protocol buffer serialization and deserialization". protobuf.dev. Retrieved 2023-06-24.
[10] Schkufza, Eric; Sharma, Rahul; Aiken, Alex (2013-03-16). "Stochastic superoptimization". ACM SIGARCH Computer Architecture News. 41 (1): 305–316. arXiv:1211.0557. doi:10.1145/2490301.2451150. ISSN 0163-5964.
