AlphaZero

Technology

A more general version of AlphaGo that learned to play games like Go, chess, and shogi from scratch through self-play, without human data.

First Mentioned

9/13/2025, 5:47:56 AM

Last Updated

9/13/2025, 5:53:34 AM

Research Retrieved

9/13/2025, 5:53:34 AM

Summary

AlphaZero is an artificial intelligence program developed by DeepMind that achieved superhuman performance in chess, shogi, and Go. It was trained solely through self-play, using 5,000 first-generation Tensor Processing Units (TPUs) to generate games and 64 second-generation TPUs to train the neural networks, with no access to external game data such as opening books or endgame tables. Within 24 hours of training, AlphaZero defeated a world-champion program in each game: Stockfish in chess, elmo in shogi, and the three-day version of AlphaGo Zero in Go. The algorithm was introduced in a preprint released on December 5, 2017, and the full paper was published in the journal *Science* on December 7, 2018. AlphaZero is often cited as an example of a hybrid approach that combines learned neural networks with a tree search over the known rules of the game. While the program itself has not been publicly released, it paved the way for successors such as MuZero, which generalizes AlphaZero's approach to Atari and board games without being given their rules.
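The self-play idea described in the summary can be illustrated on a toy problem. The sketch below is not AlphaZero's implementation: it swaps the neural network for a lookup table and the tree search for a one-step lookahead. The game (a 10-stone subtraction game where taking the last stone wins), the function names, and the parameters are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict

START_STONES = 10   # hypothetical toy game: remove 1-3 stones, last stone wins
MOVES = (1, 2, 3)


def legal_moves(stones):
    """Moves that do not remove more stones than remain."""
    return [m for m in MOVES if m <= stones]


def choose_move(stones, values, rng, epsilon):
    """Epsilon-greedy move choice using the current value table."""
    moves = legal_moves(stones)
    if rng.random() < epsilon:
        return rng.choice(moves)  # explore

    # values[s] estimates the outcome for the player to move at s, so the
    # mover prefers the successor position with the lowest value. An empty
    # pile means the player to move has already lost (-1).
    def successor_value(move):
        left = stones - move
        return -1.0 if left == 0 else values[left]

    return min(moves, key=successor_value)


def self_play_game(values, rng, epsilon=0.2):
    """Play one game against itself; return (stones, outcome) per position."""
    visited = []
    stones = START_STONES
    while stones > 0:
        visited.append(stones)
        stones -= choose_move(stones, values, rng, epsilon)

    # The player who moved last took the final stone and wins (+1).
    # Walking backwards through the game, the outcome flips sign each ply.
    examples = []
    outcome = 1.0
    for position in reversed(visited):
        examples.append((position, outcome))
        outcome = -outcome
    examples.reverse()
    return examples


def train(num_games=2000, seed=0):
    """Generate self-play games and keep a running mean outcome per position."""
    rng = random.Random(seed)
    values = defaultdict(float)
    counts = defaultdict(int)
    for _ in range(num_games):
        for stones, outcome in self_play_game(values, rng):
            counts[stones] += 1
            values[stones] += (outcome - values[stones]) / counts[stones]
    return values


if __name__ == "__main__":
    learned = train()
    for stones in sorted(learned):
        print(stones, round(learned[stones], 2))
```

The loop has the same shape as the training described above: the current estimate drives move selection, finished games are labeled with their final outcome, and those labels update the estimate used in the next game. No outside game data enters at any point.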

Referenced in 1 Document

Google DeepMind CEO Demis Hassabis on AI, Creativity, and a Golden Age of Science | All-In Summit

Research Data

Extracted Attributes

Type
Artificial Intelligence program
Developer
DeepMind
Model Type
Hybrid Model (combining Neural Networks and Tree Search)
Training Data
None (trained from scratch, no opening books or endgame tables)
Algorithm Type
Generic reinforcement learning algorithm
Games Mastered
Chess, Shogi, Go
Inception Date
2017-01-01
Training Method
Self-play
Training Hardware
5,000 first-generation TPUs (for game generation), 64 second-generation TPUs (for neural network training)
Execution Hardware
Single machine with four TPUs
Performance Metric
Superhuman level of play
Publication Status
Paper published in *Science* (2018); program not publicly released

Timeline

Inception of AlphaZero. (Source: Wikidata)
2017-01-01
DeepMind team released a preprint introducing AlphaZero. (Source: Summary, DBPedia, Web Search)
2017-12-05
Within 24 hours of training, AlphaZero achieved a superhuman level of play in chess, shogi, and Go, defeating world-champion programs. (Source: Summary, DBPedia, Web Search)
2017-12-05
After four hours of training, AlphaZero was estimated to be playing chess at a higher Elo rating than Stockfish 8. (Source: DBPedia, Web Search)
2017-12-05
After nine hours of training, AlphaZero defeated Stockfish 8 in a 100-game tournament (28 wins, 0 losses, 72 draws). (Source: DBPedia, Web Search)
2017-12-05
DeepMind's paper on AlphaZero was published in the journal *Science*. (Source: Summary, DBPedia, Web Search)
2018-12-07
DeepMind published a new paper detailing MuZero, an algorithm able to generalize AlphaZero's work. (Source: DBPedia)
2019
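
The nine-hour match result in the timeline (28 wins, 0 losses, 72 draws) can be converted into a match score and an implied rating gap. This is a rough sketch using the standard logistic Elo formula, which is an assumption here and not necessarily how DeepMind produced its own rating estimates. The function names are hypothetical.

```python
import math


def match_score(wins: int, draws: int, losses: int) -> float:
    """Fraction of available points scored, counting a draw as half a point."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def implied_elo_gap(score: float) -> float:
    """Rating difference that would produce this expected score.

    Inverts the logistic Elo model:
        expected_score = 1 / (1 + 10 ** (-gap / 400))
    """
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


score = match_score(wins=28, draws=72, losses=0)
gap = implied_elo_gap(score)

if __name__ == "__main__":
    print(f"score = {score:.2f}, implied gap = {gap:.0f} Elo")
```

Under this model the match score is 0.64, which corresponds to a gap of roughly 100 Elo points. A single 100-game match gives only a coarse estimate, so the figure should be read as indicative.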

Web Search Results

AlphaZero - Wikipedia
AlphaZero is a computer program developed by artificial intelligence research company DeepMind to master the games of chess, shogi and go. This algorithm uses an approach similar to AlphaGo Zero. [...] DeepMind stated in its preprint, "The game of chess represented the pinnacle of AI research over several decades. State-of-the-art programs are based on powerful engines that search many millions of positions, leveraging handcrafted domain expertise and sophisticated domain adaptations. AlphaZero is a generic reinforcement learning algorithm – originally devised for the game of go – that achieved superior results within a few hours, searching a thousand times fewer positions, given no domain [...] On December 5, 2017, the DeepMind team released a preprint paper introducing AlphaZero, which would soon play three games by defeating world-champion chess engines Stockfish, Elmo, and the three-day version of AlphaGo Zero. In each case it made use of custom tensor processing units (TPUs) that the Google programs were optimized to use. AlphaZero was trained solely via self-play using 5,000
AlphaZero Chess: How It Works, What Sets It Apart, and ... - Medium
In short, AlphaZero is a game-playing program that, through a combination of self-play and neural network reinforcement learning (more on that later), is able to learn to play games such as chess and Go from scratch ─ that is, after being fed nothing more than the rules of said games. In fact, a newer derivative of AlphaZero, called MuZero, isn’t limited to only board games such as chess, but can also learn to play a range of simple video games from the Atari collection. Both AlphaZero and [...] To those of you who have an interest in chess ─ or who have been monitoring recent developments in artificial intelligence ─ the name “AlphaZero” will be instantly recognisable; its victory over the then-leading chess engine in the world, Stockfish, had revolutionised the way that chess is played by both computers and, indeed, humans. [...] So, in the last section, we saw how chess engines had been, at their core, practically the same for decades, with improvements, significant though they were, being largely in degree ─ not kind. That all changed in 2018 when DeepMind unveiled the inner workings of AlphaZero, which had previously shocked the chess world with an impressive showing against the strongest engine at the time. Soon after the 2018 paper, an open-source AlphaZero clone project called LeelaChessZero was initiated, and the
AlphaZero - Chess Engines
AlphaZero was developed by the artificial intelligence and research company DeepMind, which was acquired by Google. It is a computer program that reached a virtually unthinkable level of play using only reinforcement learning and self-play in order to train its neural networks. In other words, it was only given the rules of the game and then played against itself many millions of times (44 million games in the first nine hours, according to DeepMind). [...] In 2017 the chess world was shaken to its core when Stockfish (the world's strongest chess engine) was defeated in a one-sided match. It was not defeated by a human but by an unknown computer program that seemed to be otherworldly: AlphaZero. Let's learn more about this powerful chess entity. Here is what you need to know about AlphaZero: [...] AlphaZero uses its neural networks to make extremely advanced evaluations of positions, which negates the need to look at over 70 million positions per second (like Stockfish does). According to DeepMind, AlphaZero reached the benchmarks necessary to defeat Stockfish in a mere four hours. AlphaZero runs on custom hardware that some have referred to as a "Google Supercomputer," although DeepMind has since clarified that AlphaZero ran on four tensor processing units (TPUs) in its matches.
AlphaZero: Shedding new light on chess, shogi, and Go
In late 2017 we introduced AlphaZero, a single system that taught itself from scratch how to master the games of chess, shogi (Japanese chess), and Go, beating a world-champion program in each case. We were excited by the preliminary results and thrilled to see the response from members of the chess community, who saw in AlphaZero’s games a ground-breaking, highly dynamic and “unconventional” style of play that differed from any chess playing engine that came before it. [...] Today, we are delighted to introduce the full evaluation of AlphaZero, published in the journal Science, that confirms and updates those preliminary results. It describes how AlphaZero quickly learns each game to become the strongest player in history for each, despite starting its training from random play, with no in-built domain knowledge but the basic rules of the game. [...] As with Go, we are excited about AlphaZero’s creative response to chess, which has been a grand challenge for artificial intelligence since the dawn of the computing age with early pioneers including Babbage, Turing, Shannon, and von Neumann all trying their hand at designing chess programs. But AlphaZero is about more than chess, shogi or Go. To create intelligent systems capable of solving a wide range of real-world problems we need them to be flexible and generalise to new situations. While
How the Artificial Intelligence Program AlphaZero Mastered Its Games
The program, called AlphaZero, descends from AlphaGo, an A.I. that became known for defeating Lee Sedol, the world’s best Go player, in March of 2016. Sedol’s defeat was a stunning upset. In “AlphaGo,” a documentary released earlier this year on Netflix, the filmmakers follow both the team that developed the A.I. and its human opponents, who have devoted their lives to the game. We watch as these humans experience the stages of a new kind of grief. At first, they don’t see how they can lose to [...] generalized to any two-person, zero-sum game of perfect information (that is, a game in which there are no hidden elements, such as face-down cards in poker). DeepMind dropped the “Go” from the name and christened its new system AlphaZero. At its core was an algorithm so powerful that you could give it the rules of humanity’s richest and most studied games and, later that day, it would become the best player there has ever been. Perhaps more surprising, this iteration of the system was also by [...] exercises a different kind of intelligence than the one we care about most. Played in this way, chess might be more like earth-moving than we thought: an activity that, in the end, isn’t our forté, and so shouldn’t be all that dear to our souls. To learn, AlphaZero needs to play millions more games than a human does— but, when it’s done, it plays like a genius. It relies on churning faster than a person ever could through a deep search tree, then uses a neural network to process what it finds
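
The figures quoted in the snippets above allow a back-of-envelope look at scale. The sketch below assumes that the 44 million games in nine hours and the 5,000 game-generating TPUs refer to the same chess training run, and that "a thousand times fewer positions" applies to the 70 million positions per second quoted for Stockfish. Neither assumption is confirmed by the sources, and the constant and function names are hypothetical.

```python
# Figures quoted in the search snippets and attributes (not independently verified).
GAMES_PLAYED = 44_000_000                 # self-play games in the first nine hours
TRAINING_HOURS = 9
SELF_PLAY_TPUS = 5_000                    # first-generation TPUs generating games
STOCKFISH_POSITIONS_PER_SEC = 70_000_000
SEARCH_REDUCTION_FACTOR = 1_000           # "a thousand times fewer positions"


def games_per_second(games: int, hours: float) -> float:
    """Average self-play throughput over the whole training period."""
    return games / (hours * 3600)


def per_device(rate: float, devices: int) -> float:
    """Average share of the throughput handled by one device."""
    return rate / devices


total_rate = games_per_second(GAMES_PLAYED, TRAINING_HOURS)
rate_per_tpu = per_device(total_rate, SELF_PLAY_TPUS)
implied_positions_per_sec = STOCKFISH_POSITIONS_PER_SEC / SEARCH_REDUCTION_FACTOR

if __name__ == "__main__":
    print(f"{total_rate:.0f} games/s overall")
    print(f"{rate_per_tpu:.2f} games/s per TPU")
    print(f"{implied_positions_per_sec:.0f} positions/s implied for AlphaZero")
```

Under these assumptions the run averaged about 1,358 self-play games per second, or roughly 0.27 games per second per TPU, and the implied search rate is on the order of 70,000 positions per second.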

Wikidata

Instance Of
Q40056
Inception Date
1/1/2017

DBPedia

AlphaZero is a computer program developed by artificial intelligence research company DeepMind to master the games of chess, shogi and go. This algorithm uses an approach similar to AlphaGo Zero. On December 5, 2017, the DeepMind team released a preprint introducing AlphaZero, which within 24 hours of training achieved a superhuman level of play in these three games by defeating world-champion programs Stockfish, elmo, and the three-day version of AlphaGo Zero. In each case it made use of custom tensor processing units (TPUs) that the Google programs were optimized to use. AlphaZero was trained solely via self-play using 5,000 first-generation TPUs to generate the games and 64 second-generation TPUs to train the neural networks, all in parallel, with no access to opening books or endgame tables. After four hours of training, DeepMind estimated AlphaZero was playing chess at a higher Elo rating than Stockfish 8; after nine hours of training, the algorithm defeated Stockfish 8 in a time-controlled 100-game tournament (28 wins, 0 losses, and 72 draws). The trained algorithm played on a single machine with four TPUs. DeepMind's paper on AlphaZero was published in the journal Science on 7 December 2018. However, the AlphaZero program itself has not been made available to the public. In 2019 DeepMind published a new paper detailing MuZero, a new algorithm able to generalise AlphaZero's work, playing both Atari and board games without knowledge of the rules or representations of the game.
