Rich Sutton

Person

A pioneering computer scientist in reinforcement learning and the author of the influential 2019 essay 'The Bitter Lesson'.

First Mentioned

7/12/2025, 4:40:58 AM

Last Updated

7/12/2025, 5:01:27 AM

Research Retrieved

7/12/2025, 5:01:27 AM

Summary

Richard Stuart Sutton, born in 1957 or 1958, is a Canadian computer scientist and a prominent figure in computational reinforcement learning. He is a professor of computing science at the University of Alberta, a fellow and Chief Scientific Advisor at the Alberta Machine Intelligence Institute, and a research scientist at Keen Technologies. Sutton is recognized as a founder of modern computational reinforcement learning, with significant contributions including temporal difference learning and policy gradient methods. His essay "The Bitter Lesson" argues that general methods that scale with computation, such as search and learning, ultimately outperform approaches built on hand-encoded human knowledge. The principle has been applied to discussions of AI development, such as the success of Elon Musk's xAI and the strategy behind Tesla's FSD.

Referenced in 1 Document

Document 5a2e1075...

Research Data

Extracted Attributes

Born
1957 or 1958
Award
Outstanding Achievement in Research award from the University of Massachusetts Amherst (2013)
Full Name
Richard Stuart Sutton
Fellowship
Association for the Advancement of Artificial Intelligence (AAAI) Fellow
Occupation
Computer Scientist
Nationality
Canadian
Notable Work
"The Bitter Lesson" essay
Current Position
Research Scientist at Keen Technologies
Key Contribution
Founder of modern computational reinforcement learning
Notable Contributions
Temporal difference learning, Policy gradient methods

Timeline

Born. (Source: Summary)
1957 or 1958
Developed Klopf's ideas further, particularly links to animal learning theories, describing learning rules driven by changes in temporally successive predictions. (Source: Web Search)
1978
Co-developed a psychological model of classical conditioning based on temporal-difference learning with Andrew G. Barto. (Source: Web Search)
1981
Co-developed a psychological model of classical conditioning based on temporal-difference learning with Andrew G. Barto. (Source: Web Search)
1982
Co-developed a method for using temporal-difference learning in trial-and-error learning (actor–critic architecture) and applied it to Michie and Chambers’s pole-balancing problem with Andrew G. Barto and Charles W. Anderson. (Source: Web Search)
1983
His Ph.D. dissertation extensively studied the actor-critic method. (Source: Web Search)
1984
Separated temporal-difference learning from control, treating it as a general prediction method, and introduced the TD(λ) algorithm. (Source: Web Search)
1988
Became a fellow of the Association for the Advancement of Artificial Intelligence (AAAI). (Source: Web Search)
2001
Received the President's Award from the International Neural Network Society. (Source: Web Search)
2003
Received the Outstanding Achievement in Research award from the University of Massachusetts Amherst. (Source: Web Search)
2013
The second edition of his book "Reinforcement Learning: An Introduction" (co-authored with Andrew G. Barto) was published. (Source: Web Search)
2018
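
As a rough illustration of the idea behind the 1978 and 1988 timeline entries (learning driven by changes in temporally successive predictions, and TD(λ) as a general prediction method), the following is a minimal Python sketch of tabular TD(λ) prediction with accumulating eligibility traces. The random-walk environment, the function name td_lambda_random_walk, and all parameter values are assumptions chosen for illustration; they are not taken from Sutton's papers.

```python
import random


def td_lambda_random_walk(episodes=5000, n_states=5, alpha=0.02,
                          gamma=1.0, lam=0.8, seed=0):
    """Tabular TD(lambda) prediction with accumulating eligibility traces.

    Illustrative environment (an assumption, not from the source text):
    a random walk over states 0..n_states-1 that starts in the middle and
    steps left or right with equal probability. Stepping off the right end
    yields reward 1, stepping off the left end yields reward 0, and either
    one ends the episode.
    """
    rng = random.Random(seed)
    values = [0.5] * n_states  # initial value estimate for every state

    for _ in range(episodes):
        traces = [0.0] * n_states  # eligibility traces reset each episode
        state = n_states // 2

        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= n_states:
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False

            # TD error: the change between temporally successive predictions.
            delta = reward + gamma * next_value - values[state]

            # Decay all traces, then mark the current state as just visited.
            traces = [gamma * lam * e for e in traces]
            traces[state] += 1.0

            # Move every state's estimate in proportion to its trace.
            values = [v + alpha * delta * e for v, e in zip(values, traces)]

            if done:
                break
            state = next_state

    return values


if __name__ == "__main__":
    for i, v in enumerate(td_lambda_random_walk()):
        print(f"state {i}: estimated value {v:.3f}")
```

In this toy setting the true value of state i is (i + 1) / (n_states + 1), so the estimates can be checked against known answers. Setting lam to 0 reduces the update to one-step TD, while values closer to 1 spread each TD error further back over recently visited states.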

Wikipedia

Richard S. Sutton

Richard Stuart Sutton (born 1957 or 1958) is a Canadian computer scientist. He is a professor of computing science at the University of Alberta, fellow & Chief Scientific Advisor at the Alberta Machine Intelligence Institute, and a research scientist at Keen Technologies. Sutton is considered one of the founders of modern computational reinforcement learning, having made several significant contributions to the field, including temporal difference learning and policy gradient methods.

Web Search Results

Richard S. Sutton - Wikipedia
Richard Stuart Sutton (born 1957 or 1958) is a Canadian computer scientist. He is a professor of computing science at the University of Alberta, fellow & Chief Scientific Advisor at the Alberta Machine Intelligence Institute, and a research scientist at Keen Technologies. Sutton is considered one of the founders of modern computational reinforcement learning, having made several significant contributions to the field, including temporal difference learning and policy gradient methods. [...] Sutton has been a fellow of the Association for the Advancement of Artificial Intelligence (AAAI) since 2001; his nomination read: "For significant contributions to many topics in machine learning, including reinforcement learning, temporal difference techniques, and neural networks." In 2003, he received the President's Award from the International Neural Network Society and, in 2013, the Outstanding Achievement in Research award from the University of Massachusetts Amherst. In 2025, he received [...] Canadian computer scientist (born 1957/58)
Rich Sutton's Home Page
Richard S. Sutton, Research Scientist, Keen Technologies, Professor, Department of Computing Science, University of Alberta, Principal Investigator,
The man who taught AI to learn believes human-level intelligence is ...
RLHF allows AI models to refine their responses based on user interactions, making them more conversational and aligned with human expectations. Despite these advancements, Sutton believes reinforcement learning has yet to be fully utilized. “It’s still early,” he said. “AI systems today mostly rely on pre-processed data, not real-world interactions. That needs to change if we want AI that truly understands and adapts.” [...] For example, an AI assistant might be able to generate a response to a single question well but struggle with maintaining a logical conversation over multiple interactions or planning a complex task that unfolds over time, like booking a vacation that involves coordinating flights, hotels and activities. Sutton believes that reinforcement learning and better long-term reasoning algorithms will be key to overcoming this limitation. [...] The analogy raises profound questions. If AI becomes more autonomous, how will society integrate these digital beings? Will they have rights? Should they be given independence? Sutton suggests that the way we approach AI’s development now will define how these future relationships unfold. “If we raise AI in an environment of trust and cooperation, they will learn to exist alongside us. If we treat them as adversaries, we risk creating systems that have every reason to resist us,” he said.
[PDF] Reinforcement Learning: An Introduction - Stanford University
explicitly into his classifier systems. A key step was taken by Sutton in 1988 by separating temporal-difference learning from control, treating it as a general prediction method. That paper also introduced the TD(λ) algorithm and proved some of its convergence properties. [...] Sutton (1978a, 1978b, 1978c) developed Klopf’s ideas further, particularly the links to animal learning theories, describing learning rules driven by changes in temporally successive predictions. He and Barto refined these ideas and developed a psychological model of classical conditioning based on temporal-difference learning (Sutton and Barto, 1981a; Barto and Sutton, 1982). There followed several other influential psychological models of classical conditioning based on temporal-difference [...] At this time we developed a method for using temporal-difference learning in trial-and-error learning, known as the actor–critic architecture, and applied this method to Michie and Chambers’s pole-balancing problem (Barto, Sutton, and Anderson, 1983). This method was extensively studied in Sutton’s (1984) Ph.D. dissertation and extended to use backpropagation neural networks in Anderson’s (1986) Ph.D. dissertation. Around this time, Holland (1986) incorporated temporal-difference ideas
Barto Book: Reinforcement Learning: An Introduction - Sutton
Reinforcement Learning: An Introduction, by Richard S. Sutton and Andrew G. Barto. Second Edition (see here for the first edition). MIT Press, Cambridge, MA, 2018 [...] Page resources: Errata and Notes; Full Pdf Without Margins; Code; Solutions (send in your solutions for a chapter, get the official ones back; currently incomplete); Slides and Other Teaching Aids; Links to pdfs of the literature sources cited in the book (Many thanks to Daniel Plop!); Latex Notation (to use the book's notation in your own work, download the .sty file and the example of its use).
