Rich Sutton

Person

A pioneering computer scientist in reinforcement learning and the author of the influential 2019 essay 'The Bitter Lesson'.


Created at

7/12/2025, 4:40:58 AM

Last updated

7/12/2025, 5:01:27 AM

Research retrieved

7/12/2025, 5:01:27 AM

Summary

Richard Stuart Sutton, born in 1957 or 1958, is a Canadian computer scientist and a prominent figure in computational reinforcement learning. He is a professor of computing science at the University of Alberta, a fellow and Chief Scientific Advisor at the Alberta Machine Intelligence Institute, and a research scientist at Keen Technologies. Sutton is regarded as one of the founders of modern computational reinforcement learning, with significant contributions including temporal difference learning and policy gradient methods. His essay "The Bitter Lesson" argues that general methods that scale with computation ultimately outperform approaches built on hand-encoded human knowledge, a principle that has been invoked in discussions of AI development, such as the success of Elon Musk's xAI and the strategy behind Tesla's FSD.

Referenced in 1 Document
Research Data
Extracted Attributes
  • Born

    1957 or 1958

  • Award

    Outstanding Achievement in Research award from the University of Massachusetts Amherst (2013)

  • Full Name

    Richard Stuart Sutton

  • Fellowship

    Association for the Advancement of Artificial Intelligence (AAAI) Fellow

  • Occupation

    Computer Scientist

  • Nationality

    Canadian

  • Notable Work

    "The Bitter Lesson" essay

  • Current Position

    Research Scientist at Keen Technologies

  • Key Contribution

    One of the founders of modern computational reinforcement learning

  • Notable Contributions

    Temporal difference learning, Policy gradient methods

Timeline
  • Born. (Source: Summary)

    1957 or 1958

  • Developed Klopf's ideas further, particularly links to animal learning theories, describing learning rules driven by changes in temporally successive predictions. (Source: Web Search)

    1978

  • Co-developed a psychological model of classical conditioning based on temporal-difference learning with Andrew G. Barto (Sutton and Barto, 1981a). (Source: Web Search)

    1981

  • Published further work with Andrew G. Barto on the temporal-difference model of classical conditioning (Barto and Sutton, 1982). (Source: Web Search)

    1982

  • Co-developed a method for using temporal-difference learning in trial-and-error learning (actor–critic architecture) and applied it to Michie and Chambers’s pole-balancing problem with Andrew G. Barto and Charles W. Anderson. (Source: Web Search)

    1983

  • His Ph.D. dissertation extensively studied the actor-critic method. (Source: Web Search)

    1984

  • Separated temporal-difference learning from control, treating it as a general prediction method, and introduced the TD(λ) algorithm. (Source: Web Search)

    1988

  • Became a fellow of the Association for the Advancement of Artificial Intelligence (AAAI). (Source: Web Search)

    2001

  • Received the President's Award from the International Neural Network Society. (Source: Web Search)

    2003

  • Received the Outstanding Achievement in Research award from the University of Massachusetts Amherst. (Source: Web Search)

    2013

  • The second edition of his book "Reinforcement Learning: An Introduction" (co-authored with Andrew G. Barto) was published. (Source: Web Search)

    2018

Richard S. Sutton

Richard Stuart Sutton (born 1957 or 1958) is a Canadian computer scientist. He is a professor of computing science at the University of Alberta, a fellow and Chief Scientific Advisor at the Alberta Machine Intelligence Institute, and a research scientist at Keen Technologies. Sutton is considered one of the founders of modern computational reinforcement learning, having made several significant contributions to the field, including temporal difference learning and policy gradient methods.

Web Search Results
  • Richard S. Sutton - Wikipedia

    Richard Stuart Sutton (born 1957 or 1958) is a Canadian computer scientist. He is a professor of computing science at the University of Alberta, fellow & Chief Scientific Advisor at the Alberta Machine Intelligence Institute, and a research scientist at Keen Technologies. Sutton is considered one of the founders of modern computational reinforcement learning, having several significant contributions to the field, including temporal difference learning and policy gradient methods. [...] Sutton is a fellow of the Association for the Advancement of Artificial Intelligence (AAAI) since 2001; his nomination read: "For significant contributions to many topics in machine learning, including reinforcement learning, temporal difference techniques, and neural networks." In 2003, he received the President's Award from the International Neural Network Society and in 2013, the Outstanding Achievement in Research award from the University of Massachusetts Amherst. In 2025, he received [...] Canadian computer scientist (born 1957/58)

  • Rich Sutton's Home Page

    Richard S. Sutton, Research Scientist, Keen Technologies, Professor, Department of Computing Science, University of Alberta, Principal Investigator,

  • The man who taught AI to learn believes human-level intelligence is ...

    RLHF allows AI models to refine their responses based on user interactions, making them more conversational and aligned with human expectations. Despite these advancements, Sutton believes reinforcement learning has yet to be fully utilized. “It’s still early,” he said. “AI systems today mostly rely on pre-processed data, not real-world interactions. That needs to change if we want AI that truly understands and adapts.” [...] For example, an AI assistant might be able to generate a response to a single question well but struggle with maintaining a logical conversation over multiple interactions or planning a complex task that unfolds over time, like booking a vacation that involves coordinating flights, hotels and activities. Sutton believes that reinforcement learning and better long-term reasoning algorithms will be key to overcoming this limitation. [...] The analogy raises profound questions. If AI becomes more autonomous, how will society integrate these digital beings? Will they have rights? Should they be given independence? Sutton suggests that the way we approach AI’s development now will define how these future relationships unfold. “If we raise AI in an environment of trust and cooperation, they will learn to exist alongside us. If we treat them as adversaries, we risk creating systems that have every reason to resist us,” he said.

  • [PDF] Reinforcement Learning: An Introduction - Stanford University

    explicitly into his classifier systems. A key step was taken by Sutton in 1988 by separating temporal-difference learning from control, treating it as a general prediction method. That paper also introduced the TD(λ) algorithm and proved some of its convergence properties. [...] Sutton (1978a, 1978b, 1978c) developed Klopf’s ideas further, particularly the links to animal learning theories, describing learning rules driven by changes in temporally successive predictions. He and Barto refined these ideas and developed a psychological model of classical conditioning based on temporal-difference learning (Sutton and Barto, 1981a; Barto and Sutton, 1982). There followed several other influential psychological models of classical conditioning based on temporal-difference [...] At this time we developed a method for using temporal-difference learning in trial-and-error learning, known as the actor–critic architecture, and applied this method to Michie and Chambers’s pole-balancing problem (Barto, Sutton, and Anderson, 1983). This method was extensively studied in Sutton’s (1984) Ph.D. dissertation and extended to use backpropagation neural networks in Anderson’s (1986) Ph.D. dissertation. Around this time, Holland (1986) incorporated temporal-difference ideas

  • Barto Book: Reinforcement Learning: An Introduction - Sutton

    Reinforcement Learning: An Introduction. Richard S. Sutton and Andrew G. Barto. Second Edition, MIT Press, Cambridge, MA, 2018 [...] Errata and Notes; Full Pdf Without Margins; Code; Solutions (send in your solutions for a chapter, get the official ones back; currently incomplete); Slides and Other Teaching Aids; Links to pdfs of the literature sources cited in the book (many thanks to Daniel Plop!); Latex Notation (a .sty file and an example of its use, for using the book's notation in your own work)

