Diffusion Models
A class of generative models used in AI for creating images and video. Sam Altman notes that OpenAI's best image and video models, like Sora, are diffusion models.
First Mentioned
10/12/2025, 6:49:24 AM
Last Updated
10/12/2025, 6:53:39 AM
Research Retrieved
10/12/2025, 6:53:39 AM
Summary
Diffusion models, also known as diffusion-based generative models or score-based generative models, represent a class of latent variable generative models in machine learning. They operate by learning a diffusion process to generate new data elements that are statistically similar to a given dataset. The core mechanism involves a forward diffusion process, where Gaussian noise is gradually added to data, and a reverse sampling process, where a neural network (often a U-net or transformer) is trained to denoise the data. Primarily applied in computer vision for tasks such as image generation, denoising, inpainting, and super-resolution, diffusion models have also found utility in natural language processing, sound generation, and reinforcement learning. Commercial successes like Stable Diffusion and DALL-E leverage these models, often combining them with text encoders for text-conditioned generation.
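The forward process described in the summary has a simple closed form: after t noising steps, x_t is a weighted mix of the original datum x_0 and fresh Gaussian noise. A minimal NumPy sketch using conventional DDPM notation (the linear noise schedule and all names below are illustrative assumptions, not taken from the source):

```python
import numpy as np

# Illustrative DDPM-style forward diffusion (closed form). The linear
# noise schedule below is a common default, assumed here for illustration.
T = 1000
betas = np.linspace(1e-4, 0.02, T)        # per-step noise variances
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)           # cumulative signal retention

def forward_diffuse(x0, t, rng):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(alpha_bar_t) x_0, (1 - alpha_bar_t) I)."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return xt, eps                        # eps is the denoiser's training target

rng = np.random.default_rng(0)
x0 = rng.standard_normal(64)              # stand-in for a flattened image
x_mid, _ = forward_diffuse(x0, 500, rng)
x_end, _ = forward_diffuse(x0, T - 1, rng)
# By the final step alpha_bar_T is near zero, so x_T is almost pure Gaussian noise.
```

During training, the backbone (typically a U-net or transformer) receives x_t and t and is optimized to predict eps; generation then applies the learned denoiser iteratively, starting from pure noise.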
Referenced in 1 Document
Research Data
Extracted Attributes
Category
Generative AI models
Mechanism
Gradually adds Gaussian noise in forward process, learns to remove noise in reverse process
Also known as
Diffusion-based generative models, Score-based generative models
Core Components
Forward diffusion process, Reverse sampling process
Training Method
Variational inference
NLP Applications
Text generation, Summarization
Other Application Fields
Natural Language Processing, Sound generation, Reinforcement learning
Primary Application Field
Computer Vision
Computer Vision Applications
Image denoising, Inpainting, Super-resolution, Image generation, Video generation
Typical Neural Network Backbone
U-nets, Transformers
Underlying Concept (loosely based on)
Non-equilibrium thermodynamics
Advantages (compared to traditional generative models)
Better image quality, interpretable latent space, robustness to overfitting
Timeline
- 2015: Diffusion models were introduced as a method to train a model for sampling from complex probability distributions, utilizing techniques from non-equilibrium thermodynamics. (Source: Wikipedia, Web Search)
- 2024: As of this year, diffusion models are mainly used for computer vision tasks. (Source: Wikipedia)
Wikipedia
Diffusion model
In machine learning, diffusion models, also known as diffusion-based generative models or score-based generative models, are a class of latent variable generative models. A diffusion model consists of two major components: the forward diffusion process and the reverse sampling process. The goal of diffusion models is to learn a diffusion process for a given dataset, such that the process can generate new elements that are distributed similarly to the original dataset. A diffusion model models data as generated by a diffusion process, whereby a new datum performs a random walk with drift through the space of all possible data. A trained diffusion model can be sampled in many ways, with different efficiency and quality.

There are various equivalent formalisms, including Markov chains, denoising diffusion probabilistic models, noise conditioned score networks, and stochastic differential equations. They are typically trained using variational inference. The model responsible for denoising is typically called the "backbone". The backbone may be of any kind, but is typically a U-net or transformer.

As of 2024, diffusion models are mainly used for computer vision tasks, including image denoising, inpainting, super-resolution, image generation, and video generation. These typically involve training a neural network to sequentially denoise images blurred with Gaussian noise. The model is trained to reverse the process of adding noise to an image. After training to convergence, it can be used for image generation by starting with an image composed of random noise and applying the network iteratively to denoise it.

Diffusion-based image generators have seen widespread commercial interest, such as Stable Diffusion and DALL-E. These models typically combine diffusion models with other components, such as text encoders and cross-attention modules, to allow text-conditioned generation.
Other than computer vision, diffusion models have also found applications in natural language processing such as text generation and summarization, sound generation, and reinforcement learning.
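The iterative denoising procedure described above (start from random noise, apply the network repeatedly) can be sketched as standard DDPM ancestral sampling. Everything below is conventional DDPM, not code from the source; the backbone is an untrained placeholder so the loop structure runs end to end:

```python
import numpy as np

# Illustrative DDPM ancestral sampling. A real backbone would be a trained
# U-net or transformer; `predict_noise` is an untrained stand-in here.
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def predict_noise(xt, t):
    # Placeholder for the trained denoising network eps_theta(x_t, t).
    return np.zeros_like(xt)

def sample(shape, rng):
    x = rng.standard_normal(shape)        # start from pure Gaussian noise
    for t in reversed(range(T)):
        eps = predict_noise(x, t)
        # Posterior mean of x_{t-1} given x_t and the predicted noise
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:                         # add noise at every step except the last
            x += np.sqrt(betas[t]) * rng.standard_normal(shape)
    return x

img = sample((64,), np.random.default_rng(1))
```

With a trained backbone in place of the placeholder, the same loop transforms random noise into a sample from the data distribution; faster samplers (e.g. fewer, larger steps) trade quality for efficiency, as the excerpt notes.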
Web Search Results
- A Very Short Introduction to Diffusion Models | by Kailash Ahirwar
### What are Diffusion Models? Diffusion models are a class of generative AI models that generate high-resolution images of varying quality. They work by gradually adding Gaussian noise to the original data in the forward diffusion process and then learning to remove the noise in the reverse diffusion process. They are latent variable models referring to a hidden continuous feature space, look similar to VAEs (Variational Autoencoders), and are loosely based on non-equilibrium thermodynamics.
- Diffusion model - Wikipedia
In machine learning, diffusion models, also known as diffusion-based generative models or score-based generative models, are a class of latent variable generative models. A diffusion model consists of two major components: the forward diffusion process, and the reverse sampling process. The goal of diffusion models is to learn a diffusion process for a given dataset, such that the process can generate new elements that are distributed similarly to the original dataset. A diffusion model models [...] As of 2024, diffusion models are mainly used for computer vision tasks, including image denoising, inpainting, super-resolution, image generation, and video generation. These typically involve training a neural network to sequentially denoise images blurred with Gaussian noise. The model is trained to reverse the process of adding noise to an image. After training to convergence, it can be used for image generation by starting with an image composed of random noise, and applying the network [...] Non-equilibrium thermodynamics: Diffusion models were introduced in 2015 as a method to train a model that can sample from a highly complex probability distribution. They used techniques from non-equilibrium thermodynamics, especially diffusion.
- An Introduction to Diffusion Models for Machine Learning - Encord
What are diffusion models? Diffusion models are generative models used for data synthesis. They generate data by applying a sequence of transformations to random noise, producing realistic samples that resemble the training data distribution. [...] Diffusion models are generative models that simulate how data is made by using a series of invertible operations to change a simple starting distribution into the desired complex distribution. Compared to traditional generative models, diffusion models have better image quality, interpretable latent space, and robustness to overfitting. [...] Diffusion models are a promising approach for text-to-video synthesis. The process involves first representing the textual descriptions and video data in a suitable format, such as word embeddings or transformer-based language models for text and video frames in a sequence format.
- Introduction to Diffusion Models for Machine Learning - AssemblyAI
Diffusion Models are generative models, meaning that they are used to generate data similar to the data on which they are trained. Fundamentally, Diffusion Models work by destroying training data through the successive addition of Gaussian noise, and then learning to recover the data by reversing this noising process. After training, we can use the Diffusion Model to generate data by simply passing randomly sampled noise through the learned denoising process. [...] More specifically, a Diffusion Model is a latent variable model which maps to the latent space using a fixed Markov chain. This chain gradually adds noise to the data in order to obtain the approximate posterior q(x_{1:T} | x_0), where x_1, ..., x_T are the latent variables with the same dimensionality as x_0. [...] As mentioned above, a Diffusion Model consists of a forward process (or diffusion process), in which a datum (generally an image) is progressively noised, and a reverse process (or reverse diffusion process), in which noise is transformed back into a sample from the target distribution.
- What are Diffusion Models? | IBM
Diffusion models are among the neural network architectures at the forefront of generative AI, most notably represented by popular text-to-image models including Stability AI’s Stable Diffusion, OpenAI’s DALL-E (beginning with DALL-E-2), Midjourney and Google’s Imagen. They improve upon the performance and stability of other machine learning architectures used for image synthesis such as variational autoencoders (VAEs), generative adversarial networks (GANs) and autoregressive models such as [...] Diffusion models are generative models used primarily for image generation and other computer vision tasks. Diffusion-based neural networks are trained through deep learning to progressively “diffuse” samples with random noise, then reverse that diffusion process to generate high-quality images.