NPUs

Technology

Neural Processing Units, specialized hardware for efficiently running AI models locally on devices like PCs.


First Mentioned

1/22/2026, 4:20:10 AM

Last Updated

1/22/2026, 4:26:30 AM

Research Retrieved

1/22/2026, 4:26:30 AM

Summary

Neural Processing Units (NPUs) are specialized microprocessors designed to accelerate artificial intelligence and machine learning tasks by mimicking the processing functions of the human brain. Unlike general-purpose CPUs or GPUs, NPUs are optimized for neural network computations, specifically matrix multiplication and tensor math, offering superior parallel processing and power efficiency for low-precision arithmetic. Microsoft CEO Satya Nadella emphasizes NPUs as a critical component of a "Hybrid AI" strategy, enabling local models to run on Windows PCs to support autonomous agents and "infinite minds" for knowledge workers. This technology is central to the evolution of the modern tech stack, with major hardware vendors like Intel, AMD, Apple, and Qualcomm integrating NPUs into consumer electronics to enhance tasks such as image recognition, natural language processing, and on-device security.
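The low-precision matrix arithmetic described above can be made concrete with a short sketch. The example below is a hypothetical pure-Python illustration (no vendor API is used): it quantizes two float matrices to INT8 with symmetric scales, multiplies them as integers, and dequantizes the result, which is essentially the fast path an NPU accelerates.

```python
# Sketch of symmetric INT8 quantization, the kind of low-precision
# arithmetic NPUs accelerate. Values are scaled into [-127, 127],
# multiplied as integers, and the result is rescaled to floating point.

def quantize(matrix, scale):
    """Map floats to INT8 by dividing by a scale factor and rounding."""
    return [[max(-127, min(127, round(v / scale))) for v in row] for row in matrix]

def matmul(a, b):
    """Plain matrix multiply; works for the integer and float paths alike."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def int8_matmul(a, b, scale_a, scale_b):
    """Quantize, multiply in INT8, then dequantize the integer accumulators."""
    qa, qb = quantize(a, scale_a), quantize(b, scale_b)
    raw = matmul(qa, qb)
    return [[v * scale_a * scale_b for v in row] for row in raw]

a = [[0.5, -1.0], [2.0, 0.25]]
b = [[1.0, 0.0], [0.5, -0.5]]
approx = int8_matmul(a, b, scale_a=2.0 / 127, scale_b=1.0 / 127)
exact = matmul(a, b)  # full-precision float path for comparison
```

The quantization error is small relative to the scale of the inputs, which is why low-precision hardware works well for inference while remaining unsuitable for workloads that need exact arithmetic.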

Referenced in 1 Document

Research Data

Extracted Attributes
  • Full Name

    Neural Processing Unit

  • Optimization

    Low-precision matrix multiplication (e.g., FP16, INT8)

  • Key Operations

    Scalar, vector, and tensor math; convolution; dot products

  • Platform Support

    Windows, Android (LiteRT), iOS/macOS (CoreML)

  • Primary Function

    Accelerating artificial intelligence and machine learning applications

  • Alternative Names

    AI accelerator, deep learning processor

  • Core Applications

    Image recognition, natural language processing, facial recognition, threat detection

Timeline
  • Microsoft CEO Satya Nadella discusses the role of NPUs in Hybrid AI and local model execution at the Davos fireside chat. (Source: Document 4e50eb82-56c2-4d20-910f-9a43912c1cd7)

    2024-01-15

  • Devices like Microsoft's Copilot+ PC begin integrating NPUs as a standard component for AI-driven digital workplaces. (Source: Atos Blog)

    2024-05-20

Web Search Results
  • Neural processing unit - Wikipedia

    A neural processing unit (NPU), also known as AI accelerator or deep learning processor, is a class of specialized hardware accelerator or computer system designed to accelerate artificial intelligence (AI) and machine learning applications, including artificial neural networks and computer vision. [...]

    Although NPUs are tailored for low-precision (e.g. FP16, INT8) matrix multiplication operations, they can be used to emulate higher-precision matrix multiplications in scientific computing. As modern GPUs place much focus on making the NPU part fast, using emulated FP64 (Ozaki scheme) on NPUs can potentially outperform native FP64: this has been demonstrated using FP16-emulated FP64 on the NVIDIA TITAN RTX and using INT8-emulated FP64 on NVIDIA consumer GPUs and the A100 GPU. (Consumer GPUs benefit especially from this scheme, as they have little native FP64 hardware capacity, showing a 6× speedup.) Since CUDA Toolkit 13.0 Update 2, cuBLAS automatically uses INT8-emulated FP64 matrix multiplication of equivalent precision when it is faster than native. This is in addition to the [...]

    An operating system or a higher-level library may provide application programming interfaces such as TensorFlow Lite with LiteRT Next (Android) or CoreML (iOS, macOS). Formats such as ONNX are used to represent trained neural networks. Consumer CPU-integrated NPUs are accessible through vendor-specific APIs. AMD (Ryzen AI), Intel (OpenVINO), Apple silicon (CoreML), and Qualcomm (SNPE) each have their own APIs, which can be built upon by a higher-level library. GPUs generally use existing GPGPU pipelines such as CUDA and OpenCL adapted for lower precisions and specialized matrix-multiplication operations. Vulkan is also being used. Custom-built systems such as the Google TPU use private interfaces.
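The "emulated FP64" idea mentioned in the excerpt can be sketched in miniature. Production schemes such as Ozaki splitting slice each operand into several low-precision pieces and accumulate the partial products at higher precision; the two-term FP32 split below is a simplified illustration of that principle, not the actual cuBLAS algorithm.

```python
import struct

def to_f32(x):
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack('f', struct.pack('f', x))[0]

def split(x):
    """Split an FP64 value into two FP32-representable terms, hi + lo ~= x.

    A two-term split recovers most (not all) of the FP64 mantissa;
    real emulation schemes use more terms for full precision.
    """
    hi = to_f32(x)
    lo = to_f32(x - hi)
    return hi, lo

def emulated_product(x, y):
    """Approximate an FP64 product from FP32 partial products."""
    xh, xl = split(x)
    yh, yl = split(y)
    # Each partial product of two 24-bit FP32 mantissas fits exactly in
    # FP64, so summing them recovers far more precision than a single
    # FP32 multiply. On hardware, each term maps to one low-precision op.
    return xh * yh + xh * yl + xl * yh + xl * yl
```

The payoff is that four cheap low-precision multiplies plus a high-precision accumulation can approach FP64 accuracy, which is why the scheme can beat native FP64 on hardware whose FP64 units are scarce.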

  • What is a Neural Processing Unit (NPU)? - IBM

    A neural processing unit (NPU) is a specialized computer microprocessor designed to mimic the processing function of the human brain. They are optimized for artificial intelligence (AI) neural networks, deep learning and machine learning tasks and applications. Differing from general-purpose central processing units (CPUs) or graphics processing units (GPUs), NPUs are tailored to accelerate AI tasks and workloads, such as calculating neural network layers composed of scalar, vector and tensor math. [...]

    Based on the neural networks of the brain, NPUs work by simulating the behavior of human neurons and synapses at the circuit layer. This allows for the processing of deep learning instruction sets in which one instruction completes the processing of a set of virtual neurons. Unlike traditional processors, NPUs are not built for precise computations. Instead, NPUs are purpose-built for problem-solving functions and can improve over time, learning from different types of data and inputs. Taking advantage of machine learning, AI systems incorporating NPUs can provide customized solutions faster, without the need for more manual programming. [...]

    As a standout feature, NPUs offer superior parallel processing and are able to accelerate AI operations through simplified high-capacity cores that are freed from performing multiple types of tasks. An NPU includes specific modules for multiplication and addition, activation functions, 2D data operations and decompression. The specialized multiplication and addition module performs operations central to neural network processing, such as matrix multiplication and addition, convolution, dot products and other functions.
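The multiply-accumulate, dot-product, and convolution operations the excerpt describes reduce to a few lines of deliberately naive Python; a real NPU runs thousands of these MAC operations in parallel in dedicated hardware. As is conventional in ML frameworks, the kernel below is applied without flipping (strictly, cross-correlation).

```python
def dot(a, b):
    """Dot product: the fundamental multiply-accumulate (MAC) loop."""
    acc = 0.0
    for x, y in zip(a, b):
        acc += x * y  # one MAC per element; NPUs run many in parallel
    return acc

def conv2d(image, kernel):
    """Valid-mode 2D convolution: a sliding window of dot products."""
    kh, kw = len(kernel), len(kernel[0])
    flat_k = [k for krow in kernel for k in krow]
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            window = [image[i + di][j + dj]
                      for di in range(kh) for dj in range(kw)]
            row.append(dot(window, flat_k))
        out.append(row)
    return out
```

For example, `conv2d([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[1, 0], [0, 1]])` slides a 2×2 kernel over a 3×3 image, producing a 2×2 output where each entry is one dot product of a window with the kernel.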

  • NPUs: Fueling the future of AI in the Digital Workplace - Atos

    Soon, your work laptop will not just have a powerful CPU and GPU; it will also come with an NPU. A Neural Processing Unit (NPU) is an embedded "AI brain" that can transform the way you work. NPUs are specialized processors designed to accelerate machine learning and AI tasks on your device. Devices have been fitted with NPUs since 2024, and Microsoft's Copilot+ PC is a well-known example. These specialized processors are designed to accelerate AI tasks, enhance security, and ensure data sovereignty, making them a must-have for your next work laptop.

    The ABCs of NPUs: An NPU is a dedicated AI co-processor. It differs from a CPU (general-purpose processor) or GPU (graphics processor) in that it is built specifically for neural network operations and parallel data processing. [...]

    Security and convenience: NPUs enable fast, secure user authentication and threat detection. Logging in with facial recognition or even voice recognition becomes quicker and more accurate. Security software can use on-device AI to detect unusual activity like potential malware behavior in real time, adding an extra layer of defence that runs continuously without slowing down your work. All this boosts enterprise security in a transparent manner.

  • What Is a Neural Processing Unit (NPU)? - Pure Storage

    A neural processing unit is a specialized piece of hardware designed with a focus on accelerating neural network computations. Thanks to their design, NPUs drastically enhance the speed and efficiency of AI systems. Don't mistake NPUs for an upgraded piece of familiar tech: NPUs are a huge leap forward for AI/ML processing. Optimized for running the algorithms that make AI and ML possible, NPUs are particularly efficient at tasks like image recognition and natural language processing, which require fast processing of massive amounts of multimedia data. [...]

    NPUs are specially designed to process machine learning algorithms. While GPUs are very good at processing parallel data, NPUs are purpose-built for the computations necessary to run the neural networks responsible for AI/ML processes. Machine learning algorithms are the foundation and scaffolding upon which AI applications get built, and as neural networks and machine learning computations have become increasingly complex, the need for a custom solution has emerged. [...]

    NPUs accelerate deep learning algorithms by natively executing many of the specific operations neural networks need. Rather than building a framework for running those operations, or running environments that allow for those advanced computations, NPUs are custom-built to execute AI/ML operations efficiently. Their built-in capability for high-performance computation has a drastic impact on AI performance. Matrix multiplications and convolutions are specialized tasks that AI processes depend on and NPUs excel at. Image recognition and language processing are the areas where NPUs are currently transforming the industry, boasting faster inference times and lower power consumption, which can impact an organization's bottom line.

  • What is an NPU? A Penn expert explains

    Increasingly, neural processing units (NPUs) are making their way into consumer electronics: laptops, high-end tablets, phones, and more. But what do they do, and why are they suddenly showing up? Answering this question is Benjamin C. Lee, a professor in the departments of Electrical and Systems Engineering and Computer and Information Science at the School of Engineering and Applied Science. Lee began his career as a computer architect who, he says, "thinks a lot about processors and hardware systems." He explains that while general-purpose central processing units (CPUs) are the bread and butter for processor designers, a smaller cohort works on NPUs, one that is poised to grow in the years ahead. "NPUs, that's where the frontier is and the number of design teams there is much, much smaller," he says. [...]

    What is an NPU? A neural processing unit is a piece of hardware, a chip, that's customized to do particularly well on the matrix arithmetic that AI relies on. It is intended to support inference, which means responding to a request to a trained model. Suppose you have downloaded a trained model onto your device and now you want to ask it questions or issue prompts. The model is probably small because it's sitting within your personal device, and it's going to be able to perform tasks locally without going to some remote data center across the internet. And that's the key distinction: unlike ChatGPT, where you type a prompt and it is processed elsewhere, the NPU keeps inference on the device. [...]