Llama 3
A powerful open-source model released by Meta, considered a strong competitor to models like GPT-4.
First Mentioned
10/12/2025, 6:49:23 AM
Last Updated
10/12/2025, 6:50:46 AM
Research Retrieved
10/12/2025, 6:50:46 AM
Summary
Llama 3 is the latest generation of large language models (LLMs) developed by Meta AI, succeeding Llama 2. Released on April 18, 2024, it is available in various parameter sizes, including 8B, 70B, and later 405B with Llama 3.1, and is designed with a decoder-only transformer architecture. Unlike earlier Llama versions, Llama 3 permits some commercial use and is offered in both pre-trained and instruction fine-tuned variants. Meta has integrated Llama 3 into its virtual assistant features within Facebook, WhatsApp, and a standalone Meta AI website, significantly enhancing their capabilities. It is recognized as a leading open-source competitor in the AI landscape, with its 70B model demonstrating performance comparable to or exceeding models like Gemini Pro 1.5 and Claude 3 Sonnet on various benchmarks.
Referenced in 1 Document
Research Data
Extracted Attributes
Type
Large Language Model (LLM)
License
Permits some commercial use
Developer
Meta AI
Core Values
Openness, inclusivity, helpfulness
Architecture
Decoder-only transformer
Capabilities
Multilinguality, coding, reasoning, tool usage
Training Data
Over 15 trillion tokens from publicly available sources
Parameter Sizes
8B, 70B (initial release); 405B (Llama 3.1); up to 2 trillion (Llama family)
Intended Use Cases
Commercial and research use in English; assistant-like chat (instruction-tuned), natural language generation (pre-trained)
Tokenizer Vocabulary
128K tokens
Code in Training Data
4 times more code than Llama 2's dataset
Performance (General)
Comparable quality to leading language models such as GPT-4
Training Data Size Comparison
7 times larger than Llama 2's dataset
Performance (70B model, April 2024)
Beats Gemini Pro 1.5 and Claude 3 Sonnet on most benchmarks
Timeline
- 2023-02: First Llama model released by Meta AI. (Source: Wikipedia)
- 2024-04-18: Meta released Llama 3 with 8B and 70B parameters, along with instruction fine-tuned versions. Virtual assistant features powered by Llama 3 were added to Facebook and WhatsApp in select regions, and a standalone website was launched. (Source: Wikipedia, Web Search)
- 2024-07-23: Llama 3.1 was released with three sizes: 8B, 70B, and 405B parameters. (Source: Web Search)
- 2024-07-31: The paper "The Llama 3 Herd of Models" was submitted to arXiv. (Source: Web Search (arXiv))
- 2024-11-23: The paper "The Llama 3 Herd of Models" was last revised on arXiv. (Source: Web Search (arXiv))
- 2025-04: Llama 4, the next version in the Llama family, was released. (Source: Wikipedia)
Wikipedia
Llama (language model)
Llama (Large Language Model Meta AI) is a family of large language models (LLMs) released by Meta AI starting in February 2023. The latest version is Llama 4, released in April 2025. Llama models come in different sizes, ranging from 1 billion to 2 trillion parameters. Initially only a foundation model, starting with Llama 2, Meta AI released instruction fine-tuned versions alongside foundation models. Model weights for the first version of Llama were only available to researchers on a case-by-case basis, under a non-commercial license. Unauthorized copies of the first model were shared via BitTorrent. Subsequent versions of Llama were made accessible outside academia and released under licenses that permitted some commercial use. Alongside the release of Llama 3, Meta added virtual assistant features to Facebook and WhatsApp in select regions, and a standalone website. Both services use a Llama 3 model.
Web Search Results
- Llama 3: Meta's New AI Model - GeeksforGeeks
Llama 3 is Meta's latest and most powerful large language model (LLM), trained on massive amounts of text. Llama 3 excels at understanding language, making Meta AI (Meta's virtual assistant) in Facebook, Messenger, and other platforms much smarter: expect faster responses, better conversations, and creative text features like summaries or different writing styles. This is a big leap in AI development.
- [2407.21783] The Llama 3 Herd of Models - arXiv
> Abstract: Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. Submitted on 31 Jul 2024 (v1), last revised 23 Nov 2024 (this version, v3).
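The abstract's headline figure of 405B parameters can be sanity-checked with back-of-envelope arithmetic. The sketch below assumes configuration values reported for the largest Llama 3 model (126 layers, hidden size 16384, 128 query heads with 8 KV heads, SwiGLU feed-forward width 53248, ~128K-token vocabulary); treat it as illustrative rather than an official breakdown.

```python
# Rough parameter count of a dense decoder-only transformer with
# grouped-query attention (GQA) and a SwiGLU feed-forward block,
# using configuration values reported for the 405B Llama 3 model.
layers, d_model, n_q, n_kv, d_ffn, vocab = 126, 16384, 128, 8, 53248, 128_256

head_dim = d_model // n_q            # width of each attention head
kv_dim = n_kv * head_dim             # shared K/V projection width under GQA
attn = 2 * d_model * d_model + 2 * d_model * kv_dim   # Wq, Wo + Wk, Wv
ffn = 3 * d_model * d_ffn            # gate, up, down projections (SwiGLU)
total = layers * (attn + ffn) + 2 * vocab * d_model   # + input/output embeddings

print(f"{total / 1e9:.0f}B parameters")  # lands close to the reported 405B
```

Small bias, normalization, and rotary-embedding terms are ignored, which is why the estimate only approximates the published figure.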
- Introducing Meta Llama 3: The most capable openly available LLM ...
In line with our design philosophy, we opted for a relatively standard decoder-only transformer architecture in Llama 3. Compared to Llama 2, we made several key improvements. Llama 3 uses a tokenizer with a vocabulary of 128K tokens that encodes language much more efficiently, which leads to substantially improved model performance. To improve the inference efficiency of Llama 3 models, we’ve adopted grouped query attention (GQA) across both the 8B and 70B sizes. We trained the models on [...] Our new 8B and 70B parameter Llama 3 models are a major leap over Llama 2 and establish a new state-of-the-art for LLM models at those scales. Thanks to improvements in pretraining and post-training, our pretrained and instruction-fine-tuned models are the best models existing today at the 8B and 70B parameter scale. Improvements in our post-training procedures substantially reduced false refusal rates, improved alignment, and increased diversity in model responses. We also saw greatly improved [...] To train the best language model, the curation of a large, high-quality training dataset is paramount. In line with our design principles, we invested heavily in pretraining data. Llama 3 is pretrained on over 15T tokens that were all collected from publicly available sources. Our training dataset is seven times larger than that used for Llama 2, and it includes four times more code. To prepare for upcoming multilingual use cases, over 5% of the Llama 3 pretraining dataset consists of
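The grouped query attention (GQA) mentioned in the excerpt above can be sketched in a few lines: several query heads share one key/value head, shrinking the KV projections (and the KV cache at inference time). The dimensions below are toy values for illustration, not Llama 3's real configuration.

```python
import numpy as np

def grouped_query_attention(x, wq, wk, wv, n_q_heads, n_kv_heads):
    """Minimal GQA sketch: n_q_heads query heads share n_kv_heads KV heads."""
    seq, d_model = x.shape
    head_dim = d_model // n_q_heads
    group = n_q_heads // n_kv_heads  # query heads per shared KV head

    q = (x @ wq).reshape(seq, n_q_heads, head_dim)
    k = (x @ wk).reshape(seq, n_kv_heads, head_dim)
    v = (x @ wv).reshape(seq, n_kv_heads, head_dim)

    # Broadcast each KV head across its group of query heads.
    k = np.repeat(k, group, axis=1)
    v = np.repeat(v, group, axis=1)

    # Scaled dot-product attention per head, then softmax over keys.
    scores = np.einsum("qhd,khd->hqk", q, k) / np.sqrt(head_dim)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    out = np.einsum("hqk,khd->qhd", weights, v)
    return out.reshape(seq, d_model)

rng = np.random.default_rng(0)
d_model, n_q, n_kv, seq = 64, 8, 2, 5
x = rng.standard_normal((seq, d_model))
wq = rng.standard_normal((d_model, d_model))
wk = rng.standard_normal((d_model, (d_model // n_q) * n_kv))
wv = rng.standard_normal((d_model, (d_model // n_q) * n_kv))
y = grouped_query_attention(x, wq, wk, wv, n_q, n_kv)
print(y.shape)  # (5, 64)
```

With 8 query heads and 2 KV heads, the K and V projection matrices are a quarter the size of their multi-head-attention counterparts, which is the inference-efficiency win the blog post describes.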
- meta-llama/Meta-Llama-3-8B - Hugging Face
The core values of Llama 3 are openness, inclusivity and helpfulness. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 3 addresses users and their needs as they are, without inserting unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. [...] Variations Llama 3 comes in two sizes — 8B and 70B parameters — in pre-trained and instruction tuned variants. Input Models input text only. Output Models generate text and code only. Model Architecture Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. [...] Intended Use Cases Llama 3 is intended for commercial and research use in English. Instruction tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. Out-of-scope Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3 Community License. Use in languages other than English.
- Llama (language model) - Wikipedia
On April 18, 2024, Meta released Llama 3 with two sizes: 8B and 70B parameters. The models have been pre-trained on approximately 15 trillion tokens of text gathered from "publicly available sources" with the instruct models fine-tuned on "publicly available instruction datasets, as well as over 10M human-annotated examples". Meta AI's testing showed in April 2024 that Llama 3 70B was beating Gemini Pro 1.5 and Claude 3 Sonnet on most benchmarks. [...] During an interview with Dwarkesh Patel, Mark Zuckerberg said that the 8B version of Llama 3 was nearly as powerful as the largest Llama 2. Compared to previous models, Zuckerberg stated the team was surprised that the 70B model was still learning even at the end of the 15T tokens training. The decision was made to end training to focus GPU power elsewhere. Llama 3.1 was released on July 23, 2024, with three sizes: 8B, 70B, and 405B parameters.