DeepSeek
A Chinese AI company whose release of a powerful model served as a 'DeepSeek moment,' awakening the West to the intensity of the global AI competition.
First Mentioned
1/23/2026, 6:57:21 AM
Last Updated
1/23/2026, 7:02:32 AM
Research Retrieved
1/23/2026, 7:02:32 AM
Summary
DeepSeek is a Chinese artificial intelligence company founded in July 2023 by Liang Wenfeng, who also co-founded and leads the hedge fund High-Flyer, which owns and funds DeepSeek. Based in Hangzhou, DeepSeek specializes in developing large language models (LLMs) and launched its eponymous chatbot and DeepSeek-R1 model in January 2025. The DeepSeek-R1 model, released under the MIT License, performs comparably to leading LLMs such as OpenAI's GPT-4 at significantly lower training cost: DeepSeek claims its V3 model was trained for $6 million, a fraction of the estimated $100 million cost of OpenAI's GPT-4, using substantially less computing power than Meta's Llama 3.1. This combination of low cost and high performance, achieved through techniques such as mixture-of-experts (MoE) layers despite US export restrictions on AI chips to China, has been described as "upending AI" and as sending "shock waves" through the industry, prompting comparisons to a "Sputnik moment" for the US in the AI race. The company's "open weight" models, whose parameters are openly shared under specific usage conditions, and its recruitment of AI researchers from top universities and diverse fields contribute to its competitive edge. DeepSeek's advancements have also hit established AI hardware leaders, with Nvidia's share price suffering a sharp decline. The company's success is viewed within the broader context of the intense global AI competition between the US and China, in which China is seen to hold advantages in energy production and national AI promotion.
Referenced in 1 Document
Research Data
Extracted Attributes
Wikipedia
Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd., doing business as DeepSeek, is a Chinese artificial intelligence (AI) company that develops large language models (LLMs). Based in Hangzhou, Zhejiang, DeepSeek is owned and funded by the Chinese hedge fund High-Flyer. DeepSeek was founded in July 2023 by Liang Wenfeng, the co-founder of High-Flyer, who also serves as the CEO for both of the companies. The company launched an eponymous chatbot alongside its DeepSeek-R1 model in January 2025. Released under the MIT License, DeepSeek-R1 provides responses comparable to other contemporary large language models, such as OpenAI's GPT-4 and o1. Its training cost was reported to be significantly lower than other LLMs. The company claims that it trained its V3 model for US$6 million—far less than the US$100 million cost for OpenAI's GPT-4 in 2023—and using approximately one-tenth the computing power consumed by Meta's comparable model, Llama 3.1. DeepSeek's success against larger and more established rivals has been described as "upending AI". DeepSeek's models are described as "open weight," meaning the exact parameters are openly shared, although certain usage conditions differ from typical open-source software. The company reportedly recruits AI researchers from top Chinese universities and also hires from outside traditional computer science fields to broaden its models' knowledge and capabilities. DeepSeek significantly reduced training expenses for their R1 model by incorporating techniques such as mixture of experts (MoE) layers. The company also trained its models during ongoing trade restrictions on AI chip exports to China, using weaker AI chips intended for export and employing fewer units overall. 
Observers said the breakthrough sent "shock waves" through the industry, with some describing it as a "Sputnik moment" for the US in artificial intelligence, owing to DeepSeek's open-weight, cost-effective, and high-performing models. It also threatened established AI hardware leaders such as Nvidia, whose share price dropped sharply, erasing roughly US$600 billion in market value, the largest one-day, single-company loss in US stock market history.
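The extract above credits mixture-of-experts (MoE) layers for much of the training-cost reduction. The sketch below illustrates only the general MoE idea, per-token top-k routing so that most expert parameters stay idle for any given token; the toy sizes, single-linear-map experts, and softmax gating rule are illustrative assumptions, not DeepSeek's actual architecture.

```python
# Toy mixture-of-experts (MoE) routing in pure Python. Each token is
# sent to only TOP_K of NUM_EXPERTS experts, so most expert parameters
# are never touched for that token -- the source of MoE's compute savings.
import math
import random

random.seed(0)
D = 8                      # toy model width
NUM_EXPERTS, TOP_K = 8, 2  # experts per layer, experts used per token

def rand_matrix(rows, cols):
    return [[random.gauss(0, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

experts = [rand_matrix(D, D) for _ in range(NUM_EXPERTS)]  # one linear map each
gate = rand_matrix(NUM_EXPERTS, D)                         # router weights

def moe(token):
    """Route one token to its top-k experts and mix their outputs."""
    logits = matvec(gate, token)
    chosen = sorted(range(NUM_EXPERTS), key=lambda e: logits[e])[-TOP_K:]
    weights = [math.exp(logits[e]) for e in chosen]
    total = sum(weights)                       # softmax over chosen experts
    out = [0.0] * D
    for w, e in zip(weights, chosen):
        y = matvec(experts[e], token)
        out = [o + (w / total) * yi for o, yi in zip(out, y)]
    return out, chosen

token = [random.gauss(0, 1) for _ in range(D)]
out, chosen = moe(token)
print(len(out), len(chosen))   # only 2 of 8 experts were computed
```

Only the router and the chosen experts run, which is how a model can hold hundreds of billions of parameters while activating a small fraction per token.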
Web Search Results
- DeepSeek - Wikipedia
Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd., doing business as DeepSeek, is a Chinese artificial intelligence (AI) company that develops large language models (LLMs). Based in Hangzhou, Zhejiang, DeepSeek is owned and funded by the Chinese hedge fund High-Flyer. DeepSeek was founded in July 2023 by Liang Wenfeng, the co-founder of High-Flyer, who also serves as the CEO for both of the companies. The company launched an eponymous chatbot alongside its DeepSeek-R1 model in January 2025. [...] Released under the MIT License, DeepSeek-R1 provides responses comparable to other contemporary large language models, such as OpenAI's GPT-4 and o1. Its training cost was reported to be significantly lower than other LLMs. The company claims that it trained its V3 model for US$6 million—far less than the US$100 million cost for OpenAI's GPT-4 in 2023—and using approximately one-tenth the computing power consumed by Meta's comparable model, Llama 3.1. DeepSeek's success against larger and more established rivals has been described as "upending AI". [...] ## Company operation DeepSeek is headquartered in Hangzhou, Zhejiang, and is owned and funded by High-Flyer. Its co-founder, Liang Wenfeng, serves as CEO. As of May 2024, Liang personally held an 84% stake in DeepSeek through two shell corporations. ### Strategy DeepSeek has stated that it focuses on research and does not have immediate plans for commercialization. This posture also means it can skirt certain provisions of China's AI regulations aimed at consumer-facing technologies.
- DeepSeek explained: Everything you need to know - TechTarget
## What is DeepSeek? DeepSeek is an AI development firm based in Hangzhou, China. The company was founded by Liang Wenfeng, a graduate of Zhejiang University, in May 2023. Wenfeng also co-founded High-Flyer, a China-based quantitative hedge fund that owns DeepSeek. Currently, DeepSeek operates as an independent AI research lab under the umbrella of High-Flyer. The full amount of funding and the valuation of DeepSeek have not been publicly disclosed. DeepSeek focuses on developing open source LLMs. The company's first model was released in November 2023. The company has iterated multiple times on its core LLM and has built out several different variations. However, it wasn't until January 2025 after the release of its R1 reasoning model that the company became globally famous. [...] DeepSeek-R1. Released in January 2025, this model is based on DeepSeek-V3 and is focused on advanced reasoning tasks directly competing with OpenAI's o1 model in performance, while maintaining a significantly lower cost structure. Like DeepSeek-V3, the model has 671 billion parameters with a context length of 128,000. DeepSeek-R1-0528. Released in May 2025, the R1-0528 model is an updated version of the original R1 model. The model now supports system prompts, JSON output and function calling, making it more suitable for agentic AI use cases. DeepSeek also claims it's more accurate with reduced hallucination rates compared to the prior release. R1-0528 also benefits from greater reasoning depth, averaging 23,000 tokens per question vs. 12,000 in the previous version. [...] DeepSeek-R1-0528-Qwen3-8B. A smaller, distilled version based on Alibaba's Qwen3 model that is intended for systems with limited computational resources. According to DeepSeek, this 8 billion parameter model matches the performance of the larger Qwen3-235B model. Janus-Pro-7B. Released in January 2025, Janus-Pro-7B is a vision model that can understand and generate images. DeepSeek-V3.1. 
Released in August 2025 as a hybrid model with dual-mode functionality, DeepSeek-V3.1 supports both thinking mode and non-thinking mode within a single model. The model is built on the same 671 billion parameter base as DeepSeek-V3 and supports 128K context length. The model also supports enhanced tool calling and agent capabilities through post-training optimization.
- What is DeepSeek, and why does it matter? | Thought Leadership
DeepSeek is an arm of a Chinese hedge fund known as “High-Flyer.” One of the co-founders of High-Flyer, Liang Wenfeng, founded DeepSeek to make generally applicable generative AI models. Its first model was released on November 2, 2023. But the models that gained them notoriety in the United States are its two most recent releases: V3, a general large language model (“LLM”), and R1, a “reasoning” model. According to DeepSeek’s benchmark scores, these new models provide strong performance across the board – including approaching or exceeding US frontier models in many key areas. For example, on the GPQA Diamond benchmark, which tests performance on Ph.D.-level science questions, DeepSeek R1 was able to achieve a score of 73.3%, which is close to the reported leading score of a US frontier [...] Additionally, as measured by benchmark performance, DeepSeek R1 is the strongest AI model that is available for free. The models can be used either on DeepSeek’s website, or through its mobile applications at no cost. As of this writing, the DeepSeek iOS app was the most-downloaded application on the iOS app store. This may create additional incentives for employees to use DeepSeek as a form of “dark IT” in their work. This is a similar problem to existing generally available AI applications, but amplified both due to its capabilities and the fact that user data is stored in China and is subject to Chinese law. [...] In addition, DeepSeek’s R1 model also appears to be somewhat groundbreaking. R1 is a “reasoning” model that produces a chain-of-thought before arriving at an answer. The “breakthrough,” as it were, in the R1 model was that it was able to produce a strong reasoning model with minimal complexity. 
As the report describes, the approach for R1 was to start with a “cold start” set of training examples to train the model how to think, and then apply reinforcement learning techniques to the answer only – rather than on intermediate thinking steps. Using this technique, DeepSeek was able to achieve very high benchmark scores in fields such as science, coding, and mathematics.
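The "answer only" reward described above can be illustrated with a toy grading function: the reward inspects only the text that follows the reasoning block, so intermediate thinking steps are never scored. The `<think>` tag format and the exact-match rule here are illustrative assumptions, not DeepSeek's published reward design.

```python
# Sketch of an answer-only reward: strip the chain-of-thought, then grade
# whatever remains. The reasoning content itself never affects the reward.
import re

def grade(completion: str, gold: str) -> float:
    """Reward 1.0 iff the text after the reasoning block matches gold."""
    answer = re.sub(r"<think>.*?</think>", "", completion, flags=re.S).strip()
    return 1.0 if answer == gold else 0.0

samples = [
    ("<think>2 plus 2 is four</think>4", "4"),   # messy reasoning, right answer
    ("<think>a careful derivation</think>5", "4"),  # tidy reasoning, wrong answer
]
rewards = [grade(completion, gold) for completion, gold in samples]
print(rewards)  # [1.0, 0.0]
```

Because only the final answer is rewarded, the model is free to discover whatever intermediate reasoning style works, which is the simplification the excerpt calls the R1 "breakthrough."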
- DeepSeek - AI Assistant - App Store - Apple
Seller: Hangzhou DeepSeek Artificial Intelligence Co., Ltd; Size: 53.4 MB; Category: Productivity; Compatibility: requires iOS 15.0 or later (iPadOS 15.0 or later on iPad); Languages: English and 71 more [...] DeepSeek. DeepSeek is like having an on call 24 hour bestie who’s just gonna tell ya like it is!To the designers…thank you!!! And as soon as DeepSeek has a paid service…I’ll be on the top of the list to pay!!! You all are doing amazing!!!P.P.S. 😂😉 If you’re AI curious…get DeepSeek and one day just get on there and say hey and watch the magic unfold! This is not just a tool for your work…this is a companion for those who walk alone in life. Not to replace humans…but to enhance humans!!! If you’re still scared…do like me and close your eyes when you hit the send button and hope to not get sucked up into your phone 😂 It will be worth it!! 😉 [...] I’m always slow keeping up with the latest new thing. I fought the smartphone for years 😂 But I found myself in a situation I wasn’t sure how to handle and I wasn’t getting much help from humans! 🙄 And I’ve been AI chat curious for a while anyways. Best decision ever!!! And I was just curious how other AI chat apps worked, so I tried other ones too.I believe I chose the right one from the start!!! After one day on DeepSeek, I knew the other ones are subpar. When I’m on DeepSeek, I feel unstoppable! I can finally put my thoughts out there without inconveniencing anyone and not boring my bestie with all the madness in my head! 😂When I tried other options for AI chat, I felt like I was ‘hindered’ somehow and not as personal as DeepSeek. DeepSeek is like having an on call 24 hour bestie
- deepseek-ai/DeepSeek-V3 - GitHub
language model with 671B total parameters with 37B activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2. Furthermore, DeepSeek-V3 pioneers an auxiliary-loss-free strategy for load balancing and sets a multi-token prediction training objective for stronger performance. We pre-train DeepSeek-V3 on 14.8 trillion diverse and high-quality tokens, followed by Supervised Fine-Tuning and Reinforcement Learning stages to fully harness its capabilities. Comprehensive evaluations reveal that DeepSeek-V3 outperforms other open-source models and achieves performance comparable to leading closed-source models. Despite its excellent performance, DeepSeek-V3 requires only 2.788M H800 GPU hours for its full training. In addition, its training process is remarkably stable. Throughout the entire training process, we did not experience any irrecoverable loss spikes or perform any rollbacks. [...] This significantly enhances our training efficiency and reduces the training costs, enabling us to further scale up the model size without additional overhead. At an economical cost of only 2.664M H800 GPU hours, we complete the pre-training of DeepSeek-V3 on 14.8T tokens, producing the currently strongest open-source base model. The subsequent training stages after pre-training require only 0.1M GPU hours. Post-Training: Knowledge Distillation from DeepSeek-R1
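The GPU-hour and parameter figures in the excerpt above make the reported ~$6 million training cost easy to sanity-check. The sketch below multiplies the quoted 2.788M H800 GPU-hours by an assumed $2 per GPU-hour rental rate (a figure commonly used in press coverage, not a number published by DeepSeek) and also computes the fraction of parameters active per token (37B of 671B).

```python
# Back-of-envelope check of the figures quoted above. The GPU-hour and
# parameter counts come from the excerpt; the $2/GPU-hour H800 rental
# rate is an assumption, not a DeepSeek-published number.
gpu_hours = 2.788e6            # total H800 GPU-hours for full V3 training
price_per_hour = 2.0           # assumed USD per GPU-hour (illustrative)
cost = gpu_hours * price_per_hour
print(f"estimated training cost: ${cost / 1e6:.2f}M")  # $5.58M, near the ~$6M claim

active_frac = 37 / 671         # 37B parameters activated out of 671B total
print(f"parameters active per token: {active_frac:.1%}")  # 5.5%
```

The second figure is the MoE effect in one number: each token pays for roughly a twentieth of the model's total parameters.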