South China Morning Post | Business | 28 April 2025
DeepSeek speculation swirls online over Chinese AI start-up's much-anticipated R2 model
The latest speculation about DeepSeek-R2 – the successor to the R1 reasoning model released in January – that surfaced over the weekend included the product's imminent launch and the new benchmarks it purportedly set for cost-efficiency and performance.
That reflects heightened online interest in DeepSeek after it generated worldwide attention from late December 2024 to January by consecutively releasing two advanced open-source AI models, V3 and R1, which were built at a fraction of the cost and computing power that major tech companies typically require for large language model (LLM) projects. LLM refers to the technology underpinning generative AI services such as ChatGPT.
According to posts on Chinese stock-trading social-media platform Jiuyangongshe, R2 was said to have been developed with a so-called hybrid mixture-of-experts (MoE) architecture, with a total of 1.2 trillion parameters, making it 97.3 per cent cheaper to build than OpenAI's GPT-4o.
MoE is a machine-learning approach that divides an AI model into separate sub-networks, or experts – each focused on a subset of the input data – that jointly perform a task. This is said to greatly reduce computation costs during pre-training and deliver faster performance at inference time.
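The short Python sketch below (using PyTorch) illustrates the idea: a small gating network scores each token and routes it to only two of eight expert sub-networks, so most of the layer's parameters sit idle on any given token. The layer sizes and top-k routing rule here are illustrative assumptions, not DeepSeek's actual design.

```python
# Minimal mixture-of-experts sketch with top-k gating (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=64, d_hidden=256, n_experts=8, top_k=2):
        super().__init__()
        # Each "expert" is a small feed-forward sub-network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )
        # The gate scores every token and decides which experts handle it.
        self.gate = nn.Linear(d_model, n_experts)
        self.top_k = top_k

    def forward(self, x):                       # x: (tokens, d_model)
        scores = self.gate(x)                   # (tokens, n_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)    # normalise over chosen experts
        out = torch.zeros_like(x)
        # Only the selected experts run for each token, which is why MoE
        # models activate far fewer parameters than they contain in total.
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(16, 64)
print(MoELayer()(tokens).shape)   # torch.Size([16, 64])
```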
In machine learning, parameters are the internal variables an AI model learns during training; they determine how input prompts are turned into the desired output.
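As a rough illustration of what a parameter count measures, the sketch below tallies the learned weights and biases of a toy two-layer network; the model and its sizes are made up purely for comparison against R2's rumoured 1.2 trillion parameters.

```python
# Count the parameters (weights and biases) of a toy network.
import torch.nn as nn

toy_model = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 64))
n_params = sum(p.numel() for p in toy_model.parameters())
print(f"{n_params:,} parameters")  # 33,088 -- versus a rumoured 1.2 trillion for R2
```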