
DeepSeek speculation swirls online over Chinese AI start-up's much-anticipated R2 model
The latest speculation about DeepSeek-R2, the successor to the R1 reasoning model released in January, surfaced over the weekend and included the product's imminent launch and the purported new benchmarks it set for cost-efficiency and performance.
That reflects heightened online interest in DeepSeek, which drew worldwide attention from late December 2024 to January by releasing two advanced open-source AI models in quick succession: V3 and R1. Both were built at a fraction of the cost and computing power that major tech companies typically require for large language model (LLM) projects. LLMs are the technology underpinning generative AI services such as ChatGPT.
According to posts on Chinese stock-trading social-media platform Jiuyangongshe, R2 was said to have been developed with a so-called hybrid mixture-of-experts (MoE) architecture, with a total of 1.2 trillion parameters, making it 97.3 per cent cheaper to build than OpenAI's GPT-4o.
MoE is a machine-learning approach that divides an AI model into separate sub-networks, or experts – each focused on a subset of the input data – to jointly perform a task. This is said to greatly reduce computation costs during pre-training and achieve faster performance during inference time.
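As a rough illustration of the idea described above, the sketch below routes one input to the two highest-scoring experts out of eight and blends their outputs. It assumes a simple top-k gating scheme, which is one common MoE formulation; nothing here is confirmed as DeepSeek's design, and every name in it (`moe_forward`, `make_expert`, the sizes and weights) is hypothetical.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(rng, dim):
    """Hypothetical expert: a single dim x dim linear map with random weights."""
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]


def apply_expert(weights, x):
    """Multiply the expert's weight matrix by the input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def moe_forward(x, experts, gate, top_k=2):
    """Score every expert, but run only the top_k highest-scoring ones."""
    scores = [sum(g * xi for g, xi in zip(row, x)) for row in gate]
    ranked = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)
    chosen = ranked[:top_k]
    mix = softmax([scores[i] for i in chosen])

    output = [0.0] * len(x)
    for weight, index in zip(mix, chosen):
        expert_out = apply_expert(experts[index], x)
        output = [o + weight * y for o, y in zip(output, expert_out)]
    return output, chosen


# Assumed toy sizes: 8 experts, 4-dimensional inputs.
rng = random.Random(0)
DIM, NUM_EXPERTS = 4, 8
experts = [make_expert(rng, DIM) for _ in range(NUM_EXPERTS)]
gate = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

output, used = moe_forward([0.5, -1.0, 0.25, 2.0], experts, gate, top_k=2)
print(f"ran {len(used)} of {NUM_EXPERTS} experts: {sorted(used)}")
```

Only the chosen experts do any work for a given input, which is where the claimed savings in computation come from: the model can hold many parameters while using a small share of them on each request.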
In machine learning, parameters are the variables in an AI system that are adjusted during training, which help establish how data prompts yield the desired output.
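To give a sense of what a parameter count measures, the sketch below tallies the weights and biases in a small, fully connected network. The layer sizes are made up for illustration and the helper `count_dense_parameters` is hypothetical; it says nothing about how the reported 1.2 trillion figure was calculated.

```python
def count_dense_parameters(layer_sizes):
    """Count weights and biases in a stack of fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per input-output connection
        total += n_out         # one bias per output unit
    return total


# Assumed toy network: 4 inputs, 8 hidden units, 2 outputs.
sizes = [4, 8, 2]
print(count_dense_parameters(sizes))  # (4*8 + 8) + (8*2 + 2) = 58
```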
