
Ant Group's use of China-made GPUs, not Nvidia, cuts AI model training costs by 20%

South China Morning Post

25-03-2025



Ant Group, the fintech affiliate of Alibaba Group Holding, is able to train large language models (LLMs) using locally produced graphics processing units (GPUs), reducing reliance on Nvidia's advanced chips and cutting training costs by 20 per cent, according to a research paper and media reports.

Ant's Ling team, responsible for LLM development, revealed that its Ling-Plus-Base model, a Mixture-of-Experts (MoE) model with 300 billion parameters, can be 'effectively trained on lower-performance devices'. The finding was published in a recent paper on arXiv, an open-access platform for professionals in the scientific community.

By avoiding high-performance GPUs, the model reduces computing costs by a fifth in the pre-training process, while still achieving performance comparable to other models such as Qwen2.5-72B-Instruct and DeepSeek-V2.5-1210-Chat, according to the paper. The development positions the Hangzhou-based fintech giant alongside domestic peers like DeepSeek and ByteDance in reducing reliance on advanced Nvidia chips, which are subject to strict US export controls.

'These results demonstrate the feasibility of training state-of-the-art large-scale MoE models on less powerful hardware, enabling a more flexible and cost-effective approach to foundational model development with respect to computing resource selection,' the team wrote in the paper.
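The cost saving described above hinges on the MoE design: each token activates only a few of the model's many expert sub-networks, so most parameters sit idle on any given forward pass. The toy routine below is a minimal sketch of generic top-k MoE routing, not Ant's actual implementation; all sizes, names, and the routing details are illustrative assumptions.

```python
# Toy sketch of top-k Mixture-of-Experts routing (illustrative only,
# not Ant's Ling-Plus-Base implementation). Pure stdlib Python.
import math
import random

random.seed(0)

D = 8          # hidden dimension (toy size; real models use thousands)
N_EXPERTS = 4  # number of expert feed-forward blocks
TOP_K = 2      # experts activated per token

def rand_matrix(rows, cols):
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

# Each "expert" is a single weight matrix in this toy example.
experts = [rand_matrix(D, D) for _ in range(N_EXPERTS)]
router_w = rand_matrix(D, N_EXPERTS)  # router projection: D -> N_EXPERTS scores

def matvec(m, v):
    """Multiply vector v (length rows) through matrix m (rows x cols)."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]

def moe_forward(x):
    """Route one token vector x to its TOP_K highest-scoring experts."""
    logits = matvec(router_w, x)  # one score per expert
    top = sorted(range(N_EXPERTS), key=lambda i: logits[i])[-TOP_K:]
    weights = [math.exp(logits[i]) for i in top]
    total = sum(weights)
    weights = [w / total for w in weights]  # softmax over the chosen experts
    out = [0.0] * D
    for w, i in zip(weights, top):
        y = matvec(experts[i], x)  # only the chosen experts do any compute
        out = [o + w * yj for o, yj in zip(out, y)]
    return out, top

token = [random.gauss(0, 1) for _ in range(D)]
out, chosen = moe_forward(token)
```

Because only `TOP_K` of `N_EXPERTS` expert matrices are multiplied per token, the active compute per forward pass is roughly `TOP_K / N_EXPERTS` of the total expert parameters, which is why very large MoE models can be trained on less powerful hardware than a dense model of the same parameter count would require.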
