Ant Group's use of China-made GPUs, not Nvidia, cuts AI model training costs by 20%

Ant Group, the fintech affiliate of Alibaba Group Holding, is able to train large language models (LLMs) using locally produced graphics processing units (GPUs), reducing its reliance on Nvidia's advanced chips and cutting training costs by 20 per cent, according to a research paper and media reports.
Ant's Ling team, responsible for LLM development, revealed that its Ling-Plus-Base model, a Mixture-of-Experts (MoE) model with 300 billion parameters, can be 'effectively trained on lower-performance devices'. The finding was published in a recent paper on arXiv, an open-access platform for professionals in the scientific community.
Training without high-performance GPUs cuts computing costs in the pre-training process by a fifth, while the model still achieves performance comparable to other models such as Qwen2.5-72B-Instruct and DeepSeek-V2.5-1210-Chat, according to the paper.
The development positions the Hangzhou-based fintech giant alongside domestic peers like DeepSeek and ByteDance in reducing reliance on advanced Nvidia chips, which are subject to strict US export controls.
'These results demonstrate the feasibility of training state-of-the-art large-scale MoE models on less powerful hardware, enabling a more flexible and cost-effective approach to foundational model development with respect to computing resource selection,' the team wrote in the paper.