Huawei's new AI CloudMatrix cluster beats Nvidia's GB200 by brute force, uses 4X the power

Yahoo · 20-04-2025

Unable to use leading-edge process technologies to produce its high-end AI processors, Huawei has to rely on brute force: installing more processors than its industry competitors to achieve comparable AI performance. To do this, Huawei took a multifaceted strategy that includes the dual-chiplet HiSilicon Ascend 910C processor, optical interconnections, and the Huawei AI CloudMatrix 384 rack-scale solution, which relies on proprietary software, reports SemiAnalysis. The whole system delivers 2.3X lower performance per watt than Nvidia's GB200 NVL72, but it still enables Chinese companies to train advanced AI models.
Huawei's CloudMatrix 384 is a rack-scale AI system composed of 384 Ascend 910C processors arranged in a fully optical, all-to-all mesh network. The system spans 16 racks, including 12 compute racks housing 32 accelerators each and four networking racks facilitating high-bandwidth interconnects using 6,912 800G LPO optical transceivers.
Unlike traditional systems that use copper wires for interconnections, CloudMatrix relies entirely on optics for both intra- and inter-rack connectivity, enabling extremely high aggregate communication bandwidth. The CloudMatrix 384 is an enterprise-grade machine that features fault-tolerant capabilities and is designed for scalability.
In terms of performance, the CloudMatrix 384 delivers approximately 300 PFLOPs of dense BF16 compute, nearly twice the throughput of Nvidia's GB200 NVL72 system (which delivers about 180 BF16 PFLOPs). It also offers 2.1 times more total memory bandwidth despite using HBM2E, and over 3.6 times greater HBM capacity. The machine also features 2.1 times higher scale-up bandwidth and 5.3 times higher scale-out bandwidth thanks to its optical interconnections.
However, these performance advantages come with a tradeoff: the system is 2.3 times less power-efficient per FLOP, 1.8 times less efficient per TB/s of memory bandwidth, and 1.1 times less efficient per TB of HBM memory than Nvidia's machine.
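As a rough sanity check, those per-unit ratios are consistent with the headline specs and the total system power figures cited later in the article (roughly 559 kW for the CM384 versus 145 kW for the GB200 NVL72). The snippet below is a minimal sketch under those assumptions, not vendor-published math:

```python
# Rough check of the per-unit efficiency ratios quoted by SemiAnalysis.
# Assumes the total system power figures cited later in this article.
cm384_power_kw, nvl72_power_kw = 559.0, 145.0
power_ratio = cm384_power_kw / nvl72_power_kw      # ~3.9x more power overall

bandwidth_ratio = 2.1      # CM384 offers ~2.1x the total memory bandwidth
hbm_capacity_ratio = 3.6   # and ~3.6x the HBM capacity

print(round(power_ratio / bandwidth_ratio, 1))     # ~1.8x less efficient per TB/s
print(round(power_ratio / hbm_capacity_ratio, 1))  # ~1.1x less efficient per TB of HBM
```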
But this does not really matter, as Chinese companies (including Huawei) cannot access Nvidia's GB200 NVL72 anyway. So if they want to get truly high performance for AI training, they will be more than willing to invest in Huawei's CloudMatrix 384.
At the end of the day, the average electricity price in mainland China has declined from $90.70 per MWh in 2022 to $56 per MWh in some regions in 2025, so users of Huawei's CM384 aren't likely to go bankrupt because of power costs. For China, where energy is abundant but advanced silicon is constrained, Huawei's approach to AI seems to work just fine.
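To put power costs in perspective, a back-of-the-envelope estimate, assuming the roughly 559 kW total system power cited later in the article and continuous full-load operation, puts the annual electricity bill for a single CM384 in the hundreds of thousands of dollars, not millions:

```python
# Hypothetical annual energy cost for one CM384, assuming ~559 kW drawn 24/7.
power_kw = 559
hours_per_year = 24 * 365
energy_mwh = power_kw * hours_per_year / 1000      # ~4,897 MWh per year

for price_per_mwh in (90.70, 56.0):                # 2022 vs. 2025 prices from the article
    print(f"${price_per_mwh:.2f}/MWh -> ~${energy_mwh * price_per_mwh:,.0f} per year")
```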
When we first encountered Huawei's HiSilicon Ascend 910C processor several months ago, it was via a die shot of its compute chiplet, presumably produced by SMIC, which featured an interface that appeared to connect it to a separate I/O die. This is why we thought it was a processor with one compute chiplet. We were wrong.
Apparently, the HiSilicon Ascend 910C is a dual-chiplet processor with eight HBM2E memory modules and no I/O die, a design that resembles AMD's Instinct MI250X and Nvidia's B200. The unit delivers 780 BF16 TFLOPS, compared to the MI250X's 383 BF16 TFLOPS and the B200's roughly 2,250 - 2,500 BF16 TFLOPS.
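These per-chip figures also line up with the system-level number quoted earlier: 384 Ascend 910C processors at 780 BF16 TFLOPS each works out to roughly 300 PFLOPS. A minimal sketch of that arithmetic, taking the B200 value at the low end of the range cited above:

```python
# Per-chip vs. system-level BF16 throughput, using the figures in the article.
ascend_910c_tflops = 780        # dense BF16 per Ascend 910C
b200_tflops = 2_250             # low end of the cited ~2,250-2,500 TFLOPS for B200

system_pflops = 384 * ascend_910c_tflops / 1000
print(system_pflops)                               # ~299.5, i.e. ~300 PFLOPS for the CM384
print(round(b200_tflops / ascend_910c_tflops, 1))  # B200 is roughly 2.9x faster per chip
```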
The HiSilicon Ascend 910C was designed in China for large-scale training and inference workloads. The processor was designed using advanced EDA tools from well-known companies and can be produced using 7nm-class process technologies. SemiAnalysis reports that while SMIC can produce compute chiplets for the Ascend 910C, the vast majority of Ascend 910C chiplets used by Huawei were made by TSMC using workarounds involving third-party entities like Sophgo, allowing Huawei to obtain wafers despite U.S. restrictions. It is estimated that Huawei acquired enough wafers for over a million Ascend 910C processors from 2023 to 2025. Nonetheless, as SMIC's capabilities improve, Huawei can outsource more production to the domestic foundry.
The Ascend 910C uses HBM2E memory, most of which is sourced from Samsung through another proxy, CoAsia Electronics. CoAsia shipped HBM2E components to Faraday Technology, a design services firm, which then worked with SPIL to assemble the HBM2E stacks alongside low-performance 16nm logic dies. These assemblies technically complied with U.S. export controls because they did not exceed any thresholds outlined by the regulations. The system-in-package (SiP) units were shipped to China, where their HBM2E stacks were desoldered and sent to Huawei, which then reinstalled them on its Ascend 910C SiPs.
In performance terms, the Ascend 910C is considerably less powerful on a per-chip basis than Nvidia's latest B200 AI GPUs, but Huawei's system design strategy compensates for this by scaling up the number of chips per system.
Indeed, as the name suggests, the CloudMatrix 384 is a high-density computing cluster composed of 384 Ascend 910C AI processors, physically organized into a 16-rack system with 32 AI accelerators per rack. Within this layout, 12 racks house compute modules, while four additional racks are allocated for communication switching. Just like with Nvidia's architecture, all Ascend 910Cs can communicate with each other as they are interconnected using a custom mesh network.
However, a defining feature of the CM384 is its exclusive reliance on optical links for all internal communication within and between racks. It incorporates 6,912 linear pluggable optical (LPO) transceivers, each rated at 800 Gbps, resulting in a total internal bandwidth exceeding 5.5 Pbps (687.5 TB/s) at low latency and with minimal signal integrity losses. The system supports both scale-up and scale-out topologies: scale-up via the full-mesh within the 384 processors, and scale-out through additional inter-cluster connections, which enables deployment in larger hyperscale environments while retaining tight compute integration.
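The aggregate bandwidth figure follows directly from the transceiver count; the short sketch below reproduces it (the article's 687.5 TB/s corresponds to rounding the total to 5.5 Pbps before converting to bytes):

```python
# Aggregate internal optical bandwidth of the CM384, from the article's figures.
transceivers = 6_912
gbps_each = 800

total_gbps = transceivers * gbps_each
print(total_gbps / 1e6)   # ~5.53 Pbps of aggregate bandwidth
print(total_gbps / 8e3)   # ~691 TB/s (687.5 TB/s if you round to 5.5 Pbps first)
```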
With 384 processors, Huawei's CloudMatrix 384 delivers 300 PFLOPs of dense BF16 compute performance, roughly 1.7 times the throughput of Nvidia's GB200 NVL72. However, total system power (including networking and storage) for the CM384 is around 559 kW, whereas Nvidia's GB200 NVL72 consumes 145 kW.
As a result, Nvidia's solution delivers 2.3 times higher power efficiency than Huawei's solution. Still, as noted above, if Huawei can deliver its CloudMatrix 384 in volumes, with proper software and support, the last thing its customers will care about is the power consumption of their systems.
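That 2.3x gap can be reproduced from the article's own throughput and power figures; the following is a minimal sketch that simply divides compute by total system power for each machine:

```python
# Performance per watt, using the article's system-level figures.
cm384 = {"pflops": 300, "power_kw": 559}
nvl72 = {"pflops": 180, "power_kw": 145}

cm384_eff = cm384["pflops"] / cm384["power_kw"]   # ~0.54 PFLOPS per kW
nvl72_eff = nvl72["pflops"] / nvl72["power_kw"]   # ~1.24 PFLOPS per kW
print(round(nvl72_eff / cm384_eff, 1))            # ~2.3x in Nvidia's favor
```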

Related Articles

Veteran fund manager reboots Palantir stock price target
Yahoo · 19 minutes ago

Veteran fund manager reboots Palantir stock price target originally appeared on TheStreet.

There's been a lot of debate surrounding artificial intelligence stocks this year. A boom in AI spending, particularly by hyperscalers ramping up infrastructure to meet surging research and development of chatbots and agentic AI, led to eye-popping returns for companies like Palantir Technologies, which markets data analytics platforms. However, concern that spending could decelerate has picked up in 2025 because of worry over a tariffs-driven recession, causing many AI stocks, including chip-maker Nvidia, to pull back. While the eventual impact of tariffs on a possible recession remains a question mark, there's been little to suggest demand for Palantir's services is slipping.

Solid first-quarter earnings results and optimism that trade deals could make tariffs manageable have helped Palantir shares rally 63% this year after a 340% surge in 2024. Palantir's resiliency isn't lost on long-time money manager Chris Versace. Versace, who first picked up shares last year, recently updated his price target as Palantir's stock challenges all-time highs.

Investors' interest in Palantir stock swelled after OpenAI's ChatGPT became the fastest app to reach one million users when it was launched in December 2022. ChatGPT's success has spawned the development of rival large language models, including Google's Gemini, and a wave of interest in agentic AI programs that can augment, and in some cases replace, traditional workflows.

AI activity is widespread across most industries. Banks are using AI to hedge risks, evaluate loans, and price products. Drugmakers are researching AI's ability to predict drug targets and improve clinical trial outcomes. Manufacturers are using it to boost production and quality. Retailers are using it to forecast demand, manage inventories, and curb theft. The U.S. military is even seeing if AI can be effective on the battlefield.

The seemingly boundless use cases, and the ability to profit from them, have many companies and governments turning to Palantir's deep expertise in managing and protecting data to train and run new AI apps. Palantir got its start helping the U.S. government build counterterrorism systems. Its Gotham platform still assists governments in those efforts today. It also markets its Foundry platform, which manages, interprets, and reports data, to large companies across enterprise and cloud networks. And its AI platform (AIP) is sold as a tool for developing AI chatbots and apps.

Demand for that platform has been big. In the fourth quarter, Palantir closed a "record-setting number of deals," according to CEO Alex Karp. The momentum continued into the first quarter. Revenue rose 39% year-over-year to $884 million. Meanwhile, Palantir's profit has continued to improve as sales have grown. In Q1, its net income was $214 million, translating into adjusted earnings per share of 13 cents.

"Our revenue soared 55% year-over-year, while our U.S. commercial revenue expanded 71% year-over-year in the first quarter to surpass a one-billion-dollar annual run rate," said Karp in Palantir's first-quarter earnings release. "We are delivering the operating system for the modern enterprise in the era of AI."

AI's rapid rise has opened Palantir's products to an increasingly wide range of industries, allowing it to diversify its customer base. For example, Bolt Financial, an online checkout platform, recently partnered with Palantir to use AI tools to better analyze customer behavior.
More Palantir:
  • Palantir gets great news from the Pentagon
  • Wall Street veteran doubles down on Palantir
  • Palantir bull sends message after CEO joins Trump for Saudi visit

The potential to ink more deals like this has caught portfolio manager Chris Versace's attention. "The result [of the Bolt deal] will be technology that can offer shoppers a customized checkout experience, embedded within retailers' sites and apps, and it is one that will extend to agentic checkout as well," wrote Versace on TheStreet Pro. "We see this as the latest expansion by Palantir into the commercial space, and we are likely to see more of this as AI flows through payment processing and digital shopping applications."

Alongside Palantir's deeply embedded government contracts, growing relationships with enterprises should provide Palantir with cross-selling opportunities, further driving sales and profit growth and allowing for increased financial guidance. Palantir is guiding for full-year sales growth of 36% and U.S. commercial revenue growth of 68%.

The chances for Palantir's growth to continue accelerating have Versace increasingly optimistic about its shares. As a result, he's increased his price target for Palantir to $140 per share.

Veteran fund manager reboots Palantir stock price target first appeared on TheStreet on Jun 8, 2025.

US-China trade talks to open in London as new disputes emerge
New York Post · an hour ago

US-China trade talks in London this week are expected to take up a series of fresh disputes that have buffeted relations, threatening a fragile truce over tariffs. Both sides agreed in Geneva last month to a 90-day suspension of most of the 100%-plus tariffs they had imposed on each other in an escalating trade war that had sparked fears of recession.

Since then, the US and China have exchanged angry words over advanced semiconductors that power artificial intelligence, "rare earths" that are vital to carmakers and other industries, and visas for Chinese students at American universities.

President Trump spoke at length with Chinese leader Xi Jinping by phone last Thursday in an attempt to put relations back on track. Trump announced on social media the next day that trade talks would be held on Monday in London.

The latest frictions began just a day after the May 12 announcement of the Geneva agreement to "pause" tariffs for 90 days. The US Commerce Department issued guidance saying the use of Ascend AI chips from Huawei, a leading Chinese tech company, could violate US export controls. That's because the chips were likely developed with American technology despite restrictions on its export to China, the guidance said.

The Chinese government wasn't pleased. One of its biggest beefs in recent years has been over US moves to limit the access of Chinese companies to technology, and in particular to equipment and processes needed to produce the most advanced semiconductors. "The Chinese side urges the US side to immediately correct its erroneous practices," a Commerce Ministry spokesperson said.

US Commerce Secretary Howard Lutnick wasn't in Geneva but will join the talks in London. Analysts say that suggests at least a willingness on the US side to hear out China's concerns on export controls.

One area where China holds the upper hand is in the mining and processing of rare earths. They are crucial not only for autos but also for a range of other products from robots to military equipment. The Chinese government started requiring producers to obtain a license to export seven rare earth elements in April. Resulting shortages sent automakers worldwide into a tizzy. As stockpiles ran down, some worried they would have to halt production.

Trump, without mentioning rare earths specifically, took to social media to attack China. "The bad news is that China, perhaps not surprisingly to some, HAS TOTALLY VIOLATED ITS AGREEMENT WITH US," Trump posted on May 30.

The Chinese government indicated Saturday that it is addressing the concerns, which have come from European companies as well. A Commerce Ministry statement said it had granted some approvals and "will continue to strengthen the approval of applications that comply with regulations." The scramble to resolve the rare earth issue shows that China has a strong card to play if it wants to strike back against tariffs or other measures.

Student visas don't normally figure in trade talks, but a US announcement that it would begin revoking the visas of some Chinese students has emerged as another thorn in the relationship.
China's Commerce Ministry raised the issue when asked last week about the accusation that it had violated the consensus reached in Geneva. It replied that the US had undermined the agreement by issuing export control guidelines for AI chips, stopping the sale of chip design software to China and saying it would revoke Chinese student visas.

Starmer Calls on Nvidia's Huang to Train Up Britons on AI
Bloomberg · 2 hours ago

Keir Starmer will make an appearance alongside Nvidia Corp co-founder Jensen Huang on Monday, as the British prime minister puts technology and artificial intelligence at the heart of his government's plan to boost economic growth. The Labour leader will hold an in-conversation event in London with the tech billionaire to mark an agreement under which Nvidia will help the UK train more people in AI and expand research at universities and at the company's own AI lab in Bristol, in the west of England.
