Cerebras Beats NVIDIA Blackwell in Llama 4 Maverick Inference


Business Wire, a day ago

SUNNYVALE, Calif.--(BUSINESS WIRE)--Last week, Nvidia announced that 8 Blackwell GPUs in a DGX B200 could demonstrate 1,000 tokens per second (TPS) per user on Meta's Llama 4 Maverick. Today, the same independent benchmark firm Artificial Analysis measured Cerebras at more than 2,500 TPS/user, more than doubling the performance of Nvidia's flagship solution.
"Cerebras has beaten the Llama 4 Maverick inference speed record set by NVIDIA last week," said Micah Hill-Smith, Co-Founder and CEO of Artificial Analysis. "Artificial Analysis has benchmarked Cerebras' Llama 4 Maverick endpoint at 2,522 tokens per second, compared to NVIDIA Blackwell's 1,038 tokens per second for the same model. We've tested dozens of vendors, and Cerebras is the only inference solution that outperforms Blackwell for Meta's flagship model."
With today's results, Cerebras has set a world record for LLM inference speed on the 400B parameter Llama 4 Maverick model, the largest and most powerful in the Llama 4 family. Artificial Analysis tested multiple other vendors, and the results were as follows: SambaNova 794 t/s, Amazon 290 t/s, Groq 549 t/s, Google 125 t/s, and Microsoft Azure 54 t/s.
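The ratios behind these figures can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the tokens-per-second numbers quoted in this release, and the dictionary and function names are hypothetical, not part of any Artificial Analysis or Cerebras tooling.

```python
# Tokens per second per user, as quoted in this release (Artificial Analysis figures).
reported_tps = {
    "Cerebras": 2522,
    "NVIDIA Blackwell": 1038,
    "SambaNova": 794,
    "Groq": 549,
    "Amazon": 290,
    "Google": 125,
    "Microsoft Azure": 54,
}


def speedup_over(baseline: str, leader: str = "Cerebras") -> float:
    """Return how many times faster `leader` is than `baseline`."""
    return reported_tps[leader] / reported_tps[baseline]


if __name__ == "__main__":
    # List the other vendors from fastest to slowest with the Cerebras ratio.
    for vendor in sorted(reported_tps, key=reported_tps.get, reverse=True):
        if vendor != "Cerebras":
            print(f"Cerebras vs {vendor}: {speedup_over(vendor):.2f}x")
```

On these numbers the ratio against NVIDIA Blackwell comes out at roughly 2.4x, consistent with the "more than doubling" claim.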
Andrew Feldman, CEO of Cerebras Systems, said, "The most important AI applications being deployed in enterprise today (agents, code generation, and complex reasoning) are bottlenecked by inference latency. These use cases often involve multi-step chains of thought or large-scale retrieval and planning, with generation speeds as low as 100 tokens per second on GPUs, causing wait times of minutes and making production deployment impractical. Cerebras has led the charge in redefining inference performance across models like Llama, DeepSeek, and Qwen, regularly delivering over 2,500 TPS/user."
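To make the latency point concrete, here is a rough sketch of how generation time scales with per-user speed. The 30,000-token workload is an assumed figure chosen for illustration, not a number from the release, and the calculation ignores time to first token, network latency, and queueing.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Estimate pure generation time for a given number of output tokens."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    # Assumption: a multi-step agentic task that emits 30,000 tokens in total.
    assumed_tokens = 30_000
    for label, tps in [("100 TPS", 100), ("2,500 TPS", 2500)]:
        seconds = generation_seconds(assumed_tokens, tps)
        print(f"{label}: {seconds:.0f} s ({seconds / 60:.1f} min)")
```

Under that assumption the same task takes about five minutes at 100 tokens per second and about twelve seconds at 2,500.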
With its world record performance, Cerebras is the optimal solution for Llama 4 in any deployment scenario. Not only is Cerebras Inference the first and only API to break the 2,500 TPS/user milestone on this model, but unlike the Nvidia Blackwell used in the Artificial Analysis benchmark, the Cerebras hardware and API are available now. Nvidia used custom software optimizations that are not available to most users. Interestingly, none of Nvidia's inference providers offer a service at Nvidia's published performance. This suggests that in order to achieve 1,000 TPS/user, Nvidia was forced to reduce throughput by going to batch size 1 or 2, leaving the GPUs at less than 1% utilization. Cerebras, on the other hand, achieved this record-breaking performance without any special kernel optimizations, and it will be available to everyone through Meta's API service coming soon.
For cutting-edge AI applications such as reasoning, voice, and agentic workflows, speed is paramount. These AI applications gain intelligence by processing more tokens during the inference process. This can also make them slow and force customers to wait. And when customers are forced to wait, they leave and go to competitors who provide answers faster—a finding Google showed with search more than a decade ago.
With record-breaking performance, Cerebras hardware and the resulting API service are the best choice for developers and enterprise AI users around the world.
For more information, please visit https://www.cerebras.ai/.
