Who Needs Big AI Models?

08-07-2025

Cerebras Systems CEO and Founder Andrew Feldman
The AI world continues to evolve rapidly, especially since the introduction of DeepSeek and its followers. Many have concluded that enterprises don't really need the large, expensive AI models touted by OpenAI, Meta, and Google, and are focusing instead on smaller models, such as DeepSeek V2-Lite with 2.4B parameters, or Llama 4 Scout and Maverick with 17B parameters, which can provide decent accuracy at a lower cost. It turns out that this is not the case for coders, or more accurately, the models that can and will replace many coders. Nor does the smaller-is-better mantra apply to reasoning or agentic AI, the next big thing.
AI code generators require large models that can handle a wider context window, capable of accommodating approximately 100,000 lines of code. Mixture of expert (MOE) models supporting agentic and reasoning AI is also large. But these massive models are typically quite expensive, costing around $10 to $15 per million output tokens on modern GPUs. Therein lies an opportunity for novel AI architectures to encroach on GPUs' territory.
Cerebras Systems Launches Big AI with Qwen3-235B
Cerebras Systems (a client of Cambrian-AI Research) has announced support for the large Qwen3-235B, supporting 131K context length (about 200–300 pages of text), four times what was previously available. At the RAISE Summit in Paris, Cerebras touted Alibaba's Qwen3-235B, which uses a highly efficient mixture-of-experts architecture to deliver exceptional compute efficiency. But the real news is that Cerebras can run the model at only $0.60 per million input tokens and per million output tokens—less than one-tenth the cost of comparable closed-source models. While many consider the Cerebras wafer-scale engine expensive, this data turns that perception on its head.
Agents are a use case that frequently requires very large models.
One question I frequently get is, if Cerebras is so fast, why don't they have more customers? One reason is that they have not supported large context windows and larger models. Those seeking to develop code, for example, do not want to break down the problem into smaller fragments to fit, say, a 32KB context. Now, that barrier to sales has evaporated.
'We're seeing huge demand from developers for frontier models with long context, especially for code generation,' said Cerebras Systems CEO and Founder Andrew Feldman. "Qwen3-235B on Cerebras is our first model that stands toe-to-toe with frontier models like Claude 4 and DeepSeek R1. And with full 131K context, developers can now use Cerebras on production-grade coding applications and get answers back in less than a second instead of waiting for minutes on GPUs.'
Cerebras is not just 30 times faster, it is 92% cheaper than GPUs.
Cerebras has quadrupled its context length support from 32K to 131K tokens—the maximum supported by Qwen3-235B. This expansion directly impacts the model's ability to reason over large codebases and complex documentation. While 32K context is sufficient for simple code generation use cases, 131K context enables the model to process dozens of files and tens of thousands of lines of code simultaneously, allowing for production-grade application development.
Cerebras is 15-100 times more affordable than GPUs when running Qwen3-235B
Qwen3-235B excels at tasks requiring deep logical reasoning, advanced mathematics, and code generation, thanks to its ability to switch between "thinking mode" (for high-complexity tasks) and "non-thinking mode" (for efficient, general-purpose dialogue). The 131K context length allows the model to ingest and reason over large codebases (tens of thousands of lines), supporting tasks such as code refactoring, documentation, and bug detection.
Cerebras also announced the further expansion of its ecosystem, with support from Amazon AWS, as well as DataRobot, Docker, Cline, and Notion. The addition of AWS is huge;
Cerebras has added AWS to its cloud portfolio.
Where is this heading?
Big AI has constantly been downsized and optimized, with orders of magnitude of performance gains, model sizes, and price reductions. This trend will undoubtedly continue, but will be constantly offset by increases in capabilities, accuracy, intelligence, and entirely new features across modalities. So, if you want last year's AI, you're in great shape, as it continues to get cheaper.
But if you want the latest features and functions, you will require the largest models and the longest input context length.
It's the Yin and Yang of AI.

Hashtags

Try Our AI Features

Explore what Daily8 AI can do for you:

Comments

No comments yet...

The Verge

6 minutes ago

The Verge

OpenAI will update GPT-5's 'personality' after user backlash.

Posted Aug 13, 2025 at 1:38 AM UTC OpenAI will update GPT-5's 'personality' after user backlash. CEO Sam Altman wrote on X that OpenAI plans to update the model to 'feel warmer than the current personality but not as annoying (to most users) as GPT-4o.' He added, 'One learning for us from the past few days is we really just need to get to a world with more per-user customization of model personality.' Follow topics and authors from this story to see more like this in your personalized homepage feed and to receive email updates. Hayden Field Posts from this author will be added to your daily email digest and your homepage feed. See All by Hayden Field Posts from this topic will be added to your daily email digest and your homepage feed. See All AI Posts from this topic will be added to your daily email digest and your homepage feed. See All News Posts from this topic will be added to your daily email digest and your homepage feed. See All OpenAI

Sam Altman, OpenAI will reportedly back a startup that takes on Musk's Neuralink

Yahoo

24 minutes ago

Yahoo

Sam Altman, OpenAI will reportedly back a startup that takes on Musk's Neuralink

Sam Altman is in the process of co-founding a new brain-to-computer interface startup called Merge Labs and raising funds for it with the capital possibly coming largely from OpenAI's ventures team, unnamed sources told the Financial Times. The startup is expected to be valued at $850 million. A source familiar with the deal tells TechCrunch that talks are still early and OpenAI has not yet committed to participation, so terms could change. Merge Labs is also reportedly working with Alex Blania, who runs Tools for Humanity (formerly World) — Altman's eye-scanning digital ID project that 'allows anyone to verify their humanness,' as the company describes. Merge Labs will compete with Elon Musk's Neuralink, which is developing computer interface chips designed to be implanted in the brain. Musk founded Neuralink in 2016 (although its existence wasn't known until 2017) and the company has made serious progress. Neuralink is currently in trials with people who suffer from severe paralysis. It aims to allow them to control devices with their thoughts. It raised a $600 million Series E at a $9 billion valuation in June. Neuralink (and perhaps, Merge Labs) could revolutionize how humans interact with technology. Some might even say their tech could take humanity toward 'the singularity.' Long before Silicon Valley became obsessed with the concept of artificial general intelligence (AGI), it was enamored with 'the singularity.' Musk has used the term to describe a time when AI surpasses human intelligence. The more classic definition (after a 1960's novella of the same name by Dino Buzzati) means the merging of tech with humans. Altman blogged about 'The Merge' in 2017. 'Although the merge has already begun, it's going to get a lot weirder. We will be the first species ever to design our own descendants,' he postulated at the time, citing research work he saw at OpenAI, where Musk was still a co-founder. Musk left OpenAI in 2018 and the relationship between the two tech leaders has since disintegrated. Just this week, Altman and Musk were bickering on X after Altman accused Musk of manipulating X and Musk called Altman a liar. We'll have to wait and see when and if Merge Labs becomes formally announced. But it stands to reason that Altman wasn't going to let Musk work on something as important as the singularity without a challenger. OpenAI declined comment.

Elon Musk accuses App Store of favoring OpenAI

Yahoo

39 minutes ago

Yahoo

Elon Musk accuses App Store of favoring OpenAI

Elon Musk has taken his feud against OpenAI to the App Store, accusing Apple of favoring ChatGPT in the digital shop and vowing legal action. "Apple is behaving in a manner that makes it impossible for any AI company besides OpenAI to reach #1 in the App Store, which is an unequivocal antitrust violation," Musk said in a post on his social network X on Monday, without providing evidence to back his claim. "xAI will take immediate legal action," he added, referencing his own artificial intelligence company. X users responded by pointing out that DeepSeek AI out of China hit the top spot in the App Store early this year, and Perplexity AI recently ranked number one in the App Store in India. DeepSeek and Perplexity compete with OpenAI and Musk's startup xAI. Both OpenAI and xAI released new versions of their AI assistants, ChatGPT and Grok, in the past week. App Store rankings on Tuesday listed ChatGPT as the top free iPhone app with Grok in fifth place. Apple did not immediately respond to a request for comment. Factors going into App Store rankings include user engagement, reviews, and the number of downloads. OpenAI and Apple in June of last year announced an alliance to enhance iPhones and other devices with ChatGPT features. ChatGPT-5 rolled out free to the nearly 700 million people who use it weekly, OpenAI said in a briefing with journalists last week. Tech industry rivals Amazon, Google, Meta, Microsoft and xAI have been pouring billions of dollars into artificial intelligence since the blockbuster launch of the first version of ChatGPT in late 2022. Chinese startup DeepSeek shook up the AI sector early this year with a model that delivers high performance using less costly chips. OpenAI in April of this year filed counterclaims against multi-billionaire Musk, accusing its former co-founder of waging a "relentless campaign" to damage the organization after it achieved success without him. In legal documents filed at the time in northern California federal court, OpenAI alleged Musk became hostile toward the company after abandoning it years before its breakthrough achievements with ChatGPT. The lawsuit was another round in a bitter feud between the generative AI (genAI) start-up and the world's richest man, who sued OpenAI last year, accusing the company of betraying its founding mission. In its countersuit, the company alleged Musk "made it his project to take down OpenAI, and to build a direct competitor that would seize the technological lead -- not for humanity but for Elon Musk." Musk founded his own genAI startup, xAI, in 2023 to compete with OpenAI and the other major AI players. gc/bjt