Pushing The Limits Of Modern LLMs With A Dinner Plate?


Forbes, 12-05-2025

Cerebras Systems
If you've been reading this blog, you probably know a bit about how AI works – large language models break enormous amounts of data into tiny, manageable pieces, and then use those pieces to respond to user prompts and events.
In general, we know that AI is big, and getting bigger. It's fast, and getting faster.
Specifically, though, not everyone's familiar with some of the newest hardware and software approaches in the industry, and how they promote better results. People are working hard on revealing more of the inherent power in LLM technologies. And things are moving at a rapid clip.
One of these is the Cerebras WSE, or Wafer Scale Engine – a massive processor that can power previously unimaginable AI capabilities.
First of all, you wouldn't call it a microprocessor anymore. It's famously the size of a dinner plate – 8.5 x 8.5 inches. It has hundreds of thousands of cores, and a context capability that's staggering.
But let me start with some basic terminology that you can hear in a presentation by Morgan Rockett, a former MIT student, covering the basics of evaluating LLM output.
LLMs are neural networks. They use a process of tokenization, where a token is one small unit of text or data that gets folded into the overall context for the question at hand.
Then there is context – the extent to which the model can look back at previous tokens and tie them into the greater picture.
There's also inference – the real-time computation the model performs when you give it a question and it produces a response.
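The two ideas can be sketched in a few lines of code. This is a toy illustration, not a real tokenizer: production LLMs use subword schemes such as BPE, and whitespace splitting here just stands in for the concept.

```python
# Toy illustration of tokenization and a context window.
# Whitespace splitting is a stand-in for real subword tokenizers.

def tokenize(text: str) -> list[str]:
    """Split text into toy word-level tokens."""
    return text.split()

def context_window(tokens: list[str], limit: int) -> list[str]:
    """Keep only the most recent `limit` tokens -- anything older
    falls outside the model's context and can no longer influence
    its output."""
    return tokens[-limit:]

tokens = tokenize("the quick brown fox jumps over the lazy dog")
window = context_window(tokens, limit=4)
print(window)  # ['over', 'the', 'lazy', 'dog']
```

The point of the sketch: a model with a 4-token window simply cannot "see" the fox anymore, which is why context size matters so much for summarizing large archives.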
Another term Rockett goes over is rate limits: if you don't own the model, you have to live with request thresholds imposed by that model's operators.
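In practice, callers cope with those thresholds by retrying with exponential backoff. The sketch below is a generic illustration, assuming a hypothetical `RateLimitError` and a stub endpoint; real client libraries define their own error types and often ship backoff helpers.

```python
import random
import time

# Hedged sketch of working within provider rate limits via
# exponential backoff. RateLimitError and flaky_call are
# stand-ins, not a real vendor API.

class RateLimitError(Exception):
    """Raised when the provider says we're sending requests too fast."""

def with_backoff(fn, max_retries=5, base_delay=0.01):
    """Call fn(), retrying on RateLimitError with doubling, jittered sleeps."""
    delay = base_delay
    for _attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            time.sleep(delay * (1 + random.random()))  # jitter avoids thundering herds
            delay *= 2  # exponential backoff
    raise RuntimeError("rate limit: retries exhausted")

# A stub endpoint that rejects the first two calls, like a throttled API.
calls = {"n": 0}
def flaky_call():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError
    return "summary of chunk 7"

print(with_backoff(flaky_call))  # succeeds on the third attempt
```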
As he explains the hardware behind these systems, Rockett reveals that he is a fellow with Cerebras, the company pioneering that massive chip.
Looking at common hardware setups, he goes over four systems – Nvidia GPUs, Google TPUs, Groq LPUs (language processing units), and the Cerebras WSE.
'There's really nothing like it on the market,' he says of the WSE, noting that you can get big context and fast inference if you have the right hardware and the right technique. 'In terms of speed benchmarks, Cerebras is an up-and-coming chip company. They have 2,500 tokens per second, which is extremely fast. It's almost instant response. The entire page of text will get generated, and it's too fast to read.'
He noted that Groq is currently in second place, with around 1,600 tokens per second.
The approach showcased in the presentation boils down to selecting certain chunks of a large file and summarizing the contents of that file.
Noting that really big files are too large for an LLM's context window to manage, Rockett presents three approaches – log2, square root, and double square root – all of which involve taking a sampling of chunks to get a cohesive result without overloading the model, using a 'funnel' design.
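The idea can be sketched as follows: the number of chunks you actually feed the model grows sublinearly with file size, according to one of the three schedules the talk names. The even-spacing strategy and the chunking itself are assumptions made for illustration, not Rockett's exact implementation.

```python
import math

# Sketch of sublinear chunk sampling: pick far fewer chunks than the
# file contains, per one of three schedules (log2, square root,
# double square root), then summarize only the sampled chunks.

def num_samples(n_chunks: int, schedule: str) -> int:
    """How many chunks to sample from n_chunks under a given schedule."""
    if schedule == "log2":
        return max(1, int(math.log2(n_chunks)))
    if schedule == "sqrt":
        return max(1, int(math.sqrt(n_chunks)))
    if schedule == "double_sqrt":  # square root of the square root
        return max(1, int(math.sqrt(math.sqrt(n_chunks))))
    raise ValueError(f"unknown schedule: {schedule}")

def sample_chunks(chunks: list[str], schedule: str) -> list[str]:
    """Take evenly spaced picks across the file (an illustrative choice)."""
    k = num_samples(len(chunks), schedule)
    step = len(chunks) // k
    return chunks[::step][:k]

chunks = [f"chunk-{i}" for i in range(1024)]
print(len(sample_chunks(chunks, "log2")))         # 10
print(len(sample_chunks(chunks, "sqrt")))         # 32
print(len(sample_chunks(chunks, "double_sqrt")))  # 5
```

For a 1,024-chunk file, the three schedules funnel the work down to 10, 32, or 5 chunks respectively – which is how a 4 GB archive becomes tractable for a fixed context window.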
In a demo, he showed a four-to-five-second inference run on a data set of 4 GB – the equivalent, he said, of a 10-foot-tall stack of paper, or, alternately, 4 million tokens.
The data he chose to use was the total archive of available information around the transformational event of the assassination of JFK in the 1960s.
Rockett showed the model using his approach to summarize, working with virtually unlimited RAM, where tokenization was the major time burden.
With slotted input techniques, he said, you could get around rate limits, and tokenization can conceivably be worked out.
Check out the video for a summary on the archive, going over a lot of the CIA's clandestine activities in that era, and tying in the Bay of Pigs event and more.
Going back to practical uses for the Cerebras processor, Rockett mentioned the legal, government and trading worlds, where quick information is paramount.
I wanted more concrete examples, so I asked ChatGPT. It returned numerous interesting use cases for this hardware, including G42, an AI and cloud company in the United Arab Emirates, as well as the Mayo Clinic, various pharmaceutical companies, and the Lawrence Livermore National Laboratory (here's a story I did including Lawrence Livermore's nuclear project).
Then I asked a different question:
'Can you eat dinner off of a Cerebras WSE?'
'Physically?' ChatGPT replied. 'Yes, but you'd be committing both a financial and technological atrocity … the Cerebras Wafer-Scale Engine (WSE) is the largest chip ever built … using it as a plate would be like eating spaghetti off the Rosetta Stone—technically possible, but deeply absurd.'
It gave me three prime reasons not to attempt something so foolhardy (attached verbatim):
'In short: You could eat dinner off of it,' ChatGPT said. 'Once. Then you'd have no chip, no dinner, and no job. Using it as a plate would be like eating spaghetti off the Rosetta Stone, technically possible, but deeply absurd.'
Touché, ChatGPT. Touché.
That's a little about one of the most fascinating pieces of hardware out there, and where it fits into the equation of context + inference. When we supercharge these systems, we see what used to take a long time happening pretty much in real time. That's a real eye-opener.



Related Articles

76% of workers have seen senior colleagues resist new tech

Yahoo, an hour ago

This story was originally published on To receive daily news and insights, subscribe to our free daily newsletter.

If you're looking for cues as to how your employees will adapt to new technology tools, consider the age profile of the organization's workforce. In a survey of 500 U.S. professionals across various industries conducted by Yooz, a provider of purchase-to-pay automation, more than three-quarters (76%) of all workers witnessed senior colleagues push back against new technology.

Furthermore, 55% of millennials (those born 1981 to 1996) said they're 'excited and eager' to try new tools, while only 22% of baby boomers (those born 1946 to 1964) said the same. More than a third (35%) of the baby boomers said they feel cautious or annoyed, or prefer sticking to the old system, when new technology is plugged in. Not a single Gen Z survey participant (those born 1997 to 2012) selected that response.

At the same time, about a quarter of Gen Z employees have refused to use a new workplace tool, more than any of the other generations, which Yooz characterized as Gen Z not being 'afraid to say no.' About AI specifically, 35% of Gen Z workers said they 'love' new tools, versus 13% of boomers. Still, 40% of employees overall said they find AI helpful but unreliable.

'The takeaway: skepticism is still widespread, but younger employees see clear value,' Yooz wrote in its survey report. The report said organizations 'need to manage rollouts carefully: leverage the enthusiasm of younger adopters to build momentum, but also address the concerns of veteran staff who may need more reassurance to get on board.' The key to winning over anyone reluctant to embrace AI is building trust through real-world use cases and support, showcasing quick wins such as an AI tool that saves everyone time on a tedious task, Yooz wrote.

Among employees, the most commonly cited need with regard to new technologies is better training on AI tools. More than half (52%) of those polled said their company takes a 'learn-as-you-go' approach, providing only some basic training or documentation. Relatively few employees said their employer provides 'a lot' of training on new tools. And almost half (48%) of the employees said better training for all would be among the most effective ways to help a company adopt new technology.

The research also delved into the question of who should drive decisions to implement new workplace technology. While younger employees are craving clear direction from the C-suite, there is at the same time 'a call for more bottom-up input in tech decisions,' according to the survey report. More than a third (35%) of Gen Z respondents said new workplace tools should be chosen by leadership with input from younger employees. Additionally, more than a quarter of millennial and Gen Z respondents said adoption would improve if leadership embraced change faster and more visibly. A sizable minority (28%) of Gen Z employees said they feel older employees are actively holding back innovation at their company.

Yooz advised a collaborative approach to the issue, pairing 'tech-savvy younger employees with veteran staff to share knowledge and encourage cross-generational support during rollouts.' Yooz partnered on the research with Pollfish, a third-party survey platform, and the survey was conducted in February 2025.

OpenAI CEO Sam Altman says AI is ready for entry-level jobs—but unbothered Gen Z have made it their new work friend

Yahoo, an hour ago

Billionaire OpenAI CEO Sam Altman reveals that AI can already perform the tasks of junior-level employees – and the ability for it to work days at a time is just around the corner. With fellow tech leaders like Nvidia's Jensen Huang saying those who fail to embrace the technology will be replaced, some Gen Zers are catching on.

If you're in desperate need of an intern, there's good news: there may soon be an abundance of them. But they might not be able to fetch you a coffee. OpenAI CEO Sam Altman admitted this week that AI agents – AI-powered systems that can complete job-related tasks with other software tools – can now effectively do the same work as entry-level employees.

'Today (AI) is like an intern that can work for a couple of hours but at some point it'll be like an experienced software engineer that can work for a couple of days,' Altman said on a panel with Snowflake CEO Sridhar Ramaswamy.

In the coming months, AI agents will only get exponentially better, Altman said – to the point where their skills are just as good as an experienced software engineer's. They're anticipated to operate continuously for days on end, without pause. 'I would bet next year that in some limited cases, at least in some small ways, we start to see agents that can help us discover new knowledge, or can figure out solutions to business problems that are very non-trivial,' the 40-year-old AI CEO added. Fortune reached out to Altman for comment.

While this may seem like a grim reality for some workers, the future of human employees' success may depend on following the advice of tech CEOs like Nvidia's Jensen Huang. He predicted those who fail to embrace AI might be the next to get the pink slip. 'You're not going to lose your job to an AI, but you're going to lose your job to someone who uses AI,' he said at the Milken Institute's Global Conference last month. Generative AI may be eclipsing the skills of entry-level workers – like conducting research or developing PowerPoints.

Some Gen Z have already seen the writing on the wall, and begun embracing the technology more than other age groups. About 51% of Gen Z now view generative AI as a co-worker or a friend, according to a recent survey from That's compared to just over 40% of millennials and 35% of Gen X or baby boomers who feel the same way.

Altman has gone even further, saying that many young people (including millennials) are turning to AI for far more than just internet sleuthing: '(It's a) gross oversimplification, but like older people use ChatGPT as a Google replacement. Maybe people in their 20s and 30s use it as like a life advisor, and then, like people in college use it as an operating system,' Altman said at Sequoia Capital's AI Ascent event earlier this month. 'And there's this other thing where (young people) don't really make life decisions without asking ChatGPT what they should do,' he added.

Not all tech leaders have been as upbeat about the future; some have instead used their public appearances to highlight fears of an AI-driven job market reckoning. According to Anthropic CEO Dario Amodei, AI could eliminate half of all entry-level white-collar jobs within five years. Unemployment could skyrocket to 10% to 20%, he told Axios. To put that into context, it's currently at around 4%. Researchers at his company added that the next decade will be 'pretty terrible' for humans as desk jobs are automated, they told tech podcaster Dwarkesh Patel in an interview. This comes as the latest model of Claude – Anthropic's generative AI – can now reportedly code autonomously for nearly seven hours.

This story was originally featured on

Google CEO Sundar Pichai Is 'Vibe Coding' a Website for Fun

Entrepreneur, an hour ago

Vibe coding is the process of prompting AI to write code, instead of writing it manually. Google and Alphabet CEO Sundar Pichai disclosed that he has been "vibe coding" – using AI to code for him through prompts – to build a webpage. Pichai said on Wednesday at Bloomberg Tech in San Francisco that he had been experimenting with the AI coding assistants Cursor and Replit, both of which are advertised as able to create code from text prompts, to build a new webpage.

Related: Here's How Much a Typical Google Employee Makes in a Year

"I've just been messing around — either with Cursor or I vibe coded with Replit — trying to build a custom webpage with all the sources of information I wanted in one place," Pichai said, per Business Insider. Pichai said that he had "partially" completed the webpage, and that coding had "come a long way" from its early days.

Vibe coding is a term coined by OpenAI co-founder Andrej Karpathy. In a post on X in February, Karpathy described how AI tools are getting good enough that software developers can "forget that the code even exists." Instead, they can ask AI to code on their behalf and create a project or web app without writing a line of code themselves.

"There's a new kind of coding I call 'vibe coding', where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper…" — Andrej Karpathy (@karpathy) February 2, 2025

The rise of vibe coding has led AI coding assistants to explode in popularity. One AI coding tool, Cursor, became the fastest-growing software app to reach $100 million in annual revenue in January. Almost all of Cursor's revenue comes from 360,000 individual subscribers, not big enterprises. However, that balance could change: as of earlier this week, Amazon is reportedly in talks to adopt Cursor for its employees. Another coding tool, Replit, says it has enabled users to make more than two million apps in six months. The company has 34 million global users as of November.

Related: This AI Startup Spent $0 on Marketing. Its Revenue Just Hit $200 Million.

Noncoders are using vibe coding to bring their ideas to life. Lenard Flören, a 28-year-old art director with no prior coding experience, told NBC News last month that he used AI tools to vibe code a personalized workout-tracking app. Harvard University neuroscience student Rishab Jain, 20, told the outlet that he used Replit to vibe code an app that translates ancient texts into English. Instead of downloading someone else's app and paying a subscription fee, "now you can just make it," Jain said.

Popular vibe coding tools offer a free entry point, as well as subscription plans. Replit has a free tier; a $20-a-month core level with expanded capabilities, such as unlimited private and public apps; and a $35 per user, per month teams subscription. Cursor also has a free tier, a $20-per-month pro level, and a $40 per user, per month business subscription.

Despite the rise of vibe coding, Pichai still thinks that human software engineers are necessary. At Bloomberg Tech on Wednesday, he said that Google will keep hiring human engineers and growing its engineering workforce "even into next year" because a bigger workforce "allows us to do more." "I just view this [AI] as making engineers dramatically more productive," he said. Alphabet is the fifth most valuable company in the world, with a market cap of $2 trillion.
