
GPT-5: Has AI just plateaued?
OpenAI CEO Sam Altman says GPT-5 offers PhD-level expertise in any topic, but that's not clearly the case. Photo: Aflo Co Ltd / Alamy
OpenAI claims that its new flagship model, GPT-5, marks 'a significant step along the path to AGI' – that is, the artificial general intelligence that AI bosses and self-proclaimed experts often claim is around the corner.
According to OpenAI's own definition, AGI would be 'a highly autonomous system that outperforms humans at most economically valuable work.' Setting aside whether this is something humanity should be striving for, OpenAI CEO Sam Altman's arguments for GPT-5 being a 'significant step' in this direction sound remarkably unspectacular.
He claims GPT-5 is better at writing computer code than its predecessors. It is said to 'hallucinate' a bit less, and is a bit better at following instructions – especially when they require following multiple steps and using other software. The model is also apparently safer and less 'sycophantic', because it will not deceive the user or provide potentially harmful information just to please them.
Altman does say that 'GPT-5 is the first time that it really feels like talking to an expert in any topic, like a PhD-level expert.' Yet it still doesn't have a clue about whether anything it says is accurate, as you can see from its attempt below to draw a map of North America.
Sam Altman: With GPT-5, you'll have a PhD-level expert in any area you need
Me: Draw a map of North America, highlighting countries, states, and capitals
GPT 5: [image of the generated map attached]
*Sam Altman forgot to mention that the PhD-level expert used ChatGPT to cheat on all their geography classes… pic.twitter.com/9L9VodXll1
— Luiza Jarovsky, PhD (@LuizaJarovsky) August 10, 2025
It also cannot learn from its own experience, or achieve more than 42% accuracy on a challenging benchmark like 'Humanity's Last Exam', which contains hard questions on all kinds of scientific (and other) subject matter. This is slightly below the 44% that Grok 4, the model recently released by Elon Musk's xAI, is said to have achieved.
The main technical innovation behind GPT-5 seems to be the introduction of a 'router'. When a question comes in, the router decides which GPT model to delegate it to, essentially asking itself how much effort to invest in computing the answer, and it improves over time by learning from feedback about its previous choices.
The options for delegation include the previous leading models of GPT and also a new 'deeper reasoning' model called GPT-5 Thinking. It's not clear what this new model actually is. OpenAI isn't saying it is underpinned by any new algorithms or trained on any new data (since all available data was pretty much being used already).
One might therefore speculate that this model is really just another way of controlling existing models with repeated queries, pushing them to work harder until they produce better results.
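To make the idea concrete, here is a minimal sketch in Python of what such a routing-and-escalation loop could look like. The model names, the difficulty heuristic and the escalation rule are all invented for illustration; OpenAI has not published how GPT-5's router actually works.

```python
# A minimal, purely illustrative sketch of the routing idea described above.
# The model names, the difficulty heuristic and the escalation rule are all
# hypothetical; they are not OpenAI's implementation.

def estimate_difficulty(prompt: str) -> float:
    """Crude stand-in for a learned estimator of how hard a prompt is."""
    hard_markers = ("prove", "derive", "step by step", "debug", "optimise")
    score = 0.3 + 0.15 * sum(marker in prompt.lower() for marker in hard_markers)
    return min(score, 1.0)

def call_model(name: str, prompt: str) -> str:
    """Placeholder for an API call to some underlying model."""
    return f"[{name}] answer to: {prompt}"

def route(prompt: str, effort_budget: float = 0.5) -> str:
    """Send easy prompts to a cheap model and hard ones to a slower
    'thinking' model, escalating once if the cheap answer looks too thin."""
    if estimate_difficulty(prompt) <= effort_budget:
        answer = call_model("fast-model", prompt)
        if len(answer) < 40:  # naive quality check; a real router would learn this
            answer = call_model("thinking-model", prompt)
        return answer
    return call_model("thinking-model", prompt)

print(route("What is the capital of Canada?"))            # routed to the cheap model
print(route("Derive the closed form and prove each step."))  # routed to the 'thinking' model
```

The point of the sketch is only that routing is a control strategy layered on top of existing models, which is why it need not involve any new underlying architecture.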
It was back in 2017 that researchers at Google discovered that a new type of AI architecture, the transformer, was capable of capturing the tremendously complex patterns within long sequences of words that underpin the structure of human language.
By training these so-called large language models (LLMs) on large amounts of text, they could respond to prompts from a user by mapping a sequence of words to their most likely continuation in accordance with the patterns present in the dataset. This approach to mimicking human intelligence became better and better as LLMs were trained on larger and larger amounts of data – leading to systems like ChatGPT.
Ultimately, these models just encode a humongous table of stimuli and responses. A user prompt is the stimulus, and the model might just as well look it up in a table to determine the best response. Considering how simple this idea seems, it's astounding that LLMs have eclipsed the capabilities of many other AI systems – if not in terms of accuracy and reliability, certainly in terms of flexibility and usability.
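A toy example makes that 'stimulus and response' picture concrete. The snippet below builds a literal lookup table of word continuations from a tiny corpus; real LLMs learn a compressed statistical model over vast amounts of text rather than storing such a table, so this is only a caricature of the idea.

```python
# A toy illustration of mapping a sequence of words to its most likely
# continuation. Real LLMs do not store a literal table like this.
from collections import Counter

corpus = "the cat sat on the mat the cat sat on the sofa the dog sat on the mat".split()

# Count which word tends to follow each two-word context in the training text.
continuations = {}  # (word1, word2) -> Counter of observed next words
for a, b, c in zip(corpus, corpus[1:], corpus[2:]):
    continuations.setdefault((a, b), Counter())[c] += 1

def predict_next(context):
    """Return the most frequent continuation seen for this context."""
    return continuations[context].most_common(1)[0][0]

print(predict_next(("sat", "on")))   # -> 'the'
print(predict_next(("on", "the")))   # -> 'mat' (seen twice, versus 'sofa' once)
```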
The jury may still be out on whether these systems could ever be capable of true reasoning, or understanding the world in ways similar to ours, or keeping track of their experiences to refine their behaviour correctly – all arguably necessary ingredients of AGI.
In the meantime, an industry of AI software companies has sprung up that focuses on 'taming' general purpose LLMs to be more reliable and predictable for specific use cases.
Having studied how to write the most effective prompts, these companies build software that might prompt a model multiple times, or use several LLMs, adjusting the instructions until it gets the desired result. In some cases, they might 'fine-tune' an LLM with small-scale add-ons to make it more effective.
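As a rough sketch of that 'taming' pattern, the function below wraps a generic model call in a retry loop that tightens the instructions until the output passes a check (here, that it parses as JSON). The call_llm callable and the validation rule are placeholders rather than any particular vendor's API.

```python
# A hedged sketch of re-prompting until the output meets a requirement.
import json
from typing import Callable

def extract_json(call_llm: Callable[[str], str], task: str, max_attempts: int = 3) -> dict:
    """Call the model repeatedly, tightening the instructions until it returns valid JSON."""
    prompt = f"{task}\nRespond with a single JSON object."
    for _ in range(max_attempts):
        reply = call_llm(prompt)
        try:
            return json.loads(reply)  # the 'desired result' check
        except json.JSONDecodeError:
            # Adjust the instructions and try again.
            prompt = (f"{task}\nYour previous reply was not valid JSON. "
                      "Return ONLY a JSON object, nothing else.")
    raise ValueError(f"no valid JSON after {max_attempts} attempts")

# Dummy model that fails once before complying, just to show the loop in action.
replies = iter(['Sure! Here it is: {"price": 10}', '{"price": 10}'])
print(extract_json(lambda prompt: next(replies), "Quote a price for 5 widgets."))
```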
OpenAI's new router is in the same vein, except it's built into GPT-5. If this move succeeds, the engineers of companies further down the AI supply chain will be needed less and less. GPT-5 would also be cheaper for users than its LLM competitors, because it would be more useful without these embellishments.
At the same time, this may well be an admission that we have reached a point where LLMs cannot be improved much further to deliver on the promise of AGI. If so, it will vindicate those scientists and industry experts who have been arguing for a while that it won't be possible to overcome the current limitations in AI without moving beyond LLM architectures.
OpenAI's new emphasis on routing also harks back to the 'meta reasoning' that gained prominence in AI in the 1990s, based on the idea of 'reasoning about reasoning.' Imagine, for example, you were trying to calculate an optimal travel route on a complex map.
Heading off in the right direction is easy, but every time you consider another 100 alternatives for the remainder of the route, you will likely only get an improvement of 5% on your previous best option. At every point of the journey, the question is how much more thinking it's worth doing.
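A back-of-the-envelope calculation shows how that trade-off might be decided. In the sketch below, deliberation stops as soon as the expected saving from considering more alternatives falls below the time the extra thinking itself would cost; all the numbers are illustrative assumptions, not taken from any real route planner.

```python
# An illustrative "is more thinking worth it?" loop for the route example above.
# All numbers are made-up assumptions.

best = 120.0             # minutes for the best route found so far
lower_bound = 90.0       # no route can plausibly be faster than this
minutes_per_round = 1.0  # time spent evaluating another batch of alternatives

rounds = 0
while True:
    expected_saving = 0.05 * (best - lower_bound)  # roughly 5% of the remaining slack
    if expected_saving <= minutes_per_round:       # thinking now costs more than it saves
        break
    best -= expected_saving
    rounds += 1

print(f"stop deliberating after {rounds} rounds; best route is about {best:.1f} minutes")
```

Run as written, the loop stops after a handful of rounds, because each extra round of search buys less than the minute it costs, which is exactly the 'reasoning about reasoning' judgement described above.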
This kind of reasoning is important for dealing with complex tasks by breaking them down into smaller problems that can be solved with more specialized components. This was the predominant paradigm in AI until the focus shifted to general-purpose LLMs.
No more gold rush? Photo: JarTee via The Conversation
It is possible that the release of GPT-5 marks a shift in the evolution of AI which, even if it is not a return to this approach, might usher in the end of creating ever more complicated models whose thought processes are impossible for anyone to understand.
Whether that could put us on a path toward AGI is hard to say. But it might create an opportunity to move towards creating AIs we can control using rigorous engineering methods.
And it might help us remember that the original vision of AI was not only to replicate human intelligence, but also to better understand it.
Michael Rovatsos is professor of artificial intelligence, University of Edinburgh
This article is republished from The Conversation under a Creative Commons license. Read the original article.

Alibaba Group Holding's international commerce arm on Thursday released an artificial intelligence agent to help merchants source products and supplies, a development that could change the way online business is conducted. The Accio Agent unveiled by Alibaba International Digital Commerce Group (AIDC) was designed to 'revolutionise international commerce' by automating 70 per cent of the traditional time-consuming work, including product ideation, prototyping, compliance checks and supplier sourcing, the company said in a statement. The AI agent marks a step forward in the process of 'agentic purchase', which is the use of AI agents to handle everything from product discovery to fulfilment. It could fundamentally change existing models of online search, advertising and e-commerce as tech giants such as roll out their own agents. Alibaba said the agent could reduce weeks of market research and product sourcing work to just a few minutes. This would cut costs and speed up tasks for merchants, some of whom are small and medium-sized businesses run by solo entrepreneurs, enabling them to streamline their operations. Accio Agent was trained on a huge quantity of data. Photo: Handout '[Accio Agent] is designed to help you do business,' Zhang Kuo, vice-president at AIDC, said, adding that it could handle multiple tasks simultaneously and operate like a team of professionals such as sourcing specialists, developers and engineers, and market researchers.