AI experts divided over Apple's research on large reasoning model accuracy


A recent study by tech giant Apple, which claims that the accuracy of frontier large reasoning models (LRMs) declines as task complexity increases and eventually collapses altogether, has divided experts in the artificial intelligence (AI) world.
The paper, titled 'The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity', was published by Apple last week.
Apple said in the paper that its experiments across diverse puzzles show that such LRMs face a complete accuracy collapse beyond a certain level of complexity. Their reasoning effort increases with the complexity of a problem up to a point, then declines even though the models have an adequate token budget.
A token budget for a large language model (LLM) is a limit set on the number of tokens the model can use for a specific task.
The paper is co-authored by Samy Bengio, senior director of AI and ML research at Apple, who is also the brother of Yoshua Bengio, often referred to as the godfather of AI.
Meanwhile, AI company Anthropic, backed by Amazon, countered Apple's claims in a separate paper, saying that the 'findings primarily reflect experimental design limitations rather than fundamental reasoning failures.'
'Their central finding has significant implications for AI reasoning research. However, our analysis reveals that these apparent failures stem from experimental design choices rather than inherent model limitations,' it said.
Mayank Gupta, founder of Swift Anytime, who is currently building an AI product in stealth, told Business Standard that both sides make equally important points.
'What this tells me is that we're still figuring out how to measure reasoning in LRMs the right way. The models are improving rapidly, but our evaluation tools haven't caught up. We need tools that separate how well an LRM reasons from how well it generates output and that's where the real breakthrough lies,' he said.
Gary Marcus, a US academic who has become a voice of caution on the capabilities of AI models, said that in a best-case scenario these models can write Python code, supplementing their own weaknesses with outside symbolic code, but that even this is not reliable. 'What this means for business and society is that you can't simply drop o3 or Claude into some complex problem and expect it to work reliably,' he wrote in his blog, Marcus on AI.
The Apple researchers conducted experiments comparing thinking and non-thinking model pairs across controlled puzzle environments. 'The most interesting regime is the third regime where problem complexity is higher and the performance of both models have collapsed to zero. Results show that while thinking models delay this collapse, they also ultimately encounter the same fundamental limitations as their non-thinking counterparts,' they wrote.
Apple's observations in the paper may explain why the iPhone maker has been slow to embed AI across its products and operating systems, a point on which it was criticised at the Worldwide Developers Conference (WWDC) last week. Its approach contrasts with those of Microsoft-backed OpenAI, Meta, and Google, which are spending billions to build more sophisticated frontier models to solve more complex tasks.
However, others believe that Apple's paper has its limitations.
Ethan Mollick, associate professor at the Wharton School, who studies the effects of AI on work, entrepreneurship, and education, said on X that while research on the limits of reasoning models is useful, it is premature to say that LLMs are hitting a wall.
