
Apple researchers say models like ChatGPT o3 look smart but collapse when faced with real complexity
They may talk the talk, but can they truly think it through? A new study by Apple researchers suggests that even the most advanced AI models like ChatGPT o3, Claude, and DeepSeek start to unravel when the going gets tough. These so-called 'reasoning' models may impress with confident answers and detailed explanations, but when faced with genuinely complex problems, they stumble – and sometimes fall flat.

Apple researchers have found that the most advanced large language models today may not be reasoning in the way many believe. In a recently released paper titled The Illusion of Thinking, researchers at Apple show that while these models appear intelligent on the surface, their performance dramatically collapses when they are faced with truly complex problems.

The study looked at a class of models now referred to as Large Reasoning Models (LRMs), which are designed to "think" through complex tasks using a series of internal steps, often called a 'chain of thought.' This includes models like OpenAI's o3, DeepSeek-R1, and Claude 3.7 Sonnet Thinking. Apple's researchers tested how these models handle problems of increasing difficulty – not just whether they arrive at the correct answer, but how they reason their way there.
The findings were striking. As problem complexity rose, the models' performance did not degrade gracefully; it collapsed completely. 'They think more up to a point,' tweeted tech critic Josh Wolfe, referring to the findings. 'Then they give up early, even when they have plenty of compute left.'
Apple's team built custom puzzle environments such as the Tower of Hanoi, River Crossing, and Blocks World to carefully control complexity levels. These setups allowed them to observe not only whether the models found the right answer, but how they tried to get there.

They found that:
- At low complexity, traditional LLMs (without reasoning chains) performed better and were more efficient
- At medium complexity, reasoning models briefly took the lead
- At high complexity, both types failed completely

Even when given a step-by-step algorithm for solving a problem, so that they only needed to follow instructions, models still made critical mistakes. This suggests that they struggle not only with creativity or problem-solving, but with basic logical execution.

The models also showed odd behaviour when it came to how much effort they put in. Initially, they 'thought' more as the problems got harder, using more tokens for reasoning steps. But once a certain threshold was reached, they abruptly started thinking less. This happened even when they hadn't hit any computational limits, highlighting what Apple calls a 'fundamental inference time scaling limitation.'

Cognitive scientist Gary Marcus said the paper supports what he's been arguing for decades: these systems don't generalise well beyond their training data. 'Neural networks can generalise within a training distribution of data they are exposed to, but their generalisation tends to break down outside that distribution,' Marcus wrote on Substack. He also noted that the models' 'reasoning traces' – the steps they take to reach an answer – can look convincing, but often don't reflect what the models actually did to reach a conclusion.

Marcus also points out that Apple's findings echo earlier work by Arizona State University's Subbarao (Rao) Kambhampati, who has critiqued so-called reasoning models.
Kambhampati has shown that models often appear to think logically but actually produce answers that don't match their thought process. Apple's experiments back this up by showing that models generate long reasoning paths that still lead to the wrong answer, particularly as problems get harder.

Perhaps the most damning evidence came when Apple tested whether models could follow exact instructions. In one test, they were handed the algorithm to solve the Tower of Hanoi puzzle and asked to just execute it. The models still failed once the puzzle complexity passed a certain point.

Apple's conclusion is blunt: today's top models are 'super expensive pattern matchers' that can mimic reasoning only within familiar settings. The moment they're faced with novel problems – ones just outside their training data – they crumble.

These findings have serious implications for claims that AI is becoming capable of human-like reasoning. As the paper puts it, the current approach may be hitting a wall, and overcoming it could require an entirely different way of thinking about how we build intelligent systems. In short, we are still leaps away from AGI.
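
For readers unfamiliar with the Tower of Hanoi puzzle mentioned above, the sketch below shows the standard textbook recursive solution and a simple move checker. It is illustrative only: the article does not reproduce the exact algorithm or prompt given to the models, so the function names (`hanoi_moves`, `is_valid_solution`) and the peg labels are assumptions made here. It does show why the puzzle is a convenient dial for complexity: a solution for n disks needs 2^n - 1 moves, so each extra disk roughly doubles the length of the sequence that must be executed without a single error.

```python
def hanoi_moves(n, source="A", target="C", spare="B"):
    """Return the (disk, from_peg, to_peg) moves that solve n disks.

    Standard textbook recursion; not taken from the Apple paper.
    """
    if n == 0:
        return []
    return (
        hanoi_moves(n - 1, source, spare, target)    # clear the n-1 smaller disks
        + [(n, source, target)]                      # move the largest disk
        + hanoi_moves(n - 1, spare, target, source)  # restack the smaller disks
    )


def is_valid_solution(n, moves):
    """Replay moves on pegs A, B, C and check every rule of the puzzle."""
    pegs = {"A": list(range(n, 0, -1)), "B": [], "C": []}
    for disk, src, dst in moves:
        # The disk being moved must be on top of its source peg.
        if not pegs[src] or pegs[src][-1] != disk:
            return False
        # A disk may never be placed on a smaller one.
        if pegs[dst] and pegs[dst][-1] < disk:
            return False
        pegs[dst].append(pegs[src].pop())
    # Solved only if all disks end up on peg C, largest at the bottom.
    return pegs["C"] == list(range(n, 0, -1))


if __name__ == "__main__":
    for n in (3, 7, 10):
        moves = hanoi_moves(n)
        print(n, len(moves), is_valid_solution(n, moves))
```

A checker of this kind is one plausible way to score each step of a model's proposed move sequence rather than only its final answer, though the study's actual evaluation harness may differ.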