Apple researchers find 'major' flaws in AI reasoning models ahead of WWDC 2025

Time of India, 4 hours ago

A newly published Apple Machine Learning Research study has challenged the prevailing idea that large language models (LLMs) like OpenAI's o1 and the thinking variants of Anthropic's Claude truly possess "reasoning" capabilities. The study points to fundamental limitations in these AI systems. For the study, Apple researchers designed controllable puzzle environments, such as the Tower of Hanoi and River Crossing. This approach avoided standard math benchmarks, which are susceptible to data contamination. According to the researchers, these custom environments allowed for a precise analysis of both the final answers produced by the LLMs and their internal reasoning traces across different complexity levels.
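To make the idea of a controllable puzzle environment concrete, here is a minimal Python sketch for the Tower of Hanoi. It is an illustration only, not the code used in the Apple study: the function names `hanoi_moves` and `check_solution` are hypothetical. The number of disks acts as the complexity dial (the optimal solution takes 2^n - 1 moves), and replaying a proposed move list step by step shows exactly where a solution first breaks a rule.

```python
def hanoi_moves(n, source=0, spare=1, target=2):
    """Return the optimal move list for n disks as (from_peg, to_peg) pairs."""
    if n == 0:
        return []
    # Move n-1 disks out of the way, move the largest disk, then restack.
    return (hanoi_moves(n - 1, source, target, spare)
            + [(source, target)]
            + hanoi_moves(n - 1, spare, source, target))


def check_solution(n, moves):
    """Replay moves on a fresh n-disk puzzle.

    Returns (solved, first_bad_index). first_bad_index is the position of the
    first illegal move, or None if every move was legal.
    """
    # Each peg is a stack; the last element is the top disk. Disk 1 is smallest.
    pegs = [list(range(n, 0, -1)), [], []]
    for i, (src, dst) in enumerate(moves):
        # Illegal: moving from an empty peg, or onto a smaller disk.
        if not pegs[src] or (pegs[dst] and pegs[dst][-1] < pegs[src][-1]):
            return False, i
        pegs[dst].append(pegs[src].pop())
    return pegs[2] == list(range(n, 0, -1)), None


if __name__ == "__main__":
    for disks in (3, 7, 10):
        moves = hanoi_moves(disks)
        print(disks, "disks:", len(moves), "moves, valid:", check_solution(disks, moves))
```

A checker of this kind can grade a model's proposed move sequence exactly, without relying on a memorised benchmark answer.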
What Apple researchers have found out from this study
According to a report by MacRumors, the reasoning models tested by Apple's research team, including o3-mini, DeepSeek-R1, and Claude 3.7 Sonnet, saw their accuracy collapse entirely once problem complexity crossed certain thresholds.
Success rates dropped to zero even though the models had sufficient computational resources. Surprisingly, as problems became harder, the models reduced their reasoning effort. This points to fundamental scaling limitations rather than a lack of resources.
Even more revealing, the models still failed at the same complexity points even when researchers provided complete solution algorithms. This indicates that the limitation lies in basic logical step execution, not in choosing the right problem-solving strategy.
The models also showed puzzling inconsistencies. They were able to solve problems requiring over 100 moves but failed on simpler puzzles that needed only 11 moves.
The study identified three performance patterns. Standard models unexpectedly performed better than reasoning models on low-complexity problems. Reasoning models had an advantage at medium complexity. Both types failed at high complexity.
Researchers also found that the models exhibited inefficient "overthinking" patterns, often reaching correct solutions early but then wasting computational effort exploring incorrect alternatives.
The key takeaway is that current "reasoning" models rely heavily on advanced pattern matching, not true reasoning. These models do not scale their reasoning the way humans do. They tend to overthink easy problems and think less when faced with harder ones.
It is worth noting that this research surfaced just days before WWDC 2025. According to Bloomberg, Apple is expected to focus on new software designs rather than headline-grabbing AI features at this year's event.

Related Articles

Apple WWDC 2025 Highlights - Check everything that happened
Time of India, 33 minutes ago

Apple unveiled upgrades to the operating systems across its devices at WWDC 2025 on Monday, including overhauled visual elements, a fresh naming system for software updates and new features in its Apple Intelligence suite. At its annual Worldwide Developers Conference, the company also said it would open up the underlying technology it uses for Apple Intelligence to developers.
Operating Systems
This year's major iOS release would originally have been called iOS 19, following the usual sequence after iOS 18. However, Apple is changing its naming convention: future iOS versions will be numbered for the year following their release, similar to how car manufacturers name new models.
Several parts of the operating systems are getting a major visual overhaul as part of the redesign. The Phone app now includes call screening, allowing it to answer calls or wait on hold for you. The Messages app is also getting updates that include customizable chat backgrounds.
Apple also said it would add generative AI to its Xcode coding tools to help developers write code, test it and resolve errors. The company said it would add other coding models such as ChatGPT to Xcode.
Apple AI
New additions to the operating system include Live Translation, which uses on-device AI models to translate conversations in real time in text messages, phone calls or FaceTime. Apple Pay is also getting Apple Intelligence integration, enabling it to track orders even for purchases made outside Apple Pay. Meanwhile, Image Playground is getting a boost with a new feature that allows users to generate images with the help of OpenAI's ChatGPT.
Apple will now allow developers to tap into its on-device foundational model for their own apps. Through the new Foundation Models framework, developers can build intelligent, privacy-focused experiences that also work offline.
Apple Visual Intelligence
Apple will also let users learn more about what is on their iPhone screens via Visual Intelligence. Users can search across Google, Etsy and other supported apps to find visually similar images or products. If the tool detects that you are viewing an event, iOS 26 will suggest adding it to your calendar. The feature will be accessible using the same button combination used to take a screenshot on an iPhone.
Apple Liquid Glass
Apple is rolling out a new "Liquid Glass" design language across its software, bringing sleek translucence and a glass-like shine to app interfaces. Inspired by visionOS on the Vision Pro augmented reality device, the design adapts to light and dark modes and reacts dynamically to movement using real-time rendering.
The new design will be implemented in buttons, sliders, media controls and larger elements such as tab bars and sidebars, along with matching redesigned toolbars and navigation. Apple is releasing updated Application Programming Interfaces so that developers can begin adapting their apps ahead of the new design rollout later this year.
FAQs
Q1. When did Apple WWDC 2025 happen?
A1. Apple WWDC 2025 happened on June 9, 2025.
Q2. Is there any new announcement in Apple WWDC 2025?
A2. Yes, there are new announcements in Apple WWDC 2025.

Apple unveils watchOS 26: Smarter fitness, one-hand gestures, and real-time translations
Mint, 34 minutes ago

Apple announced watchOS 26, the latest version of its Apple Watch software, at WWDC 2025 on Tuesday. The update brings a new design and smarter features focused on fitness, messaging, and daily use.
watchOS 26 introduces a new design style called Liquid Glass. This adds smooth, transparent effects to parts of the screen like widgets, notifications, and the Control Centre. The changes make apps look more modern while keeping the layout easy to use. The Photos watch face also gets a new look, with numbers made of Liquid Glass to better highlight pictures.
One of the biggest new features is Workout Buddy. This tool uses Apple Intelligence to act like a virtual coach. It gives spoken feedback and motivation during workouts, using your own fitness data, such as heart rate, distance, and progress on your Activity rings. For example, it might say, 'You're 18 minutes away from closing your Exercise ring,' or, 'That was your longest run this month.' The voice is generated using AI and is based on the voices of real Fitness+ trainers.
Workout Buddy supports common workouts like running, walking, cycling, HIIT, and strength training. It works in English for now, and needs a supported iPhone nearby with Bluetooth headphones.
The Workout app now has a new layout, making it easier to start and control workouts. Four new buttons let users quickly access features like Custom Workouts and Race Route. You can also set music or podcasts to play automatically when a workout starts. Apple Music will suggest playlists based on your workout type and what you usually listen to.
watchOS 26 adds a new 'wrist flick' gesture. If you lift your wrist to check a notification but don't want to deal with it, you can flick your wrist to dismiss it. This works for calls, alarms, and timers too. It uses sensors and AI to understand the movement. The watch will also now adjust the sound of alerts based on how noisy your surroundings are, helping you stay aware without disturbing others.
Apple Watch now supports live translation in the Messages app. This means incoming texts can be translated into your chosen language, and your replies can also be translated back. This will work on newer Apple Watch models when used with a supported iPhone. Messages will also suggest smart actions: for example, offering to start a Check In if someone asks you to let them know when you reach home, or suggesting Apple Cash if you're asked to pitch in for a gift.
Smart Stack, the scrollable group of widgets, now gives better suggestions based on your habits, location, and activity. For instance, it might remind you to start a workout when you arrive at the gym.
The Notes app is now available on Apple Watch. You can view, pin, and create notes using your voice or the keyboard. The Photos face will now show more meaningful images from your library.
For users who are deaf or hard of hearing, Live Listen is now easier to use from the Watch. You can start or stop listening sessions on a paired iPhone, and see real-time captions on your wrist.
New tools in the Phone app, like Hold Assist and Call Screening, help manage calls. Hold Assist lets you know when a real person joins the line during a support call. Call Screening checks unknown callers by asking for their name and reason for calling before ringing your phone.
watchOS 26 is available now for developers and will have a public beta next month. The full release is expected later this year. It will be a free update for Apple Watch Series 6 or newer, including the second-gen SE and Ultra models. Some features, especially those using Apple Intelligence, need newer iPhones like the iPhone 15 Pro or iPhone 16 models. Not all features will be available in every region or language, and Apple says details could change before launch.

Apple plays it safe on AI despite Wall Street pressure
Mint, an hour ago

Apple on Monday remained on its cautious path to embracing generative AI even as rivals race ahead with the technology and Wall Street expresses doubts over its strategy.
The pressure was on Apple not to disappoint at its annual Worldwide Developers Conference (WWDC) a year after the iPhone juggernaut made a promise it failed to keep: to improve its Siri voice assistant with generative AI. The annual WWDC is addressed to developers who build apps and tools to run on the company's products.
Despite last year's disappointment, Apple insisted on Monday it was still very much in the AI race, announcing incremental updates to its Apple Intelligence software, including the ability for app makers to directly access a device's AI capabilities. This would allow users to engage with apps using generative AI while offline, letting them interact ChatGPT-style with a hiking app, for example, while in remote areas without a connection.
Apple CEO Tim Cook briefly mentioned that Siri's AI makeover was still under development and "needed more time to meet our high quality bar," which includes Apple's standards on privacy and data security. "We are making progress, and we look forward to getting these features into customers' hands," he added.
For Gadjo Sevilla, senior analyst for Emarketer, "the delays to Apple's in-house AI efforts will continue to draw scrutiny." "Especially since rivals like Google and Samsung are moving ahead by introducing new on-device AI capabilities, or partnering with AI startups like Perplexity (in Samsung's case) to provide users with AI features," he added.
The biggest announcement at the event was the renaming of Apple's operating systems so that releases better match their release year. The next operating system will be iOS 26 and will be available across all of Apple's devices, including the Mac, Watch and Vision Pro headset, in the fall, in time for the likely release of the iPhone 17. Today, Apple's operating systems have vastly different nomenclatures across devices, including the current iOS 18 for the iPhone or macOS 15 for Mac computers.
Apple also announced that the new operating system will be the first major iOS redesign since 2013, calling the new look "Liquid Glass."
The relationship between Apple and app-making developers has been strained in recent years, with developers chafing at the iPhone maker's high fees for getting access to the App Store. A marathon lawsuit by Fortnite maker Epic Games ended with Apple being ordered to allow outside payment systems to be used in the US App Store.
Adding to doubts about Apple's direction is the fact that the legendary designer behind the iPhone, Jony Ive, has joined with ChatGPT maker OpenAI to create a potential rival device for engaging with AI.
Apple also has to deal with tariffs imposed by US President Donald Trump in his trade war with China, a key market for sales growth and the place where most iPhones are manufactured. Trump has also threatened to hit Apple with tariffs if iPhone production is not moved to the US, a change which analysts say would be impossible given the costs and capabilities required.
Wall Street analysts remain divided on Apple's prospects, with the stock down about 17 percent since the start of the year, wiping over $600 billion from its market value and leaving it far outshone by its Big Tech rivals. While some analysts remain optimistic about Apple's long-term AI monetization potential, others worry the company's cautious approach may prove costly in the longer term.
WWDC "was void of any major Apple Intelligence progress as Cupertino is playing it safe and close to the vest after the missteps last year," said Dan Ives of Wedbush Securities. "We have a high level of confidence Apple can get this right, but they have a tight window to figure this out," he added.
