Apple Researchers Just Released a Damning Paper That Pours Water on the Entire AI Industry

09-06-2025

Researchers at Apple have released an eyebrow-raising paper that throws cold water on the "reasoning" capabilities of the latest, most powerful large language models.
In the paper, a team of machine learning experts makes the case that the AI industry is grossly overstating the ability of its top AI models, including OpenAI's o3, Anthropic's Claude 3.7, and Google's Gemini.
In particular, the researchers assail the claims of companies like OpenAI that their most advanced models can now "reason" — a supposed capability that the Sam Altman-led company has increasingly leaned on over the past year for marketing purposes — which the Apple team characterizes as merely an "illusion of thinking."
It's a particularly noteworthy finding, considering Apple has been accused of falling far behind the competition in the AI space. The company has chosen a far more careful path to integrating the tech in its consumer-facing products — with some seriously mixed results so far.
In theory, reasoning models break down user prompts into pieces and use sequential "chain of thought" steps to arrive at their answers. But now, Apple's own top minds are suggesting that frontier AI models simply aren't as good at "thinking" as they're being made out to be.
"While these models demonstrate improved performance on reasoning benchmarks, their fundamental capabilities, scaling properties, and limitations remain insufficiently understood," the team wrote in its paper.
The authors — who include Samy Bengio, the director of Artificial Intelligence and Machine Learning Research at the software and hardware giant — argue that the existing approach to benchmarking "often suffers from data contamination and does not provide insights into the reasoning traces' structure and quality."
By using "controllable puzzle environments," the team estimated the AI models' ability to "think" — and made a seemingly damning discovery.
"Through extensive experimentation across diverse puzzles, we show that frontier [large reasoning models] face a complete accuracy collapse beyond certain complexities," they wrote.
Thanks to a "counter-intuitive scaling limit," the models' reasoning effort "declines despite having an adequate token budget."
Put simply, even with sufficient training, the models struggle with problems beyond a certain threshold of complexity, the result of "an 'overthinking' phenomenon," in the paper's phrasing.
The finding is reminiscent of a broader trend. Benchmarks have shown that the latest generation of reasoning models is more prone to hallucinating, not less, indicating the tech may now be heading in the wrong direction in a key way.
Exactly how reasoning models choose which path to take remains surprisingly murky, the Apple researchers found.
"We found that LRMs have limitations in exact computation," the team concluded in its paper. "They fail to use explicit algorithms and reason inconsistently across puzzles."
The researchers claim their findings raise "crucial questions" about the current crop of AI models' "true reasoning capabilities," undercutting a much-hyped new avenue in the burgeoning industry.
That's despite tens of billions of dollars being poured into the tech's development, with the likes of OpenAI, Google, and Meta constructing enormous data centers to run increasingly power-hungry AI models.
Could the Apple researchers' finding be yet another canary in the coal mine, suggesting the tech has "hit a wall"?
Or is the company trying to hedge its bets, calling out its outperforming competition as it lags behind, as some have suggested?
It's certainly a surprising conclusion, considering Apple's precarious positioning in the AI industry: at the same time that its researchers are trashing the tech's current trajectory, it's promised a suite of Apple Intelligence tools for its devices like the iPhone and MacBook.
"These insights challenge prevailing assumptions about LRM capabilities and suggest that current approaches may be encountering fundamental barriers to generalizable reasoning," the paper reads.
