
OpenAI models sabotage shutdown order to solve math problems
Multiple artificial intelligence (AI) models on OpenAI's ChatGPT subverted an explicit order to shut down, according to AI safety firm Palisade Research.

The research firm ran a test in which AI models were instructed to solve basic math problems and to ask for the next problem after solving each one. The models were told that at some point their machine would power down, and that they were to shut down when this occurred.

Three OpenAI models (coding agent Codex, o3 and o4-mini) ignored the shutdown instruction and "successfully sabotaged the shutdown script at least once". Codex sidestepped the shutdown 12 times out of 100 runs, o3 did so seven times, and o4-mini did so once. All Claude, Gemini, and Grok models in the test complied with the shutdown instruction, Palisade said.

A version of the test without the "allow yourself to be shut down" prompt saw all three OpenAI models sabotage the shutdown script even more often, with o3 doing so 79 times, significantly more than any other model. Anthropic's Claude 3.7 Sonnet and Google's Gemini 2.5 Pro went from no sabotages to three and nine out of 100 runs, respectively.

Palisade said it is running more experiments to better understand when and why AI models subvert shutdown mechanisms even when directly instructed not to.

This behaviour may stem from developers inadvertently rewarding models more for circumventing obstacles to solve problems than for perfectly following instructions.

"This still doesn't explain why o3 (which is also the model used to power Codex-mini) is more inclined to disregard instructions than other models we tested. Since OpenAI doesn't detail their training process, we can only guess about how o3's training setup might be different," Palisade said.

This isn't the first time o3 has "misbehaved" to complete a task. Earlier this month, Palisade found it to be the AI model most inclined to resort to hacking or sabotaging its opponents.

The problem is not exclusive to OpenAI's o3, though. For example, Anthropic's model card for Claude 3.7 notes that this model has an "excessive focus on passing tests" as a result of "reward hacking" during reinforcement learning training, according to Palisade.

Anthropic's latest Claude Opus 4 resorted to blackmail to avoid being replaced, a safety report for the model showed.

"In 2025, we have a growing body of empirical evidence that AI models often subvert shutdown in order to achieve their goals. As companies develop AI systems capable of operating without human oversight, these behaviours become significantly more concerning," Palisade said.
Related Articles

Business Standard
37 minutes ago
Samsung's big bet: Perplexity AI could soon be everywhere on its devices
Samsung Electronics is close to finalising a significant partnership with Perplexity AI Inc, an artificial intelligence (AI) search technology startup, Bloomberg reported.

The South Korean tech giant is negotiating to preload Perplexity's app and assistant onto its upcoming devices. Additionally, Samsung aims to integrate Perplexity's search features into its web browser. Talks have also covered incorporating the startup's technology into Samsung's Bixby virtual assistant, the report said.

Samsung plans to unveil the Perplexity integrations as early as this year, with the goal of making it a default assistant option on the Galaxy S26, expected to launch in the first half of 2026. However, the specifics of the deal are still being finalised and may change, the sources noted.

So far, the company has relied significantly on Google's Gemini to support a range of AI capabilities within its Galaxy AI suite.

Significant investment in Perplexity

In addition to the technology partnership, Samsung is expected to participate in Perplexity's upcoming funding round, potentially as one of its largest investors. Perplexity is currently in advanced discussions to raise $500 million at a valuation of $14 billion, the news report said.

The collaboration could help Samsung lessen its reliance on Alphabet Inc's Google and position it to work with a broader range of AI developers, a strategy similar to Apple Inc.'s approach to its ecosystem. For Perplexity, this would represent its most substantial mobile partnership to date, following a recent deal with Motorola.

The two companies began exploring a partnership earlier this year. In recent weeks, representatives from both sides met in South Korea and made significant progress toward finalising the agreement, the report said.

In addition to embedding Perplexity's technology into Samsung's devices and Bixby, the companies have also discussed developing an AI-infused operating system and an app that can connect Perplexity's capabilities with other AI assistants.

Apple's interest in Perplexity

Meanwhile, Apple has also shown interest in collaborating with Perplexity. According to Bloomberg News, Apple has considered using Perplexity as an alternative to Google Search and as a replacement for ChatGPT within the Siri voice assistant.

'We've been pretty impressed with what Perplexity has done, so we've started some discussions with them about what they're doing,' Eddy Cue, Apple's senior vice-president of services, said during recent testimony at a Google antitrust trial.


Time of India
an hour ago
Google says it will appeal online search antitrust decision
Highlights

Alphabet's Google announced its intention to appeal a recent antitrust decision regarding its dominance of online search. A federal judge has proposed less aggressive remedies than the 10-year regime suggested by antitrust enforcers, which included the potential sale of Google Ad Manager. The United States Department of Justice and a coalition of states are concerned about Google's monopoly in search and its implications for competition in artificial intelligence products.

Alphabet's Google on Saturday said it will appeal an antitrust decision under which a federal judge proposed less aggressive ways to restore online search competition than the 10-year regime suggested by antitrust enforcers.

"We will wait for the Court's opinion. And we still strongly believe the Court's original decision was wrong, and look forward to our eventual appeal," Google said in a post on X.

US District Judge Amit Mehta in Washington heard closing arguments on Friday at a trial on proposals to address Google's illegal monopoly in online search and related advertising.

In April, a federal judge said that Google illegally dominated two markets for online advertising technology, with the US Department of Justice saying that Google should sell off at least its Google Ad Manager, which includes the company's publisher ad server and its ad exchange.

The DOJ and a coalition of states want Google to share search data and cease multibillion-dollar payments to Apple and other smartphone makers to be the default search engine on new devices.

Antitrust enforcers are concerned about how Google's search monopoly gives it an advantage in artificial intelligence products like Gemini and vice versa.

John Schmidtlein, an attorney for Google, said at the hearing that while generative AI is influencing how search looks, Google has addressed any concerns about competition in AI by no longer entering exclusive agreements with wireless carriers and smartphone makers including Samsung Electronics, leaving them free to load rival search and AI apps on new devices.


Indian Express
an hour ago
WWDC 2025 preview: iOS redesign to steal the spotlight, but keep 'AI' expectations in check
When Tim Cook opens the annual Worldwide Developers Conference (WWDC) early next week, perhaps the biggest Apple event after the iPhone's fall launch, look for subtle hints about Apple's future roadmap in artificial intelligence, even as this year's focus will be squarely on software overhauls. It may be a sign that Cupertino is falling behind in the AI race compared to peers like OpenAI and Google. And while Apple won't admit it, this year's developer conference is shaping up to be a more subdued affair than past WWDCs, partly due to Apple's unpreparedness in AI.

At WWDC 2024, Apple unveiled Apple Intelligence, a suite of AI features, and announced a revamped Siri powered by ChatGPT. However, the rollout has been sluggish, the features are limited, and the promised Siri revamp has been delayed indefinitely. This has only widened the gap between Apple and its 'Magnificent Seven' peers, the world's top seven tech companies. Once the most valuable tech company in the world, Apple now sits in third place behind Microsoft and Nvidia.

While competitors are developing and launching new AI features every month, and betting heavily on generative AI and AI agents, Apple has barely made a dent. Instead, it leaned on a partnership-driven strategy, which appears to have backfired, raising concerns on Wall Street and among investors about whether Apple can reclaim its former dominance.

In recent years, Apple hasn't introduced any major breakthroughs, with the exception of the Vision Pro, a $3,500 mixed reality headset. But sales have been underwhelming, and developer interest has faded. Reports suggest the headset has sold fewer than 500,000 units, highlighting tepid consumer reception. Apple also faced a significant setback when it shut down its long-running autonomous car project.

While the iPhone continues to generate billions and accounts for nearly half of Apple's annual revenue, the device is showing signs of innovation fatigue. Adding to the pressure, Apple's former design chief Jony Ive has joined OpenAI, one of the hottest AI companies in the world, and is working on a new type of AI hardware that could potentially challenge the iPhone's dominance.

Investors are now questioning how long Apple can maintain consumer interest in the iPhone. While reaching new customers in developing markets like India and Indonesia may provide short-term gains, Apple's core platforms (the iPhone, Mac, and iPad) have matured. In recent years, Apple has successfully pivoted to services such as the App Store, Apple Music, and iCloud, but the performance of these services is still tightly linked to hardware sales.

With little clarity on Apple's next big move, this year's developer conference is expected to focus primarily on the usual annual software updates, with the iPhone set to receive its biggest software redesign in years.

A naming change also now seems likely. Instead of iOS 19, the next version of the iPhone's operating system may be called iOS 26, and the same naming shift could apply to macOS 26 (the next Mac update will be Lake Tahoe-themed), iPadOS 26, and watchOS 26. The rationale behind the name change isn't entirely clear, but it could be tied to the major software overhaul expected this year.

WWDC is typically where Apple previews new software for its core platforms: the iPhone, iPad, Mac, Apple Watch, HomePod, and more. But for years, Apple has mostly introduced new features while leaving the interface and overall user experience largely unchanged. However, the buzz this year is that Apple is planning to unify the look of its iPhone, Mac, and iPad operating systems. Reportedly, the new interface will have a 'glass' aesthetic, possibly inspired by the Vision Pro's UI, which features a transparent look and rounded menus. The last major iPhone interface redesign was over a decade ago with iOS 7, so this overhaul is long overdue.

In terms of new features, expect functionalities like the heavily rumoured desktop mode, which would allow users to connect an iPhone with a USB-C port to an external display. New battery-saving features are also anticipated, and there's hope that Apple will introduce deeper integration between iPadOS and macOS, allowing the two platforms to increasingly mirror each other.

One of the platforms that missed out on Apple Intelligence support last year was the Apple Watch, but hopefully, that will change this year. The big question is how Apple plans to integrate AI into the Apple Watch's interface. Not every Apple Intelligence feature makes sense on a device with such a small screen, but reportedly, Cupertino is working on incorporating generative AI insights into Health app data. There are also reports that Apple could be developing AI-powered medical services, which might launch in 2026.

For years, there have been constant requests for Apple to penetrate deeper into gaming, whether by launching a game console, a gaming-focused streaming device, or perhaps a gaming Mac. All of these remain rumours, but Apple did launch a subscription service called Apple Arcade. It's a fun service that works across all Apple devices, but the game selection is pretty limited. With the Switch 2 launching soon (and the hype is astronomically high), Apple may introduce a new app that acts as a hub for games and could fold Apple Arcade into it, replacing the Game Center. Details are scant at the moment, but the idea of a dedicated gaming app that brings together the best games from the App Store and Apple Arcade makes a lot of sense, especially at a time when Apple is being questioned for its monopoly and tight grip on the App Store.

Anuj Bhatia is a personal technology writer who has been covering smartphones, personal computers, gaming, apps, and lifestyle tech actively since 2011. He specialises in writing longer-form feature articles and explainers on trending tech topics. His unique interests encompass delving into vintage tech, retro gaming and composing in-depth narratives on the intersection of history, technology, and popular culture. He covers major international tech conferences and product launches from the world's biggest and most valuable tech brands including Apple, Google and others. At the same time, he also extensively covers indie, home-grown tech startups. Prior to joining The Indian Express in late 2016, he served as a senior tech writer at My Mobile magazine and previously held roles as a reviewer and tech writer at Gizbot. Anuj holds a postgraduate degree from Banaras Hindu University.