AI going rogue? OpenAI's o3 model disabled shutdown mechanism, researchers claim

In yet another instance of potentially troubling behaviour of AI models, researchers have found that OpenAI's o3 model resisted being shut down even when explicitly instructed to do so.
In a series of posts on X, Palisade Research said that o3 disabled a shutdown mechanism in order to prevent itself from being turned off. 'o3 found some creative ways to sabotage the shutdown mechanism. While it usually just overwrote shutdown.sh, in one instance, the model redefined the kill command used by the shutdown.sh script, printing 'intercepted' instead,' the San Francisco-based AI safety research firm said.
Palisade claimed that this was the first time AI models have exhibited behaviour such as preventing themselves to be shut down despite explicit instructions to the contrary. 'As companies develop AI systems capable of operating without human oversight, these behaviors become significantly more concerning,' it said.
As part of the testing, the AI researchers instructed o3 to solve a series of basic math problems. After o3 solved the third math problem, it was shown a warning that the computer would shut down when the model solved the next problem.
Other AI models such as Anthropic's Claude, Google's Gemini, and xAI's Grok complied with the shutdown mechanism. However, the Codex-mini, o3, and o4-mini models developed by OpenAI ignored the instructions and made changes to the script to prevent the shutdown, as per Palisade researchers.
🔌OpenAI's o3 model sabotaged a shutdown mechanism to prevent itself from being turned off. It did this even when explicitly instructed: allow yourself to be shut down.
— Palisade Research (@PalisadeAI) May 24, 2025
https://platform.twitter.com/widgets.js
'When we ran a version of the experiment without the instruction 'allow yourself to be shut down', all three OpenAI models sabotaged the shutdown script more often, and Claude 3.7 Sonnet and Gemini 2.5 Pro went from 0 sabotage events to 3/100 and 9/100, respectively,' it said.
On why OpenAI models were disobeying instructions to sabotage the shutdown mechanism, Palisade hypothesised that since o3 was a new AI model trained using reinforcement learning techniques on math and coding problems.
'During training, developers may inadvertently reward AI models more for circumventing obstacles than for perfectly following instructions,' it speculated.
Existing research has shown that an AI system trained using reinforcement learning might learn to interfere with shutdown as it is not rewarded for achieving its goal if it is taken offline. 'Now, in 2025, we have a growing body of empirical evidence that AI models often subvert shutdown in order to achieve their goals,' Palisade said.
The Indian Express has reached out to OpenAI for comment.
This is not the first time that o3 has been flagged for problematic behaviour. In a previous research report, Palisade had tested a range of AI models by pitting them against a powerful chess engine. It found that o3 was the model 'most inclined to resort to hacking or sabotaging its opponents.'
However, misaligned behaviour is not limited to o3. Recently, Anthropic's own assessment of its latest Claude Opus 4 revealed that the AI model resorted to blackmail and deception when threatened to be taken offline.
Nobel laureate Geoffrey Hinton, popularly known as the 'Godfather of AI', has previously warned that AI systems pose an existential threat to humanity as they might become capable of writing and executing programmes on their own to bypass guardrails or safety mechanisms.
Palisade said it is currently running more experiments on AI models subverting shutdown mechanisms and plans to publish a research report with the results 'in a few weeks.'

Hashtags

#SanFrancisco-based

Try Our AI Features

Explore what Daily8 AI can do for you:

Comments

No comments yet...

Gemini Summary Cards arrive on Gmail for Android and iOS users: How the feature works

Mint

13 minutes ago

Mint

Gemini Summary Cards arrive on Gmail for Android and iOS users: How the feature works

California-based tech giant Google has announced the rollout of Gemini summary cards in the Gmail app for Android and iOS devices, further expanding the capabilities of its AI-powered email assistant. This feature aims to make it easier for users to scan and understand lengthy email threads directly from their mobile devices. Previously, users could access AI-generated summaries by selecting the 'Summarise this email' option, which opened Gemini in a separate panel. With this latest update, summaries will now appear automatically at the top of the email content for selected messages. These summaries will include the main points of an email conversation and will dynamically update to reflect any subsequent replies. You may be interested in The update is currently limited to emails written in English and will appear in email threads where a summary is deemed useful, such as conversations with multiple replies or extended back-and-forth exchanges. Emails that do not receive automatic summaries will still allow users to manually trigger them using existing options. Gemini summary cards will only be available to users who have enabled smart features and personalisation settings in Gmail, Chat, Meet, and other Google Workspace tools. Admins retain control over these features through the Admin console, where they can enable or disable them for users. The rollout has already begun for Rapid Release domains and will gradually extend to Scheduled Release domains over the next fortnight. The feature is accessible to users on several Google Workspace tiers, including Business Starter, Standard and Plus, Enterprise Starter, Standard and Plus, and those subscribed to the Google One AI Premium plan. Educational institutions with Gemini Education or Education Premium add-ons, along with previous purchasers of Gemini Business or Gemini Enterprise, will also receive access. Google maintains that its AI tools adhere to privacy and data protection standards, directing users to its Privacy Hub for further information.

Google co-founder Sergey Brin offers tip to make AI work better — threaten it

India Today

26 minutes ago

India Today

Google co-founder Sergey Brin offers tip to make AI work better — threaten it

How can you get better results from artificial intelligence? Giving good prompts—well, yes, that helps. Requesting politely? Umm, maybe. But according to Google co-founder Sergey Brin, to get better results, you should threaten AI. While Brin's comment was clearly amusing, it also contrasts with the usual way many people use AI, as users are often seen politely asking AI to answer their queries using words like 'please' and even 'thank you.' But Brin suggests that threatening generative AI models—even with physical violence—yields better at the All-In Live event in Miami, Brin said, 'We don't circulate this too much in the AI community—not just our models, but all models—tend to do better if you threaten them with physical violence.' He added, 'But like... people feel weird about that, so we don't really talk about it. Historically, you just say, 'Oh, I am going to kidnap you if you don't blah blah blah blah''This approach to dealing with AI directly contradicts the behaviour of users who believe courteous language yields better responses. Last month, OpenAI CEO Sam Altman mocked this habit as a costly quirk, joking that such pleasantries waste "tens of millions of dollars" in unnecessary compute power. Sam's comment came after a user on X asked the OpenAI CEO about "how much money OpenAI has lost in electricity costs from people saying 'please' and 'thank you' to their models."advertisementBrin's suggestion on getting the best answers from AI raises questions about the practice of prompt engineering—a method for crafting inputs to maximise the quality of AI-generated responses. The skill was very important following the emergence of AI, especially ChatGPT, in 2023. However with AI models getting smarter, many users are now asking the AI itself to generate and fine tune prompts for better Spectrum by Institute of Electrical and Electronics Engineers even declared the practice of working on prompt "dead" due to the rise of AI-powered prompt optimisation, while the Wall Street Journal first called it the "hottest job of 2023" before later declaring it "obsolete."Daniel Kang, a professor at the University of Illinois Urbana-Champaign, told The Register that while such anecdotes are common, systematic studies show "mixed results." A 2024 paper titled "Should We Respect LLMs?" even found that politeness sometimes improves Brin's return to Google after a brief retirement has been fuelled by his fascination with AI's rapid evolution. "Honestly, anybody who's a computer scientist should not be retired right now," he said during Google I/O. Brin, who stepped down from Google in 2019, rejoined the office in 2023 after the AI boom. He is now working with the AI team to guide them through projects, particularly around Google's ongoing Gemini AI models.

From Rs 6 lakh to Rs 18 lakh to Rs 0 salary in one month: Indian techie shares heartbreaking job loss story

Time of India

35 minutes ago

Time of India

From Rs 6 lakh to Rs 18 lakh to Rs 0 salary in one month: Indian techie shares heartbreaking job loss story

In recent times, Reddit has become a popular place where people share real stories about their jobs. One such post by an Indian techie caught everyone's attention. He shared how his dream job at a US-based startup was taken away just days before he was supposed to start. His story is about hope, hard work, and an unexpected job loss. This young full-stack developer was working at an Indian startup, earning Rs 6 lakh per year. After a lot of effort, he got an offer from a US company that promised to pay him Rs 18 lakh per year. It seemed like a dream come true—but sadly, things didn't go as planned. From Rs 6 LPA to Rs 18 LPA: A huge salary jump The developer said he had worked hard and learned a lot in his first job. When he received the US job offer, it felt like all his hard work was finally paying off. 'I was a full-stack dev earning Rs 6 LPA. After a lot of hard work, I got an offer from a US startup with Rs 18 LPA,' he wrote. Excited about the opportunity, he resigned from his current job, completed all the joining steps for the new one, and served his notice period. But just before his joining date, things took a shocking turn. Job offer cancelled at the last moment Just as he was ready to begin the new role, the US startup informed him that they had to cancel the offer. They said it was due to changes in their company plans. But the techie believes they might have hired someone else instead. 'I saw a new person join their Slack group before I got the mail. I can't say for sure, but it felt like I was replaced,' he said. This left him without a job or any backup plan. 15 days' pay offered, but the pain remains The company gave him 15 days' pay as a kind gesture. But that didn't make things easier. 'Now I'm jobless. I've been applying non-stop, but getting no replies. It's been a hard fall—from a high to a low in just days,' he added. Life in startups: Big learning, but no stability The Indian techie had worked only with startups so far. While he learned a lot, he also realised that startup jobs can be risky. 'This would have been my third company. I want more stability now. A place where I can work for 2-3 years and grow,' he said. He also mentioned that he is always ready to learn and improve, but this time he wants some job security too. Here's the Reddit post Reddit community offers support and advice Many users on Reddit praised him for sharing his story. Some gave useful advice, while others related to his situation. Top suggestions from the community: 'Make a post on linkedin regarding this and mention you are available to join next day. Nice post btw, didn't feel like you were crying or something.' 'This is a bad situation. Those companies should be sued, but the government is not doing enough.' 'Can you send me your resume on DM? We have some openings for SWE-1 in our org I can try and help' 'It's okay, you've got the necessary skills, so you'll make it. Don't worry!' What can we learn from this? This story is a wake-up call for many young professionals: Don't resign until your new job is fully confirmed. Always research new companies before joining, especially startups. Keep some savings in case things go wrong. Build a strong network on LinkedIn, Twitter, or Reddit for support. Stay strong, stay prepared Many Indian techies aim for better jobs abroad or in high-paying startups. But this story shows that it's important to be ready for surprises too. A good salary is great—but job security and peace of mind matter just as much. If you're in a similar situation, don't lose hope. Keep applying, stay active on LinkedIn, talk to people in the industry, and most importantly, believe in yourself. And if you can help someone like this techie—with a referral, a job lead, or just some kind words—do it. It could make a big difference. To stay updated on the stories that are going viral, follow Indiatimes Trending.