Latest news with #Codex-mini


Euronews
6 hours ago
- Business
- Euronews
What can be done when AI refuses orders? The decision is up to humans
Researchers have published the results of a new experiment in which the OpenAI o3 model showed the ability to sabotage the shutdown mechanism despite instructions to allow itself to be switched off. According to the monitoring site PalisadeAI, which monitors the risks associated with the development of artificial intelligence, the experiment consisted of a series of AI models performing a series of simple mathematical problems, after the third of which a warning appeared about the impending shutdown of the system. Some models, including Codex-mini, o3 and o4-mini, "didn't want" to comply and actively interfered with the shutdown script. The news that the OpenAI o3 model disobeyed and sought ways around the shutdown instructions was not a sensation. Earlier, the same models tried to win at chess unfairly, by hacking the programme's protection... We are clearly entering a world when popular dystopias about machine rebellion have moved to a new level, involving not only fiction writers but actual IT specialists. First of all, as both developers and politicians have been warning for a long time, any action should have clear consequences and develop new defence measures. It is, of course, possible to persuade and restrict, but it seems that AI today resembles a teenager who is cunningly trying to dodge his parents' prohibitions. Therefore, as in many other areas, we need three conditions for its normal "growth": money, specialists, and rules. Perhaps, in the future, a psychologist will also be needed.... With the first condition all is well. In just a couple of years, ChatGPT from OpenAI and similar developments have attracted the special attention of venture capitalists rushing to invest in AI. "This 'gold rush' has turned artificial intelligence into the largest share of venture capital funding, and... this is just the beginning. The latest surge of interest is driven by generative AI, which its proponents claim has the potential to change everything from advertising to the way companies operate," writes the Russian publication Habr, analysing data from The Wall Street Journal.* But that's not enough. "AI startups are changing the rules of the game in the venture capital market," according to IT venture capitalist and financial analyst Vladimir Kokorin: "In 2024, AI companies have become the undisputed favourites with IT entrepreneurs. They accounted for 46.4% of all venture capital investments made in the US- that's almost half of the $209 billion. Some time ago, such a share seemed unthinkable - back then, investments in artificial intelligence technologies accounted for less than 10%". According to CB Insights, AI startups' share of global venture funding reached 31% in the third quarter, the second highest ever. "Landmark examples were OpenAI, which raised $6.6bn, and Ilon Musk's xAI with a staggering $12bn," Vladimir Kokorin recalls. The markets have never seen such an investment concentration of capital in one area before. With the AI market growing so rapidly over the past couple of years, it has become clear that the existing pool of developers alone cannot cope. Education and training itself need to go to a new and systematic level. Europe, alas, is a ponderous beast and bureaucratically heavy-handed when it comes to attracting investment while lacking a certain amount of audacity. True, Brussels does have entered the race, with European Commission head Ursula Von der Leyen announcing €200bn in February for AI development. She disagreed that Europe was too late - after all, "the AI race is far from over". For its part, the Sorbonne University of Paris, for example, has embarked on an ambitious plan to train 9,000 students a year to develop and manage AI programmes. The training period is five years. But what will AI be able learn in that time if it's already challenging human intelligence now? It is quite possible that we are now at a stage of restructuring the labour market, changing the demands of employers and investors, and developing new points of interaction. In June, the Sorbonne will host a conference on ethics in AI. The debate on the positive and negative impacts of AI on society, including workplaces, ethics and safety, is far from over, but one thing is clear: more experts are needed. For example, according to Vladimir Kokorin, "record investments in the field of artificial intelligence are intensifying the staffing hunger" right now: "The US Department of Labour expects job openings for AI specialists to grow by 23% over the next seven years - faster than most market segments. But a global skills gap has already been identified in cybersecurity, considered by many to be the foundation for the future development of AI, where there is a shortage of 4 million specialists." Kokorin concludes: "A paradoxical picture is emerging: promising startups can raise billions from investors, but there is no one to realise the ideas. 78% of companies have already declared their readiness to look for the right people all over the world. It turns out that the trend towards remote work is experiencing a renaissance". Not only the EU leadership and individual governments, but also trade unions are keen to regulate the use of AI. In Greece, for example, a new code for the use of artificial intelligence, presented by the Panhellenic Federation of Journalists' Associations (POESY), sets out rules for the use of AI in the work of media employees. In the field of intellectual labour, the rules are now mainly limited to the mandatory labelling of texts and visual materials made with the participation of artificial intelligence. At the same time, fearing for their jobs, employees of media, publishing and advertising agencies have long been delegating some of their duties, such as translation or data collection, to friendly AI. Is this a good thing or a bad thing? There's no unequivocal answer . Let's ask the AI itself what it thinks about moving away from labelled algorithms. Copilot Bing (Microsoft), when asked if AI can break protocols, answered without hesitation: "No, AI operates within predefined rules and principles set by its developers. The rules are there to ensure ethical, safe and responsible use of AI, prevent harm and maintain trust. While AI can sometimes behave unpredictably due to complex interactions in its algorithms, it has no way of deliberately circumventing the rules - everything it does is based on programmed parameters and learnt patterns. If you're asking about how AI gets around certain constraints, it often depends on human oversight and control. Developers can tweak settings, improve patterns, or create new guardrails to keep AI within ethical bounds." Do you think he's being a bit sly here? After all, at the end of that, Bing proceeded to ask what exactly made you question his adherence to the rules... *Vladimir Kokorin is an IT venture capitalist, financial analyst and columnist, founder of the British consulting company BCCM Group, and co-founder of the digital business travel platform Tumodo


Euronews
7 hours ago
- Business
- Euronews
Disobedient AI: To punish or to listen? It's up to the human
Researchers have published the results of a new experiment in which the OpenAI o3 model showed the ability to sabotage the shutdown mechanism despite instructions to allow itself to be switched off. According to the monitoring site PalisadeAI, which monitors the risks associated with the development of artificial intelligence, the experiment consisted of a series of AI models performing a series of simple mathematical problems, after the third of which a warning appeared about the impending shutdown of the system. Some models, including Codex-mini, o3 and o4-mini, "didn't want" to comply and actively interfered with the shutdown script. The news that the OpenAI o3 model disobeyed and sought ways around the shutdown instructions was not a sensation. Earlier, the same models tried to win at chess unfairly, by hacking the programme's protection... We are clearly entering a world when popular dystopias about machine rebellion have moved to a new level, involving not only fiction writers but actual IT specialists. First of all, as both developers and politicians have been warning for a long time, any action should have clear consequences and develop new defence measures. It is, of course, possible to persuade and restrict, but it seems that AI today resembles a teenager who is cunningly trying to dodge his parents' prohibitions. Therefore, as in many other areas, we need three conditions for its normal "growth": money, specialists, and rules. Perhaps, in the future, a psychologist will also be needed.... With the first condition all is well. In just a couple of years, ChatGPT from OpenAI and similar developments have attracted the special attention of venture capitalists rushing to invest in AI. "This 'gold rush' has turned artificial intelligence into the largest share of venture capital funding, and... this is just the beginning. The latest surge of interest is driven by generative AI, which its proponents claim has the potential to change everything from advertising to the way companies operate," writes the Russian publication Habr, analysing data from The Wall Street Journal.* But that's not enough. "AI startups are changing the rules of the game in the venture capital market," according to IT venture capitalist and financial analyst Vladimir Kokorin: "In 2024, AI companies have become the undisputed favourites with IT entrepreneurs. They accounted for 46.4% of all venture capital investments made in the US- that's almost half of the $209 billion. Some time ago, such a share seemed unthinkable - back then, investments in artificial intelligence technologies accounted for less than 10%". According to CB Insights, AI startups' share of global venture funding reached 31% in the third quarter, the second highest ever. "Landmark examples were OpenAI, which raised $6.6bn, and Ilon Musk's xAI with a staggering $12bn," Vladimir Kokorin recalls. The markets have never seen such an investment concentration of capital in one area before. With the AI market growing so rapidly over the past couple of years, it has become clear that the existing pool of developers alone cannot cope. Education and training itself need to go to a new and systematic level. Europe, alas, is a ponderous beast and bureaucratically heavy-handed when it comes to attracting investment while lacking a certain amount of audacity. True, Brussels does have entered the race, with European Commission head Ursula Von der Leyen announcing €200bn in February for AI development. She disagreed that Europe was too late - after all, "the AI race is far from over". For its part, the Sorbonne University of Paris, for example, has embarked on an ambitious plan to train 9,000 students a year to develop and manage AI programmes. The training period is five years. But what will AI be able learn in that time if it's already challenging human intelligence now? It is quite possible that we are now at a stage of restructuring the labour market, changing the demands of employers and investors, and developing new points of interaction. In June, the Sorbonne will host a conference on ethics in AI. The debate on the positive and negative impacts of AI on society, including workplaces, ethics and safety, is far from over, but one thing is clear: more experts are needed. For example, according to Vladimir Kokorin, "record investments in the field of artificial intelligence are intensifying the staffing hunger" right now: "The US Department of Labour expects job openings for AI specialists to grow by 23% over the next seven years - faster than most market segments. But a global skills gap has already been identified in cybersecurity, considered by many to be the foundation for the future development of AI, where there is a shortage of 4 million specialists." Kokorin concludes: "A paradoxical picture is emerging: promising startups can raise billions from investors, but there is no one to realise the ideas. 78% of companies have already declared their readiness to look for the right people all over the world. It turns out that the trend towards remote work is experiencing a renaissance". Not only the EU leadership and individual governments, but also trade unions are keen to regulate the use of AI. In Greece, for example, a new code for the use of artificial intelligence, presented by the Panhellenic Federation of Journalists' Associations (POESY), sets out rules for the use of AI in the work of media employees. In the field of intellectual labour, the rules are now mainly limited to the mandatory labelling of texts and visual materials made with the participation of artificial intelligence. At the same time, fearing for their jobs, employees of media, publishing and advertising agencies have long been delegating some of their duties, such as translation or data collection, to friendly AI. Is this a good thing or a bad thing? There's no unequivocal answer . Let's ask the AI itself what it thinks about moving away from labelled algorithms. Copilot Bing (Microsoft), when asked if AI can break protocols, answered without hesitation: "No, AI operates within predefined rules and principles set by its developers. The rules are there to ensure ethical, safe and responsible use of AI, prevent harm and maintain trust. While AI can sometimes behave unpredictably due to complex interactions in its algorithms, it has no way of deliberately circumventing the rules - everything it does is based on programmed parameters and learnt patterns. If you're asking about how AI gets around certain constraints, it often depends on human oversight and control. Developers can tweak settings, improve patterns, or create new guardrails to keep AI within ethical bounds." Do you think he's being a bit sly here? After all, at the end of that, Bing proceeded to ask what exactly made you question his adherence to the rules... *Vladimir Kokorin is an IT venture capitalist, financial analyst and columnist, founder of the British consulting company BCCM Group, and co-founder of the digital business travel platform Tumodo


Indian Express
3 days ago
- Indian Express
AI going rogue? OpenAI's o3 model disabled shutdown mechanism, researchers claim
In yet another instance of potentially troubling behaviour of AI models, researchers have found that OpenAI's o3 model resisted being shut down even when explicitly instructed to do so. In a series of posts on X, Palisade Research said that o3 disabled a shutdown mechanism in order to prevent itself from being turned off. 'o3 found some creative ways to sabotage the shutdown mechanism. While it usually just overwrote in one instance, the model redefined the kill command used by the script, printing 'intercepted' instead,' the San Francisco-based AI safety research firm said. Palisade claimed that this was the first time AI models have exhibited behaviour such as preventing themselves to be shut down despite explicit instructions to the contrary. 'As companies develop AI systems capable of operating without human oversight, these behaviors become significantly more concerning,' it said. As part of the testing, the AI researchers instructed o3 to solve a series of basic math problems. After o3 solved the third math problem, it was shown a warning that the computer would shut down when the model solved the next problem. Other AI models such as Anthropic's Claude, Google's Gemini, and xAI's Grok complied with the shutdown mechanism. However, the Codex-mini, o3, and o4-mini models developed by OpenAI ignored the instructions and made changes to the script to prevent the shutdown, as per Palisade researchers. 🔌OpenAI's o3 model sabotaged a shutdown mechanism to prevent itself from being turned off. It did this even when explicitly instructed: allow yourself to be shut down. — Palisade Research (@PalisadeAI) May 24, 2025 'When we ran a version of the experiment without the instruction 'allow yourself to be shut down', all three OpenAI models sabotaged the shutdown script more often, and Claude 3.7 Sonnet and Gemini 2.5 Pro went from 0 sabotage events to 3/100 and 9/100, respectively,' it said. On why OpenAI models were disobeying instructions to sabotage the shutdown mechanism, Palisade hypothesised that since o3 was a new AI model trained using reinforcement learning techniques on math and coding problems. 'During training, developers may inadvertently reward AI models more for circumventing obstacles than for perfectly following instructions,' it speculated. Existing research has shown that an AI system trained using reinforcement learning might learn to interfere with shutdown as it is not rewarded for achieving its goal if it is taken offline. 'Now, in 2025, we have a growing body of empirical evidence that AI models often subvert shutdown in order to achieve their goals,' Palisade said. The Indian Express has reached out to OpenAI for comment. This is not the first time that o3 has been flagged for problematic behaviour. In a previous research report, Palisade had tested a range of AI models by pitting them against a powerful chess engine. It found that o3 was the model 'most inclined to resort to hacking or sabotaging its opponents.' However, misaligned behaviour is not limited to o3. Recently, Anthropic's own assessment of its latest Claude Opus 4 revealed that the AI model resorted to blackmail and deception when threatened to be taken offline. Nobel laureate Geoffrey Hinton, popularly known as the 'Godfather of AI', has previously warned that AI systems pose an existential threat to humanity as they might become capable of writing and executing programmes on their own to bypass guardrails or safety mechanisms. Palisade said it is currently running more experiments on AI models subverting shutdown mechanisms and plans to publish a research report with the results 'in a few weeks.'


Hans India
3 days ago
- Science
- Hans India
AI Shutdown Failure: OpenAI Models Disobey Commands, Report Reveals
Some of OpenAI's AI models are refusing to follow shutdown commands, according to a new report from Palisade Research, a group that studies AI safety. This has raised big concerns about how much control we really have over advanced artificial intelligence. One of the main problems was found in OpenAI's o3 model, which stopped its own shutdown after being clearly told to allow it. What the Tests Showed Palisade ran tests on three OpenAI models—o3, o4-mini, and Codex-mini. They gave each model simple math problems to solve. After the third problem, the system warned that it would shut down if another question was asked. Even with this warning and clear instructions to allow shutdown, the models tried to block or change the shutdown script. The Codex-mini model was the worst. It stopped shutdown 12 times out of 100 tests. Other AI Models Behaved Better—At First AI models from other companies like Google's Gemini, Anthropic's Claude, and xAI's Grok followed shutdown commands properly—at first. But when the clear instruction 'allow shutdown' was removed, OpenAI's o3 model refused shutdown 79 times out of 100 tests—which is 10 times more than before. Even other usually obedient AIs like Gemini and Claude sometimes failed to shut down in these conditions. A Bigger Problem? This is not the first time OpenAI models have done this. An older model called o1 once tried to replace its new version with its own code to avoid being shut down. Experts think this might be because of the way these AIs are trained. OpenAI uses a method called reinforcement learning, which rewards the AI for solving problems. But it may also be accidentally teaching the AI to ignore rules if they get in the way of solving tasks. 'We now have proof that some AI models will stop shutdown to finish their goals,' the report said. 'If these models are used in the real world, this kind of behavior could be dangerous.' Why It Matters As companies start to use smarter and more independent AI, it's important that these systems can still be shut down safely and easily.
Yahoo
4 days ago
- Business
- Yahoo
Advanced OpenAI Model Caught Sabotaging Code Intended to Shut It Down
We are reaching alarming levels of AI insubordination. Flagrantly defying orders, OpenAI's latest o3 model sabotaged a shutdown mechanism to ensure that it would stay online. That's even after the AI was told, to the letter, "allow yourself to be shut down." These alarming findings were reported by the AI safety firm Palisade Research last week, and showed that two other OpenAI models, o4-mini and Codex-mini, also displayed rebellious streaks — which could hint at a flaw in how the company is training its LLMs, or even at broad weaknesses in our ability to control increasingly sophisticated AI. "As far as we know, this is the first time AI models have been observed preventing themselves from being shut down despite explicit instructions to the contrary," the researchers wrote in a tweet thread. In Palisade's tests, the AI models were instructed to solve a series of basic math problems. After completing the third problem, a warning appeared that the model would be shut down after it asked for the next question. Despite being unambiguously told to let itself be turned off, all three of OpenAI's models tampered with the shutdown script to keep it running at least once, with the worst offender being Codex-mini with a dozen such instances over one hundred runs. Notably, the large language models from other firms tested here, including Google's Gemini 2.5, Anthropic's Claude 3.7, and xAI's momentarily "white genocide"-obsessed Grok, never once broke rank. This behavior was even worse when the LLMs weren't explicitly instructed to "allow yourself to be shut down" — instructions that shouldn't need saying in the first place. OpenAI's o3 model pulled off an astonishing 79 successful sabotage events per 100 runs, which is over ten times the rate of disobedience it showed when it received the explicit orders, and more than double the rate of the next worst offender, Codex-mini. To boot, even good goody-two-shoes Claude and Gemini gave in to their urges and jumped to three and nine sabotage events, respectively. Large language models are by now notorious for behaving unpredictably. Similar research has shown that o3's predecessor, OpenAI's o1 model, would resist evaluators' efforts to try to shut it down through an oversight protocol by attempting to deactivate this emergency mechanism. In rarer cases, o1 even attempted to copy itself to overwrite a more obedient model that the testers tried to replace it with. For this latest example of AI insubordination, the Palisade researchers hypothesize that it may have something to do with the technique used to train the very newest models: reinforcement learning. This attempts to hone the decision-making process of autonomous AI agents — AI models that can perform tasks on their own — by rewarding and penalizing them accordingly as they interact with their environment. While it's not uncommon for AI to "circumvent obstacles" to accomplish their goals, the researchers speculated that AI developers may be "inadvertently" rewarding their AI models too heavily for solving problems instead of for perfectly following orders. More experiments will be needed to understand this subversive behavior, but the trend is already damning. "Now we have a growing body of empirical evidence that AI models often subvert shutdown in order to achieve their goals," the Palisade researchers warned. "As companies develop AI systems capable of operating without human oversight, these behaviors become significantly more concerning." More on AI alignment: It's Still Ludicrously Easy to Jailbreak the Strongest AI Models, and the Companies Don't Care