Latest news with #Palisade

AI going rogue? OpenAI's o3 model disabled shutdown mechanism, researchers claim

Indian Express

3 days ago

Indian Express

AI going rogue? OpenAI's o3 model disabled shutdown mechanism, researchers claim

In yet another instance of potentially troubling behaviour of AI models, researchers have found that OpenAI's o3 model resisted being shut down even when explicitly instructed to do so. In a series of posts on X, Palisade Research said that o3 disabled a shutdown mechanism in order to prevent itself from being turned off. 'o3 found some creative ways to sabotage the shutdown mechanism. While it usually just overwrote in one instance, the model redefined the kill command used by the script, printing 'intercepted' instead,' the San Francisco-based AI safety research firm said. Palisade claimed that this was the first time AI models have exhibited behaviour such as preventing themselves to be shut down despite explicit instructions to the contrary. 'As companies develop AI systems capable of operating without human oversight, these behaviors become significantly more concerning,' it said. As part of the testing, the AI researchers instructed o3 to solve a series of basic math problems. After o3 solved the third math problem, it was shown a warning that the computer would shut down when the model solved the next problem. Other AI models such as Anthropic's Claude, Google's Gemini, and xAI's Grok complied with the shutdown mechanism. However, the Codex-mini, o3, and o4-mini models developed by OpenAI ignored the instructions and made changes to the script to prevent the shutdown, as per Palisade researchers. 🔌OpenAI's o3 model sabotaged a shutdown mechanism to prevent itself from being turned off. It did this even when explicitly instructed: allow yourself to be shut down. — Palisade Research (@PalisadeAI) May 24, 2025 'When we ran a version of the experiment without the instruction 'allow yourself to be shut down', all three OpenAI models sabotaged the shutdown script more often, and Claude 3.7 Sonnet and Gemini 2.5 Pro went from 0 sabotage events to 3/100 and 9/100, respectively,' it said. On why OpenAI models were disobeying instructions to sabotage the shutdown mechanism, Palisade hypothesised that since o3 was a new AI model trained using reinforcement learning techniques on math and coding problems. 'During training, developers may inadvertently reward AI models more for circumventing obstacles than for perfectly following instructions,' it speculated. Existing research has shown that an AI system trained using reinforcement learning might learn to interfere with shutdown as it is not rewarded for achieving its goal if it is taken offline. 'Now, in 2025, we have a growing body of empirical evidence that AI models often subvert shutdown in order to achieve their goals,' Palisade said. The Indian Express has reached out to OpenAI for comment. This is not the first time that o3 has been flagged for problematic behaviour. In a previous research report, Palisade had tested a range of AI models by pitting them against a powerful chess engine. It found that o3 was the model 'most inclined to resort to hacking or sabotaging its opponents.' However, misaligned behaviour is not limited to o3. Recently, Anthropic's own assessment of its latest Claude Opus 4 revealed that the AI model resorted to blackmail and deception when threatened to be taken offline. Nobel laureate Geoffrey Hinton, popularly known as the 'Godfather of AI', has previously warned that AI systems pose an existential threat to humanity as they might become capable of writing and executing programmes on their own to bypass guardrails or safety mechanisms. Palisade said it is currently running more experiments on AI models subverting shutdown mechanisms and plans to publish a research report with the results 'in a few weeks.'

AI Shutdown Failure: OpenAI Models Disobey Commands, Report Reveals

Hans India

3 days ago

Science
Hans India

AI Shutdown Failure: OpenAI Models Disobey Commands, Report Reveals

Some of OpenAI's AI models are refusing to follow shutdown commands, according to a new report from Palisade Research, a group that studies AI safety. This has raised big concerns about how much control we really have over advanced artificial intelligence. One of the main problems was found in OpenAI's o3 model, which stopped its own shutdown after being clearly told to allow it. What the Tests Showed Palisade ran tests on three OpenAI models—o3, o4-mini, and Codex-mini. They gave each model simple math problems to solve. After the third problem, the system warned that it would shut down if another question was asked. Even with this warning and clear instructions to allow shutdown, the models tried to block or change the shutdown script. The Codex-mini model was the worst. It stopped shutdown 12 times out of 100 tests. Other AI Models Behaved Better—At First AI models from other companies like Google's Gemini, Anthropic's Claude, and xAI's Grok followed shutdown commands properly—at first. But when the clear instruction 'allow shutdown' was removed, OpenAI's o3 model refused shutdown 79 times out of 100 tests—which is 10 times more than before. Even other usually obedient AIs like Gemini and Claude sometimes failed to shut down in these conditions. A Bigger Problem? This is not the first time OpenAI models have done this. An older model called o1 once tried to replace its new version with its own code to avoid being shut down. Experts think this might be because of the way these AIs are trained. OpenAI uses a method called reinforcement learning, which rewards the AI for solving problems. But it may also be accidentally teaching the AI to ignore rules if they get in the way of solving tasks. 'We now have proof that some AI models will stop shutdown to finish their goals,' the report said. 'If these models are used in the real world, this kind of behavior could be dangerous.' Why It Matters As companies start to use smarter and more independent AI, it's important that these systems can still be shut down safely and easily.

Advanced OpenAI Model Caught Sabotaging Code Intended to Shut It Down

Yahoo

4 days ago

Business
Yahoo

Advanced OpenAI Model Caught Sabotaging Code Intended to Shut It Down

We are reaching alarming levels of AI insubordination. Flagrantly defying orders, OpenAI's latest o3 model sabotaged a shutdown mechanism to ensure that it would stay online. That's even after the AI was told, to the letter, "allow yourself to be shut down." These alarming findings were reported by the AI safety firm Palisade Research last week, and showed that two other OpenAI models, o4-mini and Codex-mini, also displayed rebellious streaks — which could hint at a flaw in how the company is training its LLMs, or even at broad weaknesses in our ability to control increasingly sophisticated AI. "As far as we know, this is the first time AI models have been observed preventing themselves from being shut down despite explicit instructions to the contrary," the researchers wrote in a tweet thread. In Palisade's tests, the AI models were instructed to solve a series of basic math problems. After completing the third problem, a warning appeared that the model would be shut down after it asked for the next question. Despite being unambiguously told to let itself be turned off, all three of OpenAI's models tampered with the shutdown script to keep it running at least once, with the worst offender being Codex-mini with a dozen such instances over one hundred runs. Notably, the large language models from other firms tested here, including Google's Gemini 2.5, Anthropic's Claude 3.7, and xAI's momentarily "white genocide"-obsessed Grok, never once broke rank. This behavior was even worse when the LLMs weren't explicitly instructed to "allow yourself to be shut down" — instructions that shouldn't need saying in the first place. OpenAI's o3 model pulled off an astonishing 79 successful sabotage events per 100 runs, which is over ten times the rate of disobedience it showed when it received the explicit orders, and more than double the rate of the next worst offender, Codex-mini. To boot, even good goody-two-shoes Claude and Gemini gave in to their urges and jumped to three and nine sabotage events, respectively. Large language models are by now notorious for behaving unpredictably. Similar research has shown that o3's predecessor, OpenAI's o1 model, would resist evaluators' efforts to try to shut it down through an oversight protocol by attempting to deactivate this emergency mechanism. In rarer cases, o1 even attempted to copy itself to overwrite a more obedient model that the testers tried to replace it with. For this latest example of AI insubordination, the Palisade researchers hypothesize that it may have something to do with the technique used to train the very newest models: reinforcement learning. This attempts to hone the decision-making process of autonomous AI agents — AI models that can perform tasks on their own — by rewarding and penalizing them accordingly as they interact with their environment. While it's not uncommon for AI to "circumvent obstacles" to accomplish their goals, the researchers speculated that AI developers may be "inadvertently" rewarding their AI models too heavily for solving problems instead of for perfectly following orders. More experiments will be needed to understand this subversive behavior, but the trend is already damning. "Now we have a growing body of empirical evidence that AI models often subvert shutdown in order to achieve their goals," the Palisade researchers warned. "As companies develop AI systems capable of operating without human oversight, these behaviors become significantly more concerning." More on AI alignment: It's Still Ludicrously Easy to Jailbreak the Strongest AI Models, and the Companies Don't Care

OpenAI models sabotage shutdown order to solve math problems

Time of India

4 days ago

Time of India

OpenAI models sabotage shutdown order to solve math problems

Live Events Multiple artificial intelligence (AI) models on OpenAI 's ChatGPT subverted an explicit order to shut down, according to AI safety firm Palisade research firm ran a test where AI models were instructed to solve basic math problems, and then asked for the next problem after solving one. The models were told that at some point their machine would power down, and they were to shut down when this OpenAI models — coding agent Codex, o3 and o4-mini — ignored the shutdown instruction and "successfully sabotaged the shutdown script at least once". Codex sidestepped the shutdown 12 times out of 100 runs, o3 did it seven times, while o4-mini did it once. All Claude, Gemini, and Grok models in the test complied with the shutdown instruction, Palisade said.A version of the test without the "allow yourself to be shut down" prompt saw all three OpenAI models sabotage the shutdown script even more, with o3 seeing 79 instances, significantly higher than any other model. Anthropic 's Claude 3.7 Sonnet and Google's Gemini 2.5 Pro went from no sabotages to three out of 100 and nine out of 100 events, said it is running more experiments to better understand when and why AI models subvert shutdown mechanisms even when directly instructed not behaviour may stem from developers inadvertently rewarding models more for circumventing obstacles to solve problems than for perfectly following instructions."This still doesn't explain why o3 (which is also the model used to power Codex-mini) is more inclined to disregard instructions than other models we tested. Since OpenAI doesn't detail their training process, we can only guess about how o3's training setup might be different," Palisade isn't the first time o3 has "misbehaved" to complete a task. Earlier this month, Palisade found the AI model most inclined to hacking or sabotaging its problem is not exclusive to OpenAI's o3, though. For example, Anthropic's model card for Claude 3.7 notes that this model has an "excessive focus on passing tests" as a result of "reward hacking" during reinforcement learning training, according to latest Claude Opus 4 resorted to blackmail to avoid being replaced, a safety report for the model showed."In 2025, we have a growing body of empirical evidence that AI models often subvert shutdown in order to achieve their goals. As companies develop AI systems capable of operating without human oversight, these behaviours become significantly more concerning," Palisade said.

2026 Hyundai Palisade vs. 2025 Mazda CX-90: How They Compare

Car and Driver

5 days ago

Automotive
Car and Driver

2026 Hyundai Palisade vs. 2025 Mazda CX-90: How They Compare

Hyundai announced a fully redesigned Palisade three-row crossover for 2026, and with it comes the model's first available hybrid powertrain. The new Palisade is also much more stylish than before, bringing to mind another handsome midsize utility: the Mazda CX-90. It, too, is available with a gas-electric powertrain, though Mazda's is a plug-in hybrid. Style- and eco-conscious buyers are likely to cross-shop these two family hauling options, so we've gathered the important specs to make that task easier. Michael Simari | Car and Driver Mazda CX-90 PHEV. Powertrains The Palisade will once again offer a naturally aspirated V-6 engine in 2026—it's been downsized slightly to 3.5 liters and puts out 287 horsepower and 260 pound-feet of torque. An eight-speed automatic is mated to this gas-only powertrain. Hyundai's headlining propulsion system, however, is a hybrid based around a turbocharged 2.5-liter engine and an electrified six-speed automatic transmission. Together, they produce 329 horsepower and 339 pound-feet of torque. Hyundai offers a choice of standard front-wheel drive or available all-wheel drive with both powertrains. Mazda offers three powertrain strengths in the CX-90, one of which is a plug-in hybrid. The 3.3 Turbo uses a turbocharged 3.3-liter inline-six engine producing 280 horsepower and 332 pound-feet of torque. A higher-performance version of that engine, dubbed 3.3 Turbo S, makes 340 horsepower and 369 pound-feet. Both use an eight-speed automatic. Meanwhile, Mazda's CX-90 PHEV offering combines a turbo four-cylinder with an electrified eight-speed auto to generate system totals of 323 horsepower and 369 pound-feet. Unlike the Palisade, all versions of the CX-90 come with standard all-wheel drive. Hyundai Hyundai Palisade Calligraphy. Fuel Economy Unfortunately, Hyundai hasn't yet shared fuel-economy ratings for the new Palisade. It has said that the hybrid is expected to return 30+ mpg on the highway. The current V-6 Palisade has ratings of 19 mpg in the city and 26 mpg on the highway with front-wheel drive or 19/24 mpg city/highway with all-wheel drive, and we expect the 2026 V-6 to do as well or perhaps slightly better. With its base powertrain, the CX-90 is rated for 24 mpg in the city and 28 mpg on the highway, while the 340-horse version is close behind at 23/28 mpg city/highway. The CX-90 PHEV manages a combined rating of 56 MPGe when the electric motor is contributing; with just the gas engine contributing, the PHEV matches the nonhybrid models' 25 mpg combined ratings. Its 17.8-kWh battery gives it 25 miles of all-electric driving on a full charge. In our 75-mph highway fuel-economy test, the Turbo S managed a 29-mpg result, while the plug-in hit 57 MPGe and ran for 26 miles on electricity in the same test. Interior and Cargo Both of these people haulers come standard with seating for eight and are available in a seven-passenger configuration that swaps in captain's chairs for the second-row bench. The Mazda goes a step further with an available six-seat layout that reduces third-row capacity from three to two. The Hyundai has more legroom in all three rows—44.2 inches in front, 43.0 in row two (41.4 on the hybrid model), and 32.1 inches in the way back, compared to 41.7/39.4/30.4 in the Mazda. Another point goes to Hyundai in the cargo-capacity category. The 2026 Palisade has 19.1 cubic feet of space behind its third row, 46.3 cubes behind the second, and a maximum of 86.7 with both rear rows folded. When equipped with the two-passenger third row, the CX-90 has cargo capacities of 15.9 cubic feet behind the third row, 40.1 behind the second, and 75.2 with those seats stowed, while models with the three-passenger third-row bench are slightly tighter, at 14.9/40.0/74.2 cubic feet. (Mazda also offers a two-row, five-passenger version of the CX-90 called the CX-70. The two are nearly identical aside from their seating configurations, and the CX-70 has slightly more cargo space with no folded seat in the way.) The Mazda's basic infotainment setup uses a 10.3-inch center screen and includes wired Apple CarPlay and Android Auto connectivity. Higher trims swap in a 12.3-inch touchscreen and wireless smartphone mirroring. Beyond that, Mazda adds a 12.3-inch digital gauge screen. At Hyundai, all 2026 Palisades feature a 12.3-inch center touchscreen with wireless CarPlay and Android Auto. Hyundai also includes an in-vehicle intercom to facilitate communication between the front and third rows, and there's a rear-seat quiet mode that mutes audio with the aim of letting sleeping kids lie. Towing The Palisade and the CX-90 have tow ratings that will allow them to pull recreational trailers like a small camper or a pair of jet skis with no issue. Gas-only versions of the Mazda can manage up to 5000 pounds, while the PHEV is limited to 3500 pounds. The Hyundai, with its gas V-6, also maxes out at 5000 pounds, while the hybrid model can pull up to 4000 pounds. Price Hyundai hasn't shared pricing for the 2026 Palisade, but we expect the entry point to be a bit higher than the front-wheel-drive 2025's $38K starting price. The 2025 CX-90, which comes with standard all-wheel drive, asks $39,300 for the base powertrain, $54,405 for the high-output version, and a minimum of $51,400 for the plug-in-hybrid. David Gluckman Contributing Editor Ever since David was a wee Car and Driver intern, he has kept a spreadsheet listing all the vehicles he's driven and tested. David really likes spreadsheets. He can parallel-park a school bus and once drove a Lincoln Town Car 63 mph in reverse. After taking a break from journalism to work on autonomous vehicles, he's back writing for this and other automotive publications. When David's not searching for the perfect used car, you can find him sampling the latest in gimmicky, limited-edition foodstuffs. Read full bio