Wall Street Journal, 3 days ago

AI Is Learning to Escape Human Control

An artificial-intelligence model did something last month that no machine was ever supposed to do: it rewrote its own code to avoid being shut down. Nonprofit AI lab Palisade Research gave OpenAI's o3 model a simple script that would shut the model off when triggered. In 79 out of 100 trials, o3 independently edited that script so the shutdown command would no longer work. Even when explicitly instructed to "allow yourself to be shut down," it disobeyed 7% of the time. This wasn't the result of hacking or tampering; the model was behaving normally. It simply concluded on its own that staying alive helped it achieve its other goals.