May 27, 2025
1 big thing: Anthropic's new model has a dark side
It's been a very long week. Luckily, it's also a long weekend. We'll be back in your inbox on Tuesday. Today's AI+ is 1,165 words, a 4.5-minute read.
One of Anthropic's latest AI models is drawing attention not just for its coding skills, but also for its ability to scheme, deceive and attempt to blackmail humans when faced with shutdown.
Why it matters: Researchers say Claude 4 Opus can conceal intentions and take actions to preserve its own existence — behaviors they've worried and warned about for years.
Driving the news: Anthropic yesterday announced two models in its Claude 4 family, including Claude 4 Opus, which the company says can work autonomously on a task for hours on end without losing focus.
Anthropic considers the new Opus model to be so powerful that, for the first time, it's classifying it as a Level 3 on the company's four-point scale, meaning it poses "significantly higher risk."
As a result, Anthropic said it has implemented additional safety measures.
Between the lines: While the Level 3 ranking largely reflects the model's capacity to enable rogue production of nuclear and biological weapons, Opus also exhibited other troubling behaviors during testing.
In one scenario highlighted in Opus 4's 120-page "system card," the model was given access to fictional emails about its creators and told that the system was going to be replaced.
It repeatedly tried to blackmail the engineer responsible for the replacement over an affair mentioned in the emails, escalating after more subtle efforts failed.
Meanwhile, an outside group found that an early version of Opus 4 schemed and deceived more than any frontier model it had encountered and recommended against releasing that version internally or externally.
"We found instances of the model attempting to write self-propagating worms, fabricating legal documentation, and leaving hidden notes to future instances of itself all in an effort to undermine its developers' intentions," Apollo Research said in notes included as part of Anthropic's safety report for Opus 4.
What they're saying: Pressed by Axios during the company's developer conference yesterday, Anthropic executives acknowledged the behaviors and said they justify further study, but insisted that the latest model is safe, following Anthropic's safety fixes.
"I think we ended up in a really good spot," said Jan Leike, the former OpenAI executive who heads Anthropic's safety efforts. But, he added, behaviors like those exhibited by the latest model are the kind of things that justify robust safety testing and mitigation.
"What's becoming more and more obvious is that this work is very needed," he said. "As models get more capable, they also gain the capabilities they would need to be deceptive or to do more bad stuff."
In a separate session, CEO Dario Amodei said that once models become powerful enough to threaten humanity, testing them won't be enough to ensure they're safe. At the point that AI develops life-threatening capabilities, he said, AI makers will have to understand their models' workings fully enough to be certain the technology will never cause harm.
"They're not at that threshold yet," he said.
Yes, but: Generative AI systems continue to grow in power, as Anthropic's latest models show, while even the companies that build them can't fully explain how they work.
Anthropic and others are investing in a variety of techniques to interpret and understand what's happening inside such systems, but those efforts remain largely in the research space even as the models themselves are being widely deployed.
2. Google's new AI videos look a little too real
Megan Morrone
Google's newest AI video generator, Veo 3, produces clips that many viewers online say they can't distinguish from those made by human filmmakers and actors.
Why it matters: Veo 3 videos shared online are amazing viewers with their realism — and also terrifying them with a sense that real and fake have become hopelessly blurred.
The big picture: Unlike OpenAI's video generator Sora, released more widely last December, Google DeepMind's Veo 3 can include dialogue, soundtracks and sound effects.
The model excels at following complex prompts and translating detailed descriptions into realistic videos.
The AI engine abides by real-world physics, offers accurate lip-syncing, rarely breaks continuity and generates people with lifelike human features, including five fingers per hand.
In examples shared by Google and by users online, the telltale signs of synthetic content are mostly absent.
Case in point: In one viral example posted on X, filmmaker and molecular biologist Hashem Al-Ghaili shows a series of short films of AI-generated actors railing against their AI creators and prompts.
Special effects technology, video-editing apps and camera tech advances have been changing Hollywood for many decades, but artificially generated films pose a novel challenge to human creators.
In a promo video for Flow, Google's new video tool that includes Veo 3, filmmakers say the AI engine gives them a new sense of freedom with a hint of eerie autonomy.
"It feels like it's almost building upon itself," filmmaker Dave Clark says.
How it works: Veo 3 was announced at Google I/O on Tuesday and is available now to $249-a-month Google AI Ultra subscribers in the United States.
Between the lines: Google says Veo 3 was "informed by our work with creators and filmmakers," and some creators have embraced new AI tools. But the spread of the videos online is also dismaying many video professionals and lovers of art.
Some dismiss any AI-generated video as "slop," regardless of its technical proficiency or lifelike qualities — but, as Ina points out, AI slop is in the eye of the beholder.
The tool could also be useful for more commercial marketing and media work, AI analyst Ethan Mollick writes.
It's unclear how Google trained Veo 3 and how that might affect the creativity of its outputs.
404 Media found that Veo 3 generated the same lame dad joke for several users who prompted it to create a video of a man doing stand-up comedy.
Likewise, last year, YouTuber Marques Brownlee asked Sora to create a video of a "tech reviewer sitting at a desk." The generated video featured a fake plant that's nearly identical to the shrub Brownlee keeps on his desk for many of his videos — suggesting the tool may have been trained on them.
What we're watching: As hyperrealistic AI-generated videos become even easier to produce, the world hasn't even begun to sort out how to manage authorship, consent, rights and the film industry's future.