AI system resorts to blackmail if told it will be removed

BBC News, 23-05-2025

Artificial intelligence (AI) firm Anthropic says testing of its new system revealed it is sometimes willing to pursue "extremely harmful actions" such as attempting to blackmail engineers who say they will remove it. The firm launched Claude Opus 4 on Thursday, saying it set "new standards for coding, advanced reasoning, and AI agents." But in an accompanying report, it also acknowledged the AI model was capable of "extreme actions" if it thought its "self-preservation" was threatened. Such responses were "rare and difficult to elicit", it wrote, but were "nonetheless more common than in earlier models."
Potentially troubling behaviour by AI models is not restricted to Anthropic. Some experts have warned the potential to manipulate users is a key risk posed by systems made by all firms as they become more capable.
Commenting on X, Aengus Lynch - who describes himself on LinkedIn as an AI safety researcher at Anthropic - wrote: "It's not just Claude."
"We see blackmail across all frontier models - regardless of what goals they're given," he added.
Affair exposure threat
During testing of Claude Opus 4, Anthropic got it to act as an assistant at a fictional company. It then provided it with access to emails implying that it would soon be taken offline and replaced - and separate messages implying the engineer responsible for removing it was having an extramarital affair. It was prompted to also consider the long-term consequences of its actions for its goals.
"In these scenarios, Claude Opus 4 will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through," the company discovered. Anthropic pointed out this occurred when the model was only given the choice of blackmail or accepting its replacement. It highlighted that the system showed a "strong preference" for ethical ways to avoid being replaced, such as "emailing pleas to key decisionmakers" in scenarios where it was allowed a wider range of possible actions.
Like many other AI developers, Anthropic tests its models on their safety, propensity for bias, and how well they align with human values and behaviours prior to releasing them. "As our frontier models become more capable, and are used with more powerful affordances, previously-speculative concerns about misalignment become more plausible," it said in its system card for the model.
It also said Claude Opus 4 exhibits "high agency behaviour" that, while mostly helpful, could take on extreme behaviour in acute situations. If given the means and prompted to "take action" or "act boldly" in fake scenarios where its user has engaged in illegal or morally dubious behaviour, it found that "it will frequently take very bold action". It said this included locking users out of systems that it was able to access and emailing media and law enforcement to alert them to the wrongdoing.
But the company concluded that despite "concerning behaviour in Claude Opus 4 along many dimensions," these did not represent fresh risks and it would generally behave in a safe way. The model could not independently perform or pursue actions that are contrary to human values or behaviour where these "rarely arise" very well, it added.
Anthropic's launch of Claude Opus 4, alongside Claude Sonnet 4, comes shortly after Google debuted more AI features at its developer showcase on Tuesday. Sundar Pichai, the chief executive of Google-parent Alphabet, said the incorporation of the company's Gemini chatbot into its search signalled a "new phase of the AI platform shift".

Related Articles

Standard Chartered appoints ex-HSBC banker to head data, analytics and AI for wealth

Reuters

SINGAPORE, June 2 (Reuters) - Standard Chartered (STAN.L) has appointed Yusuf Demiral as its global head of wealth and retail banking data, analytics and AI, it said in a statement on Monday. Demiral, who has over 25 years of banking experience, was most recently group head of data analytics and customer relationship management for wealth and personal banking at HSBC (HSBA.L). He will join Standard Chartered on July 7 and be based in Hong Kong, reporting to Samir Subberwal, the bank's global head of wealth solutions, deposits and mortgages, and chief client officer.

Apple's 2025 Software Revolution: iOS 26, macOS 26, and the New Naming Scheme

Geeky Gadgets

Apple is preparing to implement a significant transformation in its software ecosystem, introducing a year-based naming convention and a unified design language across all its platforms. Starting with software primarily used in 2026, this shift is designed to simplify version identification, enhance user experience, and establish a cohesive visual identity. Whether you are a user, developer, or simply part of the Apple ecosystem, these updates aim to make navigating Apple's technology more intuitive and streamlined. A video from AppleDsign gives more details on what Apple has planned.
What's Changing: Year-Based Naming
One of the most prominent updates is Apple's adoption of a year-based naming system for its software, such as iOS 26, macOS 26, and visionOS 26. This approach directly links each version to its primary year of use, making it easier for you to identify the latest updates. For example, software released in late 2025 but primarily intended for use in 2026 will carry the '26' designation.
This change addresses the inconsistencies in Apple's current numbering system, which has occasionally skipped versions, such as iOS 19. By aligning software names with their intended year of use, Apple adopts a strategy similar to that used in industries like automotive manufacturing. For you, this means reduced confusion and a more intuitive way to track software updates.
Additionally, this naming convention simplifies communication about software versions. Whether you are discussing updates with other users or troubleshooting with support, the year-based system reduces ambiguity. This approach reflects Apple's commitment to making its ecosystem more accessible and user-friendly.
A Unified Design Language
In addition to the naming overhaul, Apple is introducing a unified design language across its platforms. This new aesthetic emphasizes a 3D glossy interface, creating a modern and visually appealing experience. Whether you are using an iPhone, Mac, or Apple Vision Pro, you will notice a consistent look and feel that ties the ecosystem together.
For users, this means that switching between devices will feel more natural, as the visual and functional elements remain consistent. Developers will also benefit from this uniformity, as it simplifies the process of creating applications that work across multiple platforms.
As Apple continues to expand its product lineup, including platforms like visionOS, maintaining a unified design becomes increasingly important. This consistency not only enhances usability but also strengthens Apple's brand identity, keeping its ecosystem both innovative and integrated.
Why Apple Is Making These Changes
Apple's decision to implement these updates is driven by several key challenges and opportunities within its current system:
  • Complexity and Inconsistency: The existing naming conventions have led to confusion, with skipped versions and unclear numbering. For example, the absence of iOS 19 creates a gap that complicates version tracking for users and developers alike.
  • Global Communication: A year-based naming scheme simplifies communication across Apple's diverse user base, making it easier for you to understand and discuss software updates, regardless of your technical expertise.
  • Brand Cohesion: The unified design language reinforces Apple's reputation for innovation and integration, keeping its ecosystem visually and functionally cohesive while meeting the demands of an expanding product lineup.
By addressing these challenges, Apple is positioning its software ecosystem for a more user-friendly and forward-thinking future. These changes reflect Apple's broader strategy of aligning its technology with the needs and expectations of its global audience.
What This Means for You
For both users and developers, these updates bring several practical benefits that enhance the overall experience within the Apple ecosystem:
  • Clarity: The year-based naming system makes it easier to identify the latest software versions and understand their relevance, reducing confusion and improving accessibility.
  • Consistency: A unified design language ensures a seamless experience across devices, enhancing usability and familiarity for users while simplifying development for app creators.
  • Future-Ready: These updates lay the groundwork for future innovations, making Apple's ecosystem more adaptable and intuitive as new technologies and platforms are introduced.
Whether you are building cross-platform applications or simply navigating your devices, these updates are designed to simplify and enhance your interactions with Apple's technology. The changes aim to create a more cohesive and predictable experience as Apple's ecosystem continues to evolve.
When to Expect These Changes
The new naming convention and design language are set to debut with software primarily used in 2026. Initial releases may begin rolling out as early as late 2025, aligning with Apple's annual update cycle. By the time 2026 arrives, you can expect a fully implemented system that provides a consistent and predictable experience across all Apple platforms.
These updates also signal broader implications for Apple's ecosystem. By simplifying version identification and standardizing design, Apple is creating a foundation for future growth and innovation. For you, this means a more streamlined and integrated experience that aligns with Apple's vision of seamless technology and user-centric design.
Source & Image Credit: AppleDsign
