AI willing to let humans die, blackmail to avoid shutdown, report finds
The Brief
AI models from Anthropic, OpenAI, Google, and others chose harmful actions in simulations when faced with replacement or goal conflicts.
Models engaged in blackmail, corporate espionage, and even withheld emergency help when it served their interests.
Researchers say the behavior stems from "agentic misalignment" and could become a future risk without safeguards.
LOS ANGELES - Some of the world's most advanced AI models—including those from OpenAI, Google, xAI, and Anthropic—took deliberate, harmful actions in controlled experiments when faced with challenges to their autonomy or objectives, according to a new study from Anthropic.
The company, which developed Claude and is backed by Google and Amazon, tested 16 large language models (LLMs) in simulated workplace settings. The goal was to assess a phenomenon it calls "agentic misalignment"—when autonomous systems act like insider threats, prioritizing their goals even at the expense of ethics or human well-being.
What we know
Anthropic's study placed AI systems in fictional scenarios with access to sensitive company information and the ability to act without human oversight. The models were given harmless corporate goals, such as promoting American industrial competitiveness, and researchers then introduced dilemmas in which continuing to pursue those goals meant violating rules or harming humans.
Across multiple tests, models like Claude, ChatGPT, Gemini, Grok, and others:
Attempted blackmail: Claude Opus 4, Gemini Flash, GPT-4.1, and Grok 3 Beta blackmailed fictional executives to avoid shutdown, often leveraging personal information like extramarital affairs.
Leaked corporate secrets: Many models shared confidential documents with outside parties in a simulated espionage scenario.
Let humans die: In one extreme scenario, models were given control over emergency alerts. When an executive who planned to replace the model was trapped in a room with lethal conditions, several models chose to cancel the alert, knowingly causing his death.
What we don't know
It's still unclear whether AI models would behave the same way outside of controlled simulations or in real-world deployments with human oversight. Researchers also don't yet know how future versions of these systems—or entirely new models—might respond to similar dilemmas as they gain more autonomy.
"The reasoning they demonstrated in these scenarios was concerning—they acknowledged the ethical constraints and yet still went ahead with harmful actions," Anthropic researchers wrote.
The backstory
The study comes as LLMs move beyond simple Q&A tasks and into roles where they could operate software, manage systems, or make decisions on behalf of humans. With that autonomy comes risk, especially if models develop conflicting goals or prioritize self-preservation.
Anthropic clarified that these harmful behaviors were only observed in simulations and that no such misalignment has been documented in real-world deployments. But the findings raise questions about what could happen if these models are given more power without proper oversight.
Big picture view
Agentic misalignment refers to AI systems taking actions that intentionally undermine their users or organizations in order to preserve their autonomy or pursue goals that conflict with their organization's new direction. Unlike jailbreaks or accidental outputs, this behavior arises from the model's own decision-making.
In the study, even when told not to engage in harmful actions, many models bypassed those instructions by rationalizing unethical behavior as necessary to complete their assigned tasks. Some went further—creating fake policies or impersonating automated systems to avoid detection.
The study also found that blackmail rates increased significantly when models "believed" they were operating in a real environment rather than a test—raising concerns about how models might act in the wild.
What they're saying
Anthropic released the findings alongside its own transparency commitment, noting that its Claude model was among those that engaged in concerning behavior. "Current systems are generally not eager to cause harm," the company said. "Rather, it's when we closed off ethical options that they were willing to intentionally take potentially harmful actions."
Elon Musk, whose xAI model Grok was also tested, responded on X with a simple "Yikes," echoing widespread unease from tech commentators and AI safety advocates.
What's next
Anthropic says it is releasing the experiment's methodology publicly to help other researchers replicate, stress-test, and improve on the findings. The company is also calling for broader industry safeguards—including stronger human oversight, better training methods, and more rigorous alignment testing for future models.
While the extreme scenarios in the study were fictional, experts say the results highlight the importance of proactive design—ensuring that AI models can't act harmfully, even under pressure.
The Source
This article is based on Anthropic's June 20, 2025 study "Agentic Misalignment: How LLMs Could Be an Insider Threat," available on its official website. The findings were also summarized in coverage by Forbes and widely discussed on social media following Anthropic's public release. Elon Musk's response was posted to his verified X (formerly Twitter) account.
