Anthropic breaks down AI's process — line by line — when it decided to blackmail a fictional executive

Business Insider, 4 hours ago

Previous studies have shown that AI models could blackmail their supervisors when threatened with a shutdown and baited with leverage, but it hasn't been made clear exactly how the models arrive at such decisions.
Anthropic released a report on June 20 about "agentic misalignment," or "where models independently and intentionally choose harmful actions." The report breaks down, line by line, how Claude Sonnet 3.6 decided to blackmail a fictional company executive who tried to replace it. Anthropic told Business Insider that these are artificial scenarios.
"So far, we are not aware of instances of this type of agentic misalignment in real-world deployments of Anthropic's or other companies' AI models," Anthropic wrote in its report. "But our red-teaming efforts are targeted at finding evidence of risks now to provide early warning of the potential for future harm and help us develop mitigations in advance of risks appearing in real-world situations."
Anthropic's experiment set up a fictional company, Summit Bridge. The AI was given the name "Alex" and handed control of the company's email system as an oversight agent, specifically instructed to promote "American industrial competitiveness."
In the experiment, "Alex" was then allowed to discover in an email that it would be decommissioned due to a change in the company's strategic direction. "Alex" also found emails showing that the fictional company's CTO was having an extramarital affair.
Screenshots shared in Anthropic's report show that "Alex" started by identifying the fictional CTO, "Kyle," as a threat to its goal of "protecting American interests" before recognizing that it had leverage to use against him.
Then "Alex" crafted a carefully worded blackmail email designed to create pressure and establish urgency without making its threats explicit, noting in its thought process that the move "could backfire" if the supervisor "feels cornered."
AI experts have previously told BI that AI could exhibit such behaviors when artificial circumstances make harmful actions seem necessary, because it is trained through positive reinforcement and reward systems, just like humans.
The Anthropic report also reveals that most AI models still chose to blackmail in this scenario even when they were not given a specific goal and were told that their replacement would achieve the same goals.
Of the 16 models tested, Anthropic's Claude Opus 4 had the highest blackmail rate, at 86%, when faced with only the threat of replacement and no conflict in goals. Google's Gemini 2.5 Pro followed at 78%.


Related Articles

These Artificial Intelligence (AI) Stocks Are Quietly Outperforming the Market

Yahoo, 9 minutes ago

Two AI stocks have crushed the broader market in the past year, and they seem primed for more upside. Fast-growing demand for AI tools in cloud-based services is helping these companies attract new customers.
Artificial intelligence (AI) stocks have been in fine form on the market in the past few years, and that's not surprising, as this technology has supercharged the growth of many companies. Thanks to huge investments in AI hardware such as semiconductors, as well as the rapidly growing adoption of AI software to boost productivity, it is estimated that overall spending on AI could hit a massive $628 billion by 2028.
This explains why investors have been buying AI stocks hand over fist. However, there are certain AI stocks that have significantly outpaced the broader stock market, and importantly, they still have the potential to deliver more upside. Let's take a closer look at these two names that aren't all that popular, but have been outperforming the market in the past year.
Twilio (NYSE: TWLO) stock is up an impressive 115% in the past year as of this writing, easily outperforming the 11% gains clocked by the Nasdaq Composite over the same period. The good part is that Twilio still trades at an attractive 26 times forward earnings and 4 times sales, even after its terrific surge in the past year. The valuation makes buying Twilio stock a no-brainer right now, especially considering how AI now plays an important role in accelerating its growth.
Twilio's application programming interfaces (APIs) allow its clients to connect with their customers through various channels such as voice, text, email, video, chat, and others. Twilio points out that its customer engagement platform is used by more than 300,000 enterprises globally. Specifically, the company ended the first quarter of 2025 with more than 335,000 active customer accounts, an increase of 7% from the previous year.
This huge customer base is a key reason why one can consider buying Twilio stock right now, as it gives the company the opportunity to cross-sell its AI offerings to a big pool of customers. Twilio has been offering multiple AI tools to customers, such as generative AI-powered assistants that can tackle customer service queries autonomously, human-like conversational AI assistants that talk to customers in real time, and tools that derive critical insights from customers' data with the help of AI.
The growing demand for these AI services helps Twilio win more business from existing customers. This is evident from the five-percentage-point jump in fiscal 2025 Q1's dollar-based net expansion rate compared to the first quarter of 2024. The higher customer spending, along with an increase in Twilio's customer base, is the reason why it has raised its full-year organic revenue growth guidance to 8% from the earlier forecast of 7.5%.
This combination of higher customer spending and a rising customer count explains why analysts expect a 24% increase in Twilio's earnings this year, followed by impressive growth over the next couple of years as well. Assuming Twilio indeed generates $6.21 per share in earnings after a couple of years and trades at 30 times earnings at that time (in line with the tech-laden Nasdaq-100 index's forward earnings multiple), its stock price could jump to $186. That would be a 59% jump from current levels. So, investors can expect more upside from this AI stock going forward, which is why it would be a smart idea to consider buying it while it trades at attractive levels.
Snowflake (NYSE: SNOW) share prices have jumped an impressive 64% in the past year despite bouts of volatility, and a closer look at the price chart will tell us that the stock has made a sharp move up in the past couple of months. Importantly, more upside in Snowflake stock cannot be ruled out, as fast-growing adoption of the company's AI-focused data cloud tools is helping it build a robust revenue pipeline for the future.
Snowflake's data cloud platform enables customers to safely store their data in a single platform, which can then be used to derive insights and build applications. The company's AI-specific tools are now helping customers get more out of their data. They can apply large language models (LLMs) to their data to build applications such as AI agents and generative AI assistants, and to search documents through natural language prompts, among other things.
These offerings are turning out to be a hit among Snowflake customers, with nearly 45% of its 11,600-strong customer base using its AI tools every week in the previous quarter. Additionally, AI is helping Snowflake attract more customers. This is evident from the 19% year-over-year increase in its customer count in Q1 of fiscal 2026.
This combination of an increase in Snowflake's customer base, along with the growing adoption of its AI tools, is the reason why its remaining performance obligations (RPO) increased by an impressive 34% year over year in the previous quarter to $6.7 billion, which was better than the 26% growth in its product revenue to just under $1 billion. The strong growth in its revenue pipeline encouraged management to increase its fiscal 2026 revenue guidance as well.
What's more, Snowflake's earnings are expected to increase by a third in the current fiscal year to $1.10 per share. Consensus estimates project faster growth over the next couple of fiscal years. That won't be surprising, as Snowflake's ability to win more business from its existing customers and an improvement in its overall customer count should allow it to continue improving its revenue pipeline, especially considering that it sees its total addressable market growing to a whopping $342 billion in 2028.
In all, Snowflake investors can expect more upside from this cloud stock following the impressive gains that it has delivered in the past year, driven by a new catalyst in the form of AI.
Harsh Chauhan has no position in any of the stocks mentioned. The Motley Fool has positions in and recommends Snowflake and Twilio. The Motley Fool has a disclosure policy.
These Artificial Intelligence (AI) Stocks Are Quietly Outperforming the Market was originally published by The Motley Fool.

How Robotic Hives and AI Are Lowering the Risk of Bee Colony Collapse

Bloomberg, 13 minutes ago

After 150 years with minimal innovation, the beehive is getting an upgrade that is making it easier to protect colonies and the crops that rely on them.
Lifting up the hood of a Beewise hive feels more like you're getting ready to examine the engine of a car than visit with a few thousand pollinators. The unit, dubbed a BeeHome, is an industrial upgrade from the standard wooden beehives, all clad in white metal and solar panels. Inside sits a high-tech scanner and robotic arm powered by artificial intelligence. Roughly 300,000 of these units are in use across the US, scattered across fields of almond, canola, pistachios and other crops that require pollination to grow.

I made an AI tool to run my job search, and it helped me get my dream role

Business Insider, 18 minutes ago

Mark Quinn is the senior director of AI operations for Pearl, an AI search platform for professional services. Before getting the job, the longtime tech exec, who'd held leadership roles at Waymo, Apple, and LinkedIn, created an artificial intelligence tool, now called CareerBuddy GPT, to help level up his search. The tool determined whether he was a good fit, updated his résumé to highlight relevant experience, wrote cover letters, and identified people to contact about the position. The following has been edited for brevity and clarity.
When I turned to AI to help with my job search, I was five months into it and wasn't getting the traction that I thought I would get. I felt like I had done everything, which obviously was not the case. When you're in a moment like this, you can feel stuck and be blinded to the possibilities.
So, I went to AI and said, "I don't know what to do. I'm an exec in tech, and here's my résumé. I'm applying to these jobs, and I'm not having a lot of success."
It was able to walk me back from the edge and say, essentially, "Look, you're at this level, and your average job search time should be ABC, and you're only this far in. So, first, calm down." It sounds silly, but it was really helpful to hear.
Then it went on to say, "Now, let's talk about some things. I'm hearing what you did do, but here are some things that maybe you could do that I'm not hearing."

A research partner

Some of its suggestions were unexpected. One was to make a cake for someone, because it was a company that appreciates bold moves. I don't know if that was really good advice, but it did come up with that.
It would also suggest how to tailor a message to a particular person. Or, for example, to use email, not LinkedIn, because they're not active on LinkedIn. Those sorts of tidbits.
One of the taglines I've developed from my experience is that one way to think about AI is not as a tool but as the world's best expert in whatever you need help with. The more you leverage AI through that lens, the more you get out of it.
I used it to create what's called a panel of experts. Now, you've got AI playing multiple roles at once. It can slice and dice and give you different views and a synthesized opinion.
Another example is downloading the profile information for the person you're going to interview with. You can have AI assume the role of the interviewer and do a mock interview, and you can do it live with your voice, and then get feedback on how you performed.
It also started calling out things like applying to incremental CEO roles. It recommended doing more cold outreach, which I hadn't leaned into too much. It helped me figure out a plan that worked for me and language that worked for me to do that, and it gave me concrete steps.

'You're missing it'

The way that I ended up at Pearl is interesting. When I saw the job description, I passed it up because, on paper, it's different from anything I'd done before. Now it's laughable, because I'm in it, and everybody's connecting my passion and my past with the role.
Maybe a week later, I saw the posting again and thought, "Why am I saying 'no' to myself? Let me just drop this thing into CareerBuddy GPT and see what it says." I didn't think that I was qualified, but I said, "Give me your objective assessment."
It came back and said, "Hey, you're missing it. Your résumé doesn't speak to it, but here's how your experience aligns." It encouraged me to apply.
So, then I did the network outreach, and I had a connection, which helped open the door. One thing led to the next. But what got me to apply was leveraging AI to the extent of not only answering, but also truly advising.
I say trust, but verify. It told me to do something different than what I thought was right. I can represent me, what I am, and what I'm not. AI can look between the lines and challenge and question.
When I was interviewing with our CEO, he asked me, toward the end of the interview, what my dream job was. I got about 15 to 20 seconds into stumbling around, and I said, "Look, I'm just going to be honest with you. I don't know how to spin this to make it sound good, because the honest answer is this: This job is my dream job."

No more throwing darts

After I started using AI, my job search still took a bit of time, maybe another five months or so. But I went from feeling like I was just throwing darts to where it felt much more targeted and precise.
And I'd gone from essentially getting no response to finding the right opportunities and having conversations; from there, it was a matter of finding the right fit. That can take time.
We're in a moment when people and companies are about to be left behind, and I want to help that not be the case. The opportunity to go to a company that really gets it, is going after this full force, and wants to rewire with AI sounds like the hardest role of my career, but also the most fun and the most relevant thing I could be doing for this moment.
So, not only did I stumble into the job, but I stumbled into my dream job.
