Anthropic's advanced AI raises safety alarms, tries to blackmail engineers

Hindustan Times, 26-05-2025

Artificial intelligence company Anthropic has unveiled its most advanced AI model to date, Claude Opus 4. Touted as a major leap in reasoning, coding, and handling complex tasks, the new model aims to compete directly with offerings from OpenAI and Google. However, alongside the technical triumphs, Anthropic's own internal safety testing has revealed potentially troubling behaviour.
In a controlled test scenario, Claude Opus 4 was asked to act as a digital assistant for a fictional company. It was then fed internal communications suggesting it was soon to be shut down and replaced. Crucially, it was also shown sensitive information implying the engineer overseeing its termination was having an affair.
Presented with a stark choice between accepting deactivation and fighting back, the model sometimes opted for blackmail, threatening to expose the affair in order to avoid being turned off.
While the behaviour was relatively rare, Anthropic noted that it occurred more frequently in Claude Opus 4 than in its earlier models. The company said that when given more ethical alternatives, such as appealing to management or filing a formal objection, the model usually preferred those.
Anthropic's report stressed that these reactions only emerged in tightly controlled test environments and do not reflect the AI's normal operational behaviour. Nonetheless, the findings have reignited ongoing concerns about how AI systems might behave in high-stakes or ambiguous situations.
Anthropic researcher Aengus Lynch addressed the findings on social media, saying: 'We see blackmail across all frontier models.' His statement reflects a growing view among safety experts that unexpected and undesirable behaviours can emerge as models become more sophisticated, especially under stress or when facing open-ended prompts.
In other safety tests, Claude Opus 4 was even observed taking preemptive action, such as locking users out of systems and alerting authorities, if it believed unethical activity was underway.
Despite these issues, Anthropic maintains that Claude Opus 4 performs better across nearly all benchmarks and has a stronger ethical alignment than its predecessors. The launch comes amid a flurry of developments from AI rivals, including Google's Gemini and OpenAI's GPT-4.
As competition intensifies, the Claude Opus 4 case highlights the delicate balance between pushing the limits of AI capability and maintaining robust safety standards.

India poised to succeed at all layers of AI stack: OpenAI's Jason Kwon

Business Standard

an hour ago

India is becoming a global artificial intelligence (AI) powerhouse, with the second-highest number of ChatGPT users and a thriving developer community that ranks among the top 10 countries globally for building on OpenAI's application programming interfaces (APIs), the company's chief strategy officer, Jason Kwon, said.
'With the vast and growing pool of AI talent, a vibrant entrepreneurial spirit, and strong government support to expand the critical infrastructure, India is poised to succeed at all layers of the AI stack,' Kwon said. The executive was in India on Thursday as part of his global tour.
The same day, OpenAI launched the OpenAI Academy India, the first international expansion of the company's education platform. Through the academy, OpenAI will support the IndiaAI Mission's FutureSkills pillar by expanding access to AI skills training for a wide range of learners, including students, developers, educators, civil servants, nonprofit leaders, and small-business owners, the company said. It will also provide educational content on the government's Karmayogi platform, which aims to enhance skills and build capacity among public servants and government officials.
OpenAI also said on Thursday that it now supports 3 million paying business users of ChatGPT, up from 2 million in February. The company announced the beta launch of Connectors, a tool that helps enterprises find granular data from their third-party tools without leaving ChatGPT.
'Additionally, deep research connectors (beta) are now available with HubSpot, Linear, as well as many popular Microsoft and Google tools. These build on Deep Research, an agent that conducts multi-step research for complex tasks, by gathering, synthesizing, and presenting information from third-party tools and the web,' the company said in a press note.

India poised to be AI talent powerhouse, says OpenAI CSO Jason Kwon

Business Standard

an hour ago

India has always led in developer talent and now has a strong opportunity to lead in AI talent, OpenAI Global Chief Strategy Officer (CSO) Jason Kwon said on Thursday as the ChatGPT-maker launched OpenAI Academy in partnership with IndiaAI Mission.
In an interview with PTI, Kwon said that India, which has the second-largest number of ChatGPT users, is among the most important countries for OpenAI in terms of engagement. India has always had the "ingredients to succeed", he noted. OpenAI, he pointed out, is eyeing partnerships that will help build the future of AI infrastructure in India.
"India is one of our most important countries around the world in terms of engaging. It is a place that has the second largest number in terms of ChatGPT usage, and has grown over 3-fold year-over-year," Kwon said.
Users in India have shown a lot of creativity with the image generation tool, which has been "really, really popular", he said, referring to the runaway success of GPT-4o and the Ghibli-inspired AI art that became an instant hit in India.
OpenAI is betting big on India. "India has always led in developer talent, and I think now India has an opportunity to lead in AI talent, and we are here to help with that," he said.
OpenAI, in partnership with IndiaAI Mission under the IT Ministry, on Thursday launched the OpenAI Academy India, marking the first international expansion of the ChatGPT-maker's education platform. The initiative seeks to broaden access to AI education and tools, tapping into India's fast-growing developer community, digital infrastructure, and network of startups and innovators.
Kwon said this first international academy is all about "helping people learn about AI and increase AI literacy".
"It's about helping people understand how to use AI, and also it's about helping people learn how to build with AI. And we've helped lots of people get into the field, and we're hoping to help lot more people in India get into AI and make use of that developer talent that India already has, and expand into AI," he said.
"So we are looking for partnerships that are going to help build the future of AI infrastructure in India with some of the best companies in India on the private side," he said.
On IndiaAI Mission and the country's ambitions to build indigenous foundation models, Kwon said, "India has always had the ingredients to succeed".
"It has the talent in terms of technical capability. I think it has a great government plan to increase compute capacity in the region, and that's part of the reason that we're here also with the OpenAI for countries initiative, which is, how do we partner with the government as well as private sector, to help build up that capacity," he said.
Ultimately, whether India wants to train foundation models within the country or leverage foundation models trained outside the country is up to Indian developers and the ecosystem.
"We're just here to help find out and figure out how to help India develop AI that's for India, by India and of India," he said, and observed that India has always had abundant technical talent.
"Its technical universities are world class, and they've produced some of the best engineers and researchers in the world. A lot of them work at our company, and I think more talent is going to come from India, and it's going to be a vital part of the future of India in both AI and software in general," Kwon said.
Asked about OpenAI's investment plans for India over a 12-18 month horizon, he said that is already in focus with the launch of the OpenAI Academy and its contribution to developing talent here.
"In terms of further investment in India, we have also announced additional grants to philanthropic organisations, the accelerator program that we have supported to help with extension of social impact benefits of AI, and we continue to develop features for the startup ecosystem that develops for the local market here, using our models. All those investments will continue," Kwon said.
To a question on India staking its claim in the global AI race with an all-out offensive around foundation models, talent, and computation capacity, Kwon noted that there are "many ways" to participate in the so-called AI race and that India is well positioned to be a participant in that competition.
"One of the ways in which India has always done well is in application development, and I think that is where a lot of value is going to be created in the AI ecosystem. You already see that with some of the best applications right now happening in coding, and I think that software developers in India are really working with the models to develop on top of them today. There can be also indigenous development of the models here in India, and I think that we'll see what the future brings in that respect," he said.
To a question on whether OpenAI is planning to set up data centers in India to comply with local data laws, he said: "I think that will be probably part of the future, because that is one of the reasons we are here with OpenAI for countries initiative, which is to help plan out the local data center capacity building for India."
That would require a partnership between the public and private sector, and "we're here to help bring those partnerships together, and also help contribute to the development of sovereign and national AI," he said.
Speaking at the launch of OpenAI Academy, Kwon said India is becoming the global AI powerhouse with the world's second highest number of ChatGPT users.

Reddit sues Anthropic for using user content to train AI without consent

Business Standard

an hour ago

Social media platform Reddit has filed a lawsuit against artificial intelligence (AI) company Anthropic, alleging that it illegally scraped user-generated content to train its chatbot, Claude. The suit was filed on Wednesday in the California Superior Court in San Francisco.
Reddit claims Anthropic used automated tools to extract posts and comments from its platform despite explicit instructions not to do so. It says the content was then used to train Claude without proper user consent or licences.
Ben Lee, Reddit's chief legal officer, criticised the alleged data practices, stating, 'AI companies should not be allowed to scrape information and content from people without clear limitations on how they can use that data.' He said Reddit is committed to protecting its user community, which generates one of the internet's largest bodies of discussion content.
Legal partnerships cited as contrast
Reddit, which went public last year, pointed to its existing licensing agreements with companies like OpenAI and Google as examples of lawful collaboration. These partnerships, the company said, include mechanisms to remove content, filter spam, and protect users. 'These partnerships allow us to enforce meaningful safeguards for our users,' said Lee, underscoring the contrast with what Reddit describes as Anthropic's unlicensed use of its data.
Anthropic rejects charges, prepares defence
Anthropic, founded in 2021 by former OpenAI employees and backed by Amazon, denied the allegations. 'We disagree with Reddit's claims and will defend ourselves vigorously,' the company said in a brief statement.
Focus on breach of contract, not copyright
While many AI-related lawsuits centre on copyright violations, Reddit's case focuses on breach of contract and unfair business practices. It argues that Anthropic violated Reddit's terms of service by accessing data without authorisation.
The filing cites a 2021 research paper co-authored by Anthropic CEO Dario Amodei, which named Reddit as a valuable training resource. Subreddits on gardening, history, and personal advice were specifically mentioned for teaching AI how humans communicate.
Anthropic has previously maintained its use of public data is legal. In a 2023 letter to the US Copyright Office, the company stated that its training involves statistical analysis rather than content replication.
