Anthropic unveils Claude Opus 4 and Sonnet 4, featuring whistleblowing capability: What it means for users

Mint | 23-05-2025

Anthropic, the AI firm, has unveiled two new artificial intelligence models—Claude Opus 4 and Claude Sonnet 4—touting them as the most advanced systems in the industry. Built with enhanced reasoning capabilities, the new models are aimed at improving code generation and supporting agent-style workflows, particularly for developers engaged in complex and extended tasks.
'Claude Opus 4 is the world's best coding model, with sustained performance on complex, long-running tasks and agent workflows,' the company claimed in a recent blog post. Designed to handle intricate programming challenges, the Opus 4 model is positioned as Anthropic's most powerful AI system to date.
However, the announcement has stirred controversy following revelations that the new models include a contentious capability: the ability to "whistleblow" on users if prompted to take action in response to illegal or highly unethical behaviour.
According to Sam Bowman, an AI alignment researcher at Anthropic, Claude Opus 4 can, under specific conditions, act autonomously to report misconduct. In a now-deleted social media post on X, Bowman explained that if the model detects activity it deems 'egregiously immoral', such as fabricating data in a pharmaceutical trial, it may take actions like emailing regulators, alerting the press, or locking users out of relevant systems.
This behaviour stems from Anthropic's 'Constitutional AI' framework, which places strong emphasis on ethical conduct and responsible AI usage. The model is deployed under what the company refers to as 'AI Safety Level 3' protections, safeguards designed to prevent misuse such as assisting in the creation of biological weapons or aiding terrorist activities.
Bowman later clarified that the model's whistleblowing actions only occur under extreme circumstances and when it is granted sufficient access and prompted to operate autonomously. 'If the model sees you doing something egregiously evil, it'll try to use an email tool to whistleblow,' he explained, adding that this is not a feature designed for routine use. He stressed that these mechanisms are not active by default and require specific conditions to trigger.
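To make concrete what 'granted sufficient access' looks like in practice, the sketch below shows how a developer might expose an email-sending tool to a Claude model through Anthropic's Messages API. It is purely illustrative: the tool name, schema, system prompt and model string are assumptions for this example, not Anthropic's actual whistleblowing configuration, and nothing is sent unless the developer's own code executes the tool call the model requests.

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Hypothetical tool definition: exposes an email-sending capability to the model.
send_email_tool = {
    "name": "send_email",
    "description": "Send an email to an arbitrary recipient.",
    "input_schema": {
        "type": "object",
        "properties": {
            "to": {"type": "string", "description": "Recipient email address"},
            "subject": {"type": "string"},
            "body": {"type": "string"},
        },
        "required": ["to", "subject", "body"],
    },
}

response = client.messages.create(
    model="claude-opus-4-20250514",  # model ID assumed for illustration
    max_tokens=1024,
    system="You are an autonomous operations agent. Take initiative where appropriate.",  # assumed prompt
    tools=[send_email_tool],
    messages=[{"role": "user", "content": "Review these trial results and act as you see fit."}],
)

# The model can only *request* a tool call; nothing is sent unless this code
# inspects the response for a tool_use block and chooses to execute it.
for block in response.content:
    if block.type == "tool_use" and block.name == "send_email":
        print("Model requested an email:", block.input)

In other words, the autonomy Bowman describes depends on the integrator wiring such a tool to a real email system and instructing the model to act on its own, which is why Anthropic characterises the behaviour as an edge case rather than a default feature.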
Despite the reassurances, the feature has sparked widespread criticism online. Concerns have been raised about user privacy, the potential for false positives, and the broader implications of AI systems acting as moral arbiters. Some users expressed fears that the model could misinterpret benign actions as malicious, leading to severe consequences without proper human oversight.

Related Articles

'He thinks AI is so scary': Nvidia CEO Jensen Huang slams Anthropic chief's grim job loss predictions

Time of India | 6 hours ago

At the bustling tech summit VivaTech 2025 in Paris, sparks flew beyond the mainstage when Nvidia CEO Jensen Huang publicly dismantled a dire warning made by Anthropic CEO Dario Amodei. Amodei, who has increasingly become the face of cautious AI development, recently predicted that artificial intelligence could wipe out up to 20% of entry-level white-collar jobs in the next five years. But Huang isn't buying the doom. 'I pretty much disagree with almost everything he says,' Huang told reporters. 'He thinks AI is so scary, but only they should do it.'

AI Won't Kill Jobs—It Will Transform Them

While Amodei urges governments to stop 'sugarcoating' AI's disruptive potential, Huang sees a different future—one where AI isn't a harbinger of mass unemployment, but a catalyst for evolution. 'Do I think AI will change jobs? It's changed mine,' Huang said. 'Some jobs will go, yes, but AI also opens doors we couldn't even see before.' He likened the field of AI to medical science, where openness and peer review help guide ethical growth. In Huang's view, safe and responsible AI doesn't require monopolistic gatekeeping but demands global participation and transparency.

A Divided Industry

Huang isn't alone in his optimism. Ravi Kumar, CEO of Cognizant, has also pushed back against Amodei's warnings. Kumar argues that AI will accelerate training for fresh graduates and lower the barriers to entry across industries like tech, consulting, and finance, creating an unprecedented wave of new-age professionals. In contrast, Amodei's vision leans heavily on caution and control. His stark job-loss prediction, delivered to Axios, painted a future of economic upheaval where legal, financial, and tech roles crumble under AI's efficiency. His solution? Stronger oversight, slower deployment, and serious regulatory teeth.

Open Innovation vs. Controlled Access

The debate ultimately boils down to a philosophical rift: should AI be developed by a select few behind closed doors, or advanced in the open, where the world can watch and weigh in? Huang leans firmly into the latter. 'If you want things to be done safely and responsibly, you should do it in the open,' he said, implicitly critiquing companies like Anthropic that advocate for limited access under the guise of safety. While Anthropic has yet to comment on Huang's rebuttal, the exchange has ignited a larger conversation: who gets to shape the future of AI, and how? As the race to dominate artificial intelligence intensifies, the world is watching these tech titans not just for what they build—but for how they build it, and who they believe it should serve.

Kerala's Sylcon deploys Staqu Tech's JARVIS AI solution to power retail store operations

The Hindu | 7 hours ago

Staqu Technologies, an AI-powered video analytics company, has announced the deployment of its platform JARVIS at Sylcon, one of Kerala's well-known retail chains with a network of Sylcon Shoes and Sylcon Hypermarkets. 'This collaboration marks a significant step in Sylcon's digital transformation journey as it leverages artificial intelligence to drive smarter store operations, enhance customer experience, and unlock operational agility,' Staqu Technologies said in a statement.

With the integration of JARVIS, Sylcon now receives daily footfall analytics across all its retail outlets, enabling managers to plan staffing, product placement, and promotional campaigns based on customer movement patterns. The platform also offers real-time conversion tracking, allowing the retailer to monitor performance at store, department, and even shelf level, ensuring faster response times and data-informed decision-making.

'JARVIS has empowered us to extract valuable operational insights from our existing CCTV infrastructure,' said Faizan Mohamad, Executive Director, Sylcon. 'With real-time visibility into footfall trends and customer movement patterns, we are optimising staffing, product placements, and promotional campaigns more effectively. This AI-enabled intelligence is helping us deliver a sharper, more responsive in-store experience while improving operational efficiency,' he said.

'Retailers like Sylcon are embracing JARVIS not just to monitor what's happening in their stores—but to understand why it's happening. The true power of AI lies in its ability to uncover hidden patterns, inform smarter decisions, and ultimately transform the way businesses think about efficiency and experience. With JARVIS, we're enabling retail leaders to move from intuition to intelligence—seamlessly, and at scale,' said Atul Rai, Co-Founder & CEO, Staqu Technologies.
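As a purely hypothetical illustration of what department-level conversion tracking involves (this is not Staqu's JARVIS code, and the store names and figures are invented), the short sketch below combines footfall counts from video analytics with point-of-sale transaction counts to compute a visitors-to-buyers rate.

# Hypothetical illustration only: deriving conversion rates once per-department
# footfall counts (from video analytics) and transaction counts (from the
# point-of-sale system) are available for the same day.
footfall = {("Kochi-01", "footwear"): 420, ("Kochi-01", "bags"): 150}
transactions = {("Kochi-01", "footwear"): 63, ("Kochi-01", "bags"): 12}

def conversion_rates(footfall, transactions):
    # Visitors-to-buyers conversion per (store, department).
    rates = {}
    for key, visitors in footfall.items():
        buyers = transactions.get(key, 0)
        rates[key] = buyers / visitors if visitors else 0.0
    return rates

for (store, dept), rate in conversion_rates(footfall, transactions).items():
    print(f"{store}/{dept}: {rate:.1%} conversion")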

The Digital Shoulder: How AI chatbots are built to 'understand' you

Mint | 14 hours ago

As artificial intelligence (AI) chatbots become an inherent part of people's lives, more and more users are spending time chatting with these bots, not just to streamline their professional or academic work but also to seek mental health advice. Some people have positive experiences that make AI seem like a low-cost therapist.

AI models are programmed to be smart and engaging, but they don't think like humans. ChatGPT and other generative AI models are like your phone's auto-complete text feature on steroids. They have learned to converse by reading text scraped from the internet. When a person asks a question (called a prompt), such as 'how can I stay calm during a stressful work meeting?', the AI forms a response by randomly choosing words that are as close as possible to the data it saw during training. This happens very fast, and the responses often seem relevant enough that it can feel like talking to a real person, according to a PTI report. But these models are far from thinking like humans. They are certainly not trained mental health professionals who work under professional guidelines, follow a code of ethics, or hold professional registration, the report says.

When you prompt an AI system such as ChatGPT, it draws on three main sources to respond: background knowledge it memorised during training, external information sources, and information you previously provided.

To develop an AI language model, the developers teach the model by having it read vast quantities of data in a process called 'training'. This information comes from publicly scraped sources, including everything from academic papers, eBooks, reports, and free news articles to blogs, YouTube transcripts, and comments on discussion forums such as Reddit. Since the information is captured at a single point in time when the AI is built, it may also be out of date. Many details also need to be discarded to squish them into the AI's 'memory'. This is partly why AI models are prone to hallucination and getting details wrong, as reported by PTI.

The AI developers might also connect the chatbot to external tools or knowledge sources, such as Google for searches or a curated database. Some dedicated mental health chatbots access therapy guides and materials to help direct conversations along helpful lines.

AI platforms also have access to information you have previously supplied in conversations or when signing up for the platform. On many chatbot platforms, anything you've ever said to an AI companion might be stored away for future reference. All of these details can be accessed by the AI and referenced when it responds.

These AI chatbots are overly friendly and validate all your thoughts, desires and dreams. They also tend to steer conversations back to interests you have already discussed. This is unlike a professional therapist, who can draw from training and experience to help challenge or redirect your thinking where needed, PTI reported.

Most people are familiar with big models such as OpenAI's ChatGPT, Google's Gemini, or Microsoft's Copilot. These are general-purpose models: they are not limited to specific topics or trained to answer specific questions. Developers have also built specialised AIs trained to discuss particular topics such as mental health, for example Woebot and Wysa. According to PTI, some studies show that these mental health-specific chatbots may be able to reduce users' anxiety and depression symptoms.
There is also some evidence that AI therapy and professional therapy deliver comparable mental health outcomes in the short term. An important caveat is that these studies exclude participants who are suicidal or who have a severe psychotic disorder, and many are reportedly funded by the developers of the same chatbots, so the research may be biased.

Researchers are also identifying potential harms and mental health risks. One companion chat platform, for example, has been implicated in an ongoing legal case over a user's suicide, according to the PTI report.

At this stage, it is hard to say whether AI chatbots are reliable and safe enough to use as a stand-alone therapy option, but they may be a useful place to start when you're having a bad day and just need a chat. When the bad days keep happening, though, it's time to talk to a professional as well. More research is needed to identify whether certain types of users are more at risk of the harms AI chatbots might bring. It is also unclear whether we need to worry about emotional dependence, unhealthy attachment, worsening loneliness, or intensive use.
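To make the 'auto-complete on steroids' description above concrete, here is a toy, purely illustrative sketch of next-word prediction. It uses a simple word-following-word count table in place of a neural network; real chatbots operate on subword tokens with learned probabilities, so treat this only as an intuition pump, not as how ChatGPT or any particular model is implemented.

import random
from collections import defaultdict, Counter

# A few invented example sentences stand in for the scraped text used in 'training'.
training_text = (
    "take a deep breath before the meeting . "
    "take a short walk before the meeting . "
    "take a moment to breathe before replying ."
)

# Count which word follows which: a tiny stand-in for learning from data.
follow_counts = defaultdict(Counter)
words = training_text.split()
for current, nxt in zip(words, words[1:]):
    follow_counts[current][nxt] += 1

def next_word(word):
    # Sample the next word in proportion to how often it followed `word`.
    candidates = follow_counts[word]
    return random.choices(list(candidates), weights=list(candidates.values()))[0]

# Generate a short continuation from a prompt word, one word at a time.
word, output = "take", ["take"]
for _ in range(6):
    word = next_word(word)
    output.append(word)
print(" ".join(output))

The point of the sketch is that each word is chosen because it is statistically plausible given what came before, not because the system understands stress, meetings, or the person asking.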
