AI models may hallucinate less than humans in factual tasks, says Anthropic CEO: Report

Time of India, 6 days ago

At two prominent tech events, VivaTech 2025 in Paris and Anthropic's Code With Claude developer day, Anthropic chief executive officer Dario Amodei made a provocative claim: artificial intelligence models may now hallucinate less frequently than humans in well-defined factual scenarios.
Speaking at both events, Amodei said recent internal tests showed that the company's latest Claude 3.5 model had outperformed humans on structured factual quizzes. This challenges a long-held criticism of generative AI, which is that models often 'hallucinate' or generate incorrect information with undue confidence.
'If you define hallucination as confidently saying something that's wrong, humans do that a lot,' Amodei said at VivaTech. He added that Claude models had consistently provided more accurate answers than human participants in verifiable question formats.
At Code With Claude, where the company also launched its new Claude Opus 4 and Claude Sonnet 4 models, Amodei reiterated his view. According to a TechCrunch report, he told attendees, 'It really depends on how you measure it, but I suspect that AI models probably hallucinate less than humans, but they hallucinate in more surprising ways.'
The new Claude 4 series represents a step forward in Anthropic's pursuit of artificial general intelligence (AGI). The company said the upgrades include improved long-term memory, better code generation, enhanced tool use, and stronger writing capabilities. Claude Sonnet 4 achieved a 72.7% score on the SWE-Bench benchmark, which evaluates AI coding agents on their ability to solve real-world software engineering problems, setting a new performance record for AI systems in this domain.
Despite these gains, Amodei acknowledged that hallucinations have not been eliminated. He highlighted the importance of prompt phrasing and use-case design, especially in high-risk domains such as legal or healthcare applications.
The remarks follow a recent courtroom episode in which Anthropic's Claude chatbot generated a false citation in a legal filing involving music publishers. The company's legal team later issued an apology, reinforcing the need for improved accuracy in sensitive settings.
Amodei also called for the development of standardised metrics across the industry to evaluate hallucination rates. 'You can't fix what you don't measure precisely,' he said.

Related Articles

Biopolymer innovation promotes sustainability

Time of India, 21 minutes ago

MUMBAI: Bioyug On Wheels, India's first mobile awareness initiative on polylactic acid (PLA) biopolymers, was officially unveiled by Chief Minister Devendra Fadnavis in Mumbai recently. This pioneering effort marks a significant step toward mainstreaming sustainable alternatives to single-use plastics through direct consumer engagement and education.
To promote PLA biopolymers as an eco-friendly alternative to single-use plastics, Bioyug On Wheels, a first-of-its-kind mobile initiative, is set to hit the streets across India. The campaign features a specially designed van that showcases everyday products made from PLA, a sustainable, compostable biopolymer. By simplifying the science behind PLA and bringing it directly to communities, the initiative aims to build consumer awareness and encourage informed, responsible choices. This grassroots effort aligns with the 2025 World Environment Day theme, which calls for sustainable solutions to plastic pollution.
Launched by Balrampur Bioyug, India's first industrial PLA manufacturing brand, the Bioyug On Wheels bus is set to travel across India, making scheduled stops at key locations to engage directly with communities. At each stop, visitors will have the opportunity to explore how everyday products can be made from PLA and understand how a single lifestyle choice can support India's green transition. By offering a tangible alternative to conventional single-use plastics, the initiative aims to reduce dependency on fossil fuel-based materials and inspire more sustainable consumer behaviour.

AI-driven quality control systems and mfg solutions focus of CICU workshop

Time of India, 21 minutes ago

Ludhiana: It's no secret that AI is set to revolutionise manufacturing, like every other sphere of life and work. At a workshop organised at the Chamber of Industrial & Commercial Undertakings (CICU) recently, the transformative role of AI in modern manufacturing was discussed at length.
The focus was on how AI is revolutionising industrial practices by enhancing operational efficiency, enabling predictive maintenance, and driving automation, thereby improving overall productivity and significantly reducing costs. Participants discussed the application of machine learning in production processes and real-time data analysis, as well as deep learning, with an emphasis on AI-driven quality control systems and intelligent manufacturing solutions. They also explored the role of generative AI, with a focus on natural language processing and the capabilities of various foundation models.
Another key area of discussion was prompt engineering and a comparative analysis between OpenAI tools and proprietary AI solutions. The workshop also addressed the importance of security and compliance frameworks for industries and suggested proprietary models to tackle sector-specific challenges. In addition, the integration of AWS tools into routine industrial operations was demonstrated, showcasing how cloud-based solutions are enhancing efficiency across different verticals.
The event saw active participation and enthusiasm from all attendees, underlining the growing interest in adopting AI-driven technologies in industry. Ashwani Muraal, founder & CEO of Kare Technologies; Priyanka Goswami, senior product specialist (technical support) at PTC; and Abhay Gupta, business strategist at Ingram Micro India, shared their insights and expertise. Upkar Singh Ahuja, president of CICU, presented a token of gratitude to the faculty.

Snowflake reveals next-gen AI and data tools at annual summit to empower enterprises

Indian Express, 31 minutes ago

AI data cloud company Snowflake, which is hosting its annual four-day Snowflake Summit in San Francisco, has made a slew of announcements related to product innovations. The latest announcements range from data engineering, compute performance, and analytics to agentic AI capabilities. With these, the company aims to break down data silos and bridge the gap between enterprise data and business action, all without compromising control, simplicity, or governance.
The new announcements include Snowflake OpenFlow, Snowflake Standard Warehouse – Generation 2, Snowflake Adaptive Compute, Snowflake Intelligence, Snowflake Cortex AISQL, and Cortex Knowledge Extensions.
Talking about the announcements, Vijayant Rai, managing director – India, Snowflake, said, 'With our latest announcements, we're showcasing how Snowflake is fundamentally redefining what organisations can expect from a modern data platform.' Rai added that these innovations are focused on helping businesses make AI and machine learning workflows easier, connected, and trusted for users of all abilities by democratising access to data and eliminating the technical overhead that slows down business decision-making.
Snowflake OpenFlow is a multi-modal data ingestion service that allows users to connect to virtually any data source and derive value from any data infrastructure. It is now generally available on AWS, and it eliminates fragmented data stacks and manual labour by combining various types of data and formats, allowing customers to rapidly deploy AI-powered innovations.
With Snowflake Standard Warehouse – Generation 2 and Snowflake Adaptive Compute, the AI data company has introduced the next phase of compute innovations, with a focus on delivering faster performance, enhanced usability, and stronger price-performance value.
The Standard Warehouse – Generation 2 is now generally available and is essentially an enhanced version of Snowflake's virtual Standard Warehouse, with next-gen hardware and some additional enhancements to offer significantly faster analytics performance.
Snowflake Intelligence (public preview soon), meanwhile, allows technical and non-technical users to ask questions in natural language and instantly gain actionable insights from both structured and unstructured tables or documents. The company has said that Snowflake Intelligence is powered by state-of-the-art language models from the likes of OpenAI and Anthropic. It is backed by Cortex Agents under the hood and is delivered through an intuitive, no-code interface that helps provide transparency and explainability.
The company also introduced Data Science Agent (private preview soon), an agentic companion that enhances the productivity of data scientists by automating routine ML model development tasks. The Data Science Agent uses Anthropic's Claude to break down problems related to ML workflows into easy steps like data analysis, data preparation, feature engineering, and training.
Other highlights are Cortex AISQL and SnowConvert AI, which expand Snowflake Cortex AI, the company's suite of enterprise-grade AI capabilities. These could enable global organisations to modernise their data analytics to align with the present-day AI landscape. SnowConvert AI is an agentic automation solution that accelerates migration from legacy platforms to Snowflake. Meanwhile, Cortex AISQL brings generative AI directly into customers' query engines, allowing teams to extract insights across multi-modal data and build flexible AI pipelines using SQL.
Among the slew of agentic products on Snowflake Marketplace are the Cortex Knowledge Extensions, which allow enterprises to augment their AI apps and agents with proprietary unstructured data from third-party providers.
