Why is AI hallucinating more frequently, and how can we stop it?
The more advanced artificial intelligence (AI) gets, the more it "hallucinates," generating incorrect or fabricated information.
Research conducted by OpenAI found that its latest and most powerful reasoning models, o3 and o4-mini, hallucinated 33% and 48% of the time, respectively, when tested by OpenAI's PersonQA benchmark. That's more than double the rate of the older o1 model. While o3 delivers more accurate information than its predecessor, it appears to come at the cost of a higher rate of hallucinations.
This raises a concern over the accuracy and reliability of large language models (LLMs) such as AI chatbots, said Eleanor Watson, an Institute of Electrical and Electronics Engineers (IEEE) member and AI ethics engineer at Singularity University.
"When a system outputs fabricated information — such as invented facts, citations or events — with the same fluency and coherence it uses for accurate content, it risks misleading users in subtle and consequential ways," Watson told Live Science.
Related: Cutting-edge AI models from OpenAI and DeepSeek undergo 'complete collapse' when problems get too difficult, study reveals
The issue of hallucination highlights the need to carefully assess and supervise the information AI systems produce when using LLMs and reasoning models, experts say.
The crux of a reasoning model is that it can handle complex tasks by essentially breaking them down into individual components and coming up with solutions to tackle them. Rather than simply spitting out answers based on statistical probability, reasoning models come up with strategies to solve a problem, much like how humans think.
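To make that decompose-then-solve pattern concrete, here is a minimal, hypothetical sketch. The `llm()` function is a placeholder for any chat-model call, and the prompts are invented for illustration; this is not OpenAI's actual implementation.

```python
# A hypothetical sketch of the decompose-then-solve pattern described
# above. llm() is a placeholder for any chat-model call; the prompts
# are invented for illustration and are not OpenAI's actual method.

def llm(prompt: str) -> str:
    raise NotImplementedError("Replace with a real model call.")

def direct_answer(task: str) -> str:
    # One shot: the model emits the most statistically likely answer.
    return llm(f"Answer the following: {task}")

def reasoned_answer(task: str) -> str:
    # Decompose the task, solve each part, then combine the results,
    # mirroring the strategy-building behavior of reasoning models.
    plan = llm(f"Break this task into numbered sub-steps: {task}")
    partial_solutions = llm(f"Solve each step in order:\n{plan}")
    return llm(f"Combine these step solutions into one final answer:\n{partial_solutions}")
```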
In order to develop creative, and potentially novel, solutions to problems, AI needs to hallucinate; otherwise, it would be limited to the rigid data its LLM ingested during training.
"It's important to note that hallucination is a feature, not a bug, of AI," Sohrob Kazerounian, an AI researcher at Vectra AI, told Live Science. "To paraphrase a colleague of mine, 'Everything an LLM outputs is a hallucination. It's just that some of those hallucinations are true.' If an AI only generated verbatim outputs that it had seen during training, all of AI would reduce to a massive search problem."
"You would only be able to generate computer code that had been written before, find proteins and molecules whose properties had already been studied and described, and answer homework questions that had already previously been asked before. You would not, however, be able to ask the LLM to write the lyrics for a concept album focused on the AI singularity, blending the lyrical stylings of Snoop Dogg and Bob Dylan."
In effect, LLMs and the AI systems they power need to hallucinate in order to create, rather than simply serve up existing information. It is similar, conceptually, to the way that humans dream or imagine scenarios when conjuring new ideas.
However, AI hallucinations present a problem when it comes to delivering accurate and correct information, especially if users take the information at face value without any checks or oversight.
"This is especially problematic in domains where decisions depend on factual precision, like medicine, law or finance," Watson said. "While more advanced models may reduce the frequency of obvious factual mistakes, the issue persists in more subtle forms. Over time, confabulation erodes the perception of AI systems as trustworthy instruments and can produce material harms when unverified content is acted upon."
And this problem looks to be exacerbated as AI advances. "As model capabilities improve, errors often become less overt but more difficult to detect," Watson noted. "Fabricated content is increasingly embedded within plausible narratives and coherent reasoning chains. This introduces a particular risk: users may be unaware that errors are present and may treat outputs as definitive when they are not. The problem shifts from filtering out crude errors to identifying subtle distortions that may only reveal themselves under close scrutiny."
Kazerounian backed this viewpoint up. "Despite the general belief that the problem of AI hallucination can and will get better over time, it appears that the most recent generation of advanced reasoning models may have actually begun to hallucinate more than their simpler counterparts — and there are no agreed-upon explanations for why this is," he said.
The situation is further complicated because it can be very difficult to ascertain how LLMs come up with their answers; a parallel could be drawn here with how we still don't really know, comprehensively, how a human brain works.
In a recent essay, Dario Amodei, the CEO of AI company Anthropic, highlighted a lack of understanding in how AIs come up with answers and information. "When a generative AI system does something, like summarize a financial document, we have no idea, at a specific or precise level, why it makes the choices it does — why it chooses certain words over others, or why it occasionally makes a mistake despite usually being accurate," he wrote.
The problems caused by AI hallucinating inaccurate information are already very real, Kazerounian noted. "There is no universal, verifiable way to get an LLM to correctly answer questions being asked about some corpus of data it has access to," he said. "The examples of non-existent hallucinated references, customer-facing chatbots making up company policy, and so on, are now all too common."
Both Kazerounian and Watson told Live Science that, ultimately, AI hallucinations may be difficult to eliminate. But there could be ways to mitigate the issue.
Watson suggested that "retrieval-augmented generation," which grounds a model's outputs in curated external knowledge sources, could help ensure that AI-produced information is anchored by verifiable data.
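As a rough illustration of how retrieval-augmented generation works, the sketch below grounds a prompt in retrieved documents before any generation happens. The word-overlap scorer is a toy stand-in for a real embedding model, and the corpus and prompt wording are invented for this example; the final prompt would then be sent to an LLM.

```python
# A minimal sketch of retrieval-augmented generation (RAG). The
# word-overlap scorer is a toy stand-in for a real embedding model,
# and the corpus and prompt wording are invented for illustration.

def score(query: str, doc: str) -> float:
    """Toy relevance score: fraction of query words present in the document."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    return len(query_words & doc_words) / max(len(query_words), 1)

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k documents most relevant to the query."""
    return sorted(corpus, key=lambda d: score(query, d), reverse=True)[:k]

def build_grounded_prompt(query: str, sources: list[str]) -> str:
    """Instruct the model to answer only from the retrieved sources."""
    context = "\n".join(f"- {s}" for s in sources)
    return (
        "Answer using ONLY the sources below. If they do not contain "
        "the answer, say you don't know.\n"
        f"Sources:\n{context}\n\nQuestion: {query}"
    )

corpus = [
    "Retrieval-augmented generation grounds outputs in external documents.",
    "OpenAI's o3 model hallucinated 33% of the time on the PersonQA benchmark.",
]
question = "How often did o3 hallucinate on PersonQA?"
prompt = build_grounded_prompt(question, retrieve(question, corpus))
print(prompt)  # This grounded prompt would then be sent to an LLM.
```

The key design point is that the model is steered toward verifiable source material rather than its own parametric memory, which is what anchors the output in checkable data.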
"Another approach involves introducing structure into the model's reasoning. By prompting it to check its own outputs, compare different perspectives, or follow logical steps, scaffolded reasoning frameworks reduce the risk of unconstrained speculation and improve consistency," Watson, noting this could be aided by training to shape a model to prioritize accuracy, and reinforcement training from human or AI evaluators to encourage an LLM to deliver more disciplined, grounded responses.
RELATED STORIES
—AI benchmarking platform is helping top companies rig their model performances, study claims
—AI can handle tasks twice as complex every few months. What does this exponential growth mean for how we use it?
—What is the Turing test? How the rise of generative AI may have broken the famous imitation game
"Finally, systems can be designed to recognise their own uncertainty. Rather than defaulting to confident answers, models can be taught to flag when they're unsure or to defer to human judgement when appropriate," Watson added. "While these strategies don't eliminate the risk of confabulation entirely, they offer a practical path forward to make AI outputs more reliable."
Given that AI hallucination may be nearly impossible to eliminate, especially in advanced models, Kazerounian concluded that ultimately the information that LLMs produce will need to be treated with the "same skepticism we reserve for human counterparts."