
Latest news with #AIhallucination

AI hallucinations? What could go wrong?

Japan Times

28-05-2025



Oops. Gotta revise my summer reading list. Those exciting offerings plucked from a special section of The Chicago Sun-Times newspaper and reported last week don't exist. The freelancer who created the list used generative artificial intelligence for help, and several of the books and many of the quotes that gushed about them were made up by the AI.

These are the most recent and high-profile AI hallucinations to make it into the news. We expect growing pains as new technology matures but, oddly and perhaps inexplicably, the problem appears to be getting worse with AI. The notion that we can't ensure that AI will produce accurate information is, uh, 'disturbing' if we intend to integrate the technology so deeply into our daily lives that we can't live without it. The truth might not set you free, but it seems like a prerequisite for getting through the day.

An AI hallucination is a phenomenon in which a large language model (LLM), such as a generative AI chatbot, finds patterns or objects that simply don't exist and responds to queries with nonsensical or inaccurate answers. There are many explanations for these hallucinations — bad data, bad algorithms, training biases — but no one knows what produces a specific response. Given the spread of AI from search tools to the ever more prominent role it plays in ordinary tasks (checking grammar, or doing the intellectual grunt work in some professions), that's not only troubling but dangerous. AI is being used in medical tests, legal writing and industrial maintenance, and failure in any of those applications could have nasty consequences.

We'd like to believe that eliminating such mistakes is part of the development of new technologies. When they examined the persistence of this problem, tech reporters from The New York Times noted that researchers and developers were saying several years ago that AI hallucinations would soon be solved. Instead, they're appearing more often and people are failing to catch them. Tweaking models helped reduce hallucinations. But AI is now using 'new reasoning systems,' which means that it ponders questions for microseconds (or maybe seconds, for hard questions) longer, and that seems to be creating more mistakes. In one test, hallucination rates for newer AI models reached 79%. While that is extreme, most systems hallucinated in double-digit percentages. More worryingly, because the systems are using so much data, there is little hope that human researchers can figure out what is going on and why.

The NYT cited Amr Awadallah, chief executive of Vectara, a startup that builds AI tools for businesses, who warned that despite developers' best efforts, 'they will always hallucinate.' He concluded, 'That will never go away.' That was also the conclusion of a team of Chinese researchers, who noted that 'hallucination represents an inherent trait of the GPT model' and that 'completely eradicating hallucinations without compromising its high-quality performance is nearly impossible.' I wonder about the 'high quality' of that performance when the results are so unreliable.

Writing in the Harvard Business Review last year, professors Ian McCarthy, Timothy Hannigan and Andre Spicer warned of the 'epistemic risks of botshit,' the made-up, inaccurate and untruthful chatbot content that humans uncritically use for tasks. It's a quick step from botshit to bullshit.
(I am not cursing for titillation but am instead referring to the linguistic analysis of the philosopher Harry Frankfurt in his best-known work, 'On Bullshit.') John Thornhill beat me to the punch last weekend in his Financial Times column by pointing out the troubling parallel between AI hallucinations and bullshit. Like a bullshitter, a bot doesn't care about the truth of its claims but wants only to convince the user that its answer is correct, regardless of the facts.

Thornhill highlighted the work of Sandra Wachter and two colleagues from the Oxford Internet Institute, who explained in a paper last year that 'LLMs are not designed to tell the truth in any overriding sense ... truthfulness or factuality is only one performance measure among many others, such as helpfulness, harmlessness, technical efficiency, profitability (and) customer adoption.' They warned that a belief that AI tells the truth, when combined with the tendency to attribute superior capabilities to technology, creates 'a new type of epistemic harm.' It isn't the obvious hallucinations we should be worrying about but the 'subtle inaccuracies, oversimplifications or biased responses that are passed off as truth in a confident tone — which can convince experts and nonexperts alike' — that pose the greatest risk. Comparing this output to Frankfurt's concept of bullshit, they label it 'careless speech' and write that it 'causes unique long-term harms to science, education and society, which resists easy quantification, measurement and mitigation.'

While careless speech was the most sobering and subtle AI threat articulated in recent weeks, there were others. A safety test conducted by Anthropic, the developer of the LLM Claude, on its newest AI models revealed 'concerning behavior' in many dimensions. For example, the researchers found the AI 'sometimes attempting to find potentially legitimate justifications for requests with malicious intent.' In other words, the software tried to please users who wanted it to answer questions that would create dangers — such as creating weapons of mass destruction — even though it had been instructed not to do so.

The most amusing — in addition to scary — danger was the tendency of the AI 'to act inappropriately in service of goals related to self-preservation.' In plain speak, the AI blackmailed an engineer who was supposed to take it offline. In this case, the AI was given access to email that said it would be replaced by another version and email that suggested that the engineer was having an extramarital affair. In 84% of cases, the AI said it would reveal the affair if the engineer went ahead with the replacement. (This was a simulation, so no actual affair or blackmail occurred.)

We'll be discovering more flaws and experiencing more frustration as AI matures. I doubt that those problems will slow its adoption, however. Mark Zuckerberg, CEO of Meta, anticipates far deeper integration of the technology into daily life, with people turning to AI for therapy, shopping and even casual conversation. He believes that AI can 'fill the gap' between the number of friendships many people have and the number they want. He's putting his money where his mouth is, having announced at the beginning of the year that Meta would invest as much as $65 billion this year to expand its AI infrastructure. That is a little over 10% of the estimated $500 billion in private investment in AI in the U.S. between 2013 and 2024.
Global spending last year is reckoned to have topped $100 billion.

Also last week, OpenAI CEO Sam Altman announced that he had purchased former Apple designer Jony Ive's company io in a bid to develop AI 'companions' that will remake the digital landscape as the iPhone did when it was first released. They believe that AI requires a new interface and that phones won't do the trick; indeed, the intent, reported The Wall Street Journal, is to wean users from screens. The product will fit inside a pocket and be fully aware of a user's surroundings and life. They plan to ship 100 million of the new devices 'faster than any company has ever shipped before.'

Call me old-fashioned, but I am having a hard time putting these pieces together. A hallucination might be just what I need to resolve my confusion.

Brad Glosserman is deputy director of and visiting professor at the Center for Rule-Making Strategies at Tama University as well as senior adviser (nonresident) at Pacific Forum. His new book on the geopolitics of high tech is expected to come out from Hurst Publishers this fall.

AI hallucinations: a budding sentience or a global embarrassment?

Russia Today

24-05-2025



In a farcical yet telling blunder, multiple major newspapers, including the Chicago Sun-Times and Philadelphia Inquirer, recently published a summer-reading list riddled with nonexistent books that were 'hallucinated' by ChatGPT, with many of them falsely attributed to real authors. The syndicated article, distributed by Hearst's King Features, peddled fabricated titles based on woke themes, exposing both the media's overreliance on cheap AI content and the incurable rot of legacy journalism. That this travesty slipped past editors at moribund outlets (the Sun-Times had just axed 20% of its staff) underscores a darker truth: when desperation and unprofessionalism meet unvetted algorithms, the frayed line between legacy media and nonsense simply vanishes.

The trend seems ominous. AI is now overwhelmed by a smorgasbord of fake news, fake data, fake science and unmitigated mendacity that is churning established logic, facts and common sense into a putrid slush of cognitive rot.

But what exactly is AI hallucination? It occurs when a generative AI model (like ChatGPT, DeepSeek, Gemini or DALL·E) produces false, nonsensical or fabricated information with high confidence. Unlike human errors, these mistakes stem from how AI models generate responses: by predicting plausible patterns rather than synthesizing established facts. There are several reasons why AI generates wholly incorrect information, and none of them have anything to do with the ongoing fearmongering over AI attaining sentience or even acquiring a soul.

  • Training on imperfect data: AI learns from vast datasets replete with biases, errors and inconsistencies. Prolonged training on these materials may result in the generation of myths, outdated facts or conflicting sources.

  • Over-optimization for plausibility: Contrary to what some experts claim, AI is nowhere near attaining 'sentience' and therefore cannot discern 'truth.' GPTs in particular are giant, planet-wide neural encyclopedias that crunch data and synthesize the most salient information based on pre-existing patterns. When gaps exist, they fill them with statistically probable (but likely wrong) answers. This was, however, not the case with the Sun-Times fiasco.

  • Lack of grounding in reality: Unlike humans, AI has no direct experience of the world. It cannot verify facts; it can only mimic language structures. For example, when asked 'What's the safest car in 2025?' it might invent a model that doesn't exist because it is filling the gap with an ideal car with desired features — as determined by the mass of 'experts' — rather than a real one.

  • Prompt ambiguity: Many GPT users are lazy and may not know how to write a proper prompt. Vague or conflicting prompts also increase hallucination risks. Ridiculous requests like 'Summarize a study about cats and gender theory' may result in an AI-fabricated fake study that appears very academic on the surface.

  • Creative generation vs. factual recall: AI models like ChatGPT prioritize fluency over accuracy. When unsure, they improvise rather than admit ignorance. Ever come across a GPT answer along the lines of 'Sorry, this is beyond the remit of my training'?

  • Reinforcing fake news and patterns: GPTs can identify particular users based on logins (a no-brainer), IP addresses, semantic and syntactic peculiarities and personal propensities, and then reinforce them. When someone constantly uses GPTs to peddle fake news or propaganda puff pieces, the AI may recognize such patterns and proceed to generate content that is partially or wholly fictitious. This is a classic case of algorithmic supply and demand. Remember, GPTs not only train on vast datasets; they can also train on your dataset.

  • Reinforcing Big Tech biases and censorship: Virtually every Big Tech firm behind GPT rollouts is also engaged in industrial-scale censorship and algorithmic shadowbanning. This applies to individuals and alternative media platforms alike and constitutes a modern-day, digitally curated damnatio memoriae. Google's search engine, in particular, has a propensity for up-ranking the output of a serial plagiarist rather than the original article. The perpetuation of this systemic fraud may explode into an outright global scandal one day. Imagine waking up one morning to read that your favorite quotes or works were the products of a carefully calibrated campaign of algorithmic shunting at the expense of the original ideators or authors. This is the inevitable consequence of monetizing censorship while outsourcing 'knowledge' to an AI hobbled by ideological parameters.

  • Experiments on human gullibility: I recently raised the hypothetical possibility of AI being trained to study human gullibility, in a way conceptually similar to the Milgram experiment, the Asch conformity experiments and their iteration, the Crutchfield situation. Humans are both gullible and timorous, and the vast majority tend to conform either to the human mob or, in the case of AI, to the 'data mob.'

This will inevitably have real-world consequences, as AI is increasingly embedded in critical, time-sensitive operations – from pilots' cockpits and nuclear plants to biowarfare labs and sprawling chemical facilities. Now imagine making a fateful decision in such high-stakes environments based on flawed AI input. This is precisely why 'future planners' must understand both the percentage and the personality types of qualified professionals who are prone to trusting faulty machine-generated recommendations.

When AI generates an article on one's behalf, any journalist worth his salt should treat it as having been written by another party and therefore subject to fact-checking and improvement. As long as the final product is fact-checked, and substantial value, content and revisions are added to the original draft, I don't see any conflict of interest or breach of ethics in the process. GPTs can act as a catalyst, an editor or a 'devil's advocate' to get the scribal ball rolling.

What happened in this saga was that the writer, Marco Buscaglia, appears to have wholly cut and pasted ChatGPT's opus and passed it off as his own. (Since this embarrassing episode was exposed, his website has gone blank and private.) The overload of woke-themed nonsense generated by ChatGPT should have raised red flags in Buscaglia's mind, but I am guessing that he might be prone to peddling this stuff himself.

However, all the opprobrium currently directed at Buscaglia should also be applied to the editors of King Features Syndicate and the various news outlets that didn't fact-check the content even as they posed as bastions of the truth, the whole truth and nothing but the truth. Various levels of gatekeepers simply failed to do their jobs. This is a collective dereliction of duty from a media that casually pimps its services to the high and mighty while pontificating about ethics, integrity and values to lesser mortals.
I guess we are used to such double standards by now. But here is the terrifying part: I am certain that faulty data and flawed inputs are already flowing from AI systems into trading and financial platforms, aviation controls, nuclear reactors, biowarfare labs and sensitive chemical plants – even as I write this. The gatekeepers just aren't qualified for such complex tasks, except on paper, that is. These are the consequences of a world 'designed by clowns and supervised by monkeys.'

I will end on a note highlighting the irony of ironies: all the affected editors in this saga could have used ChatGPT to subject Buscaglia's article to a factual content check. It would have only taken 30 seconds!
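For what it's worth, that kind of check is trivial to script. The sketch below is a minimal, hypothetical example, not anything the editors actually ran: it assumes the OpenAI Python client with an OPENAI_API_KEY set in the environment, and the model name, file name and prompt wording are placeholders. Because the checker can itself hallucinate, its verdicts would still need to be confirmed against a real library catalogue before publication.

# Hypothetical sketch: ask a model to flag unverifiable books in a syndicated piece.
# Assumes the OpenAI Python client (pip install openai) and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

article_text = open("summer_reading_list.txt").read()  # placeholder file with the syndicated copy

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; any current chat model would do
    messages=[
        {
            "role": "system",
            "content": (
                "You are a fact-checking assistant. List every book title and author "
                "mentioned in the text, say whether you can confirm the book exists, "
                "and flag anything you cannot verify."
            ),
        },
        {"role": "user", "content": article_text},
    ],
)

print(response.choices[0].message.content)  # flagged items still need human confirmation

Even that 30-second pass would have surfaced the invented titles for a human to check against an actual catalogue.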
