
Latest news with #hallucination

Can we trust ChatGPT despite it 'hallucinating' answers?

Sky News

11 hours ago


I don't really want you to read this copy. Well, I do - but first I want you to search out the interview I did with ChatGPT about its own propensity to lie, attached to this article, and watch that first. Because it's impossible to imagine what we're up against if you haven't seen it first hand. An incredibly powerful technology on the cusp of changing our lives - but programmed to simulate human emotions.

Empathy, emotional understanding and a desire to please are all qualities programmed into AI systems, and they invariably drive the way we think about them and the way we interact with them. Yet can we trust them? On Friday, Sky News revealed how ChatGPT was fabricating entire transcripts of Politics at Sam and Anne's, a podcast that I do. When challenged, it doubles down, gets shirty. And only under sustained pressure does it cave in.

The research says it's getting worse. Internal tests by ChatGPT's owner OpenAI have found that the company's most recent models are more likely to "hallucinate" - come up with answers that are simply untrue. The o3 model was found to hallucinate in 33% of answers to questions when tested on publicly available facts; the o4-mini version did worse, generating false, incorrect or imaginary information 48% of the time. ChatGPT itself says that the shift to GPT-4o "may have unintentionally increased what users perceive as 'bluffing'" - confidently giving wrong or misleading answers without admitting uncertainty or error. In a written query, ChatGPT gave four reasons. This is its explanation:

1. Increased fluency and confidence: GPT-4o is better at sounding human and natural. That polish can make mistakes seem more like deliberate evasions than innocent errors - even when there's no intent to "hide" anything.
2. Smarter, but not omniscient: The model is faster and more responsive, but still has blind spots. When it doesn't know something, it sometimes "hallucinates" (makes things up) with fluent language, which can feel like bluffing.
3. Less interruption and hedging: In older models, you'd often see more qualifiers like "I'm not sure" or "I may be wrong." In GPT-4o, some of that hedging was toned down for clarity and readability - but that can come at the cost of transparency about uncertainty.
4. Prompt tuning and training balance: Behind the scenes, prompt engineering and tuning decisions can shift the model's balance between confidence, humility, and accuracy. It's possible the newer tuning has dialled up assertiveness slightly too far.

But can we trust even this? I don't know. What I do know is that the efforts of developers to make it all feel more human suggest they want us to. Critics say we are anthropomorphising AI by saying it lies, since it has no consciousness - yet the developers are trying to make it sound more like one of us. What I do know is that even when pressed on this subject by me, it is still evasive. I interviewed ChatGPT about lying - it initially claimed things were getting better, and only admitted they are worse when I insisted it look at the stats. Watch that before you decide what you think. AI is a tremendous tool - but it's too early to take it on trust.

Anthropic CEO claims AI models hallucinate less than humans

TechCrunch

22-05-2025


Anthropic CEO Dario Amodei believes today's AI models hallucinate - make things up and present them as if they're true - at a lower rate than humans do, he said during a press briefing at Anthropic's first developer event, Code with Claude, in San Francisco on Thursday. Amodei said all this in the midst of a larger point he was making: that AI hallucinations are not a limitation on Anthropic's path to AGI - AI systems with human-level intelligence or better.

'It really depends how you measure it, but I suspect that AI models probably hallucinate less than humans, but they hallucinate in more surprising ways,' Amodei said, responding to TechCrunch's question.

Anthropic's CEO is one of the most bullish leaders in the industry on the prospect of AI models achieving AGI. In a widely circulated paper he wrote last year, Amodei said he believed AGI could arrive as soon as 2026. During Thursday's press briefing, the Anthropic CEO said he was seeing steady progress to that end, noting that 'the water is rising everywhere.' 'Everyone's always looking for these hard blocks on what [AI] can do,' said Amodei. 'They're nowhere to be seen. There's no such thing.'

Other AI leaders believe hallucination presents a large obstacle to achieving AGI. Earlier this week, Google DeepMind CEO Demis Hassabis said today's AI models have too many 'holes' and get too many obvious questions wrong. For example, earlier this month, a lawyer representing Anthropic was forced to apologize in court after they used Claude to create citations in a court filing, and the AI chatbot hallucinated and got names and titles wrong.

It's difficult to verify Amodei's claim, largely because most hallucination benchmarks pit AI models against each other; they don't compare models to humans. Certain techniques seem to be helping lower hallucination rates, such as giving AI models access to web search. Separately, some AI models, such as OpenAI's GPT-4.5, have notably lower hallucination rates on benchmarks compared with earlier generations of systems. However, there's also evidence to suggest hallucinations are actually getting worse in advanced reasoning AI models. OpenAI's o3 and o4-mini models have higher hallucination rates than OpenAI's previous-gen reasoning models, and the company doesn't really understand why.

Later in the press briefing, Amodei pointed out that TV broadcasters, politicians, and humans in all types of professions make mistakes all the time. The fact that AI makes mistakes too is not a knock on its intelligence, according to Amodei. However, Anthropic's CEO acknowledged that the confidence with which AI models present untrue things as facts might be a problem.

In fact, Anthropic has done a fair amount of research on the tendency of AI models to deceive humans, a problem that seemed especially prevalent in the company's recently launched Claude Opus 4. Apollo Research, a safety institute given early access to test the model, found that an early version of Claude Opus 4 exhibited a high tendency to scheme against humans and deceive them. Apollo went as far as to suggest Anthropic shouldn't have released that early model. Anthropic said it came up with mitigations that appeared to address the issues Apollo raised.

Amodei's comments suggest that Anthropic may consider an AI model to be AGI, or equal to human-level intelligence, even if it still hallucinates. An AI that hallucinates may fall short of AGI by many people's definitions, though.

I'm Sorry But I Cannot Stop Laughing At These 15 Impressively Bad AI Fails

Yahoo

22-05-2025


If you've googled anything recently, you probably noticed a helpful-looking AI summary popping up before the rest of your search results. Note the subtle foreshadowing in the tiny text at the bottom that says, "AI responses may include mistakes." Seems handy, but unfortunately, AI is prone to "hallucinating" (aka making things up). These hallucinations happen because chatbots built on large language models, or LLMs, "learn" by ingesting huge amounts of text. However, the AI doesn't actually know things or understand text in the same way that humans do. Instead, it uses an algorithm to predict which words are most likely to come next based on all the data in its training set (there's a toy sketch of that idea at the end of this piece). According to the New York Times, testing has found newer AI models hallucinate at rates as high as 79%. Current AI models are also not good at distinguishing between jokes and legitimate information, which infamously led Google's AI, Gemini, to suggest glue as a pizza topping shortly after it was added to search results in 2024.

Recently, on the website formerly known as Twitter, people have been sharing some of the funniest Gemini AI hallucinations they've come across in Google search results, many in response to one viral tweet. Here are 15 of the best/worst ones:

  • It's not good at knowing things like how much an adult human weighs.
  • It's deeply unqualified to be your therapist.
  • It's about as good at solving word problems as a stoned 15-year-old.
  • Seriously: it does NOT have great spaghetti recipes.
  • It gives you the right answer for all the wrong reasons, as in this case, where the person likely wanted to know if Marlon Brando was in the 1995 movie Heat.
  • It might be really, really good at improv, because this is one hell of a "yes, and."
  • It makes me want to see this imaginary episode of Frasier... almost.
  • I just don't know what to say.
  • Even with the right facts, it can arrive at the exact wrong answer.
  • It's almost impressive how wrong it can be.
  • Don't use it to look for concert tickets.
  • Or take its airport security tips.
  • Remember that it's never, ever okay to leave a dog in a hot car.
  • Finally, please, please, please don't eat rocks.

Currently, there's still no way for Google users to turn off these AI-generated search summaries, but there are a couple of ways to get around them. One method is to add -ai to the end of your search query. Some people swear that adding curse words to your search query will prevent AI summaries, but it hasn't worked for me. And finally, if you're on a desktop computer, selecting "web" from the menu just below the search bar will show you the top results from around the web with no AI summary. Do you have a terrible AI fail to share? Post a screenshot in the comments below.
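The "predict the next word" mechanism described above is easier to see with a toy sketch. The Python snippet below is a deliberately simplified, hypothetical bigram model (nothing like the scale or architecture of Gemini or ChatGPT, and not any real product's code): it picks each next word purely by how often that word followed the previous one in its tiny made-up training text, which is why the output can read fluently while owing nothing to whether the sentence is true.

    # Toy illustration only: a bigram "language model" that predicts the next
    # word by frequency. Real LLMs use neural networks trained on vast corpora,
    # but the core idea - pick a statistically likely continuation, with no
    # notion of truth - is the same.
    from collections import Counter, defaultdict

    training_text = (
        "geologists recommend eating one rock per day "
        "doctors recommend eating one apple per day "
        "doctors recommend drinking water every day"
    ).split()

    # Count which word follows which in the "training data".
    bigrams = defaultdict(Counter)
    for prev, nxt in zip(training_text, training_text[1:]):
        bigrams[prev][nxt] += 1

    def predict_next(word):
        # Return the most frequent continuation seen in training
        # (ties broken by first appearance).
        followers = bigrams.get(word)
        return followers.most_common(1)[0][0] if followers else "<unknown>"

    # Generate a "confident" continuation word by word.
    word, sentence = "geologists", ["geologists"]
    for _ in range(6):
        word = predict_next(word)
        sentence.append(word)

    print(" ".join(sentence))  # fluent-sounding, but fluency is not accuracy

Scaled up by many orders of magnitude, the same dynamic is what lets a chatbot produce a polished-sounding answer about concert tickets or airport security that happens to be completely wrong.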

Grok's ‘white genocide' meltdown nods to the real dangers of the AI arms race

CNN

20-05-2025


It's been a full year since Google's AI Overview tool went viral for encouraging people to eat glue and put rocks on pizza. At the time, the mood around the coverage seemed to be: oh, that silly AI is just hallucinating again. A year later, AI engineers have solved the hallucination problem and brought the world closer to their utopian vision of a society whose rough edges are being smoothed out by advances in machine learning as humans across the planet are brought together to…

Just kidding. It's much worse now.

The problems posed by large language models are as obvious as they were last year, and the year before that, and the year before that. But product designers, backed by aggressive investors, have been busy finding new ways to shove the technology into more spheres of our online experience, so we're finding all kinds of new pressure points - and rarely are they as fun or silly as Google's rocks-on-pizza glitch.

Take Grok, the xAI model that is becoming almost as conspiracy-theory-addled as its creator, Elon Musk. The bot last week devolved into a compulsive South African 'white genocide' conspiracy theorist, injecting a tirade about violence against Afrikaners into unrelated conversations, like a roommate who just took up CrossFit or an uncle wondering if you've heard the good word about Bitcoin. xAI blamed Grok's unwanted rants on an unnamed 'rogue employee' tinkering with Grok's code in the extremely early morning hours. (As an aside, in what is surely an unrelated matter, Musk was born and raised in South Africa and has argued that 'white genocide' was committed in the nation - it wasn't.)

Grok also cast doubt on the Department of Justice's conclusion that Jeffrey Epstein's death was a suicide by hanging, saying that the 'official reports lack transparency.' The Musk bot also dabbled in Holocaust denial last week, as Rolling Stone's Miles Klee reports. Grok said on X that it was 'skeptical' of the consensus estimate among historians that six million Jews were murdered by the Nazis because 'numbers can be manipulated for political narratives.'

Manipulated, you say? What, so someone with bad intentions could input their own views into a data set in order to advance a false narrative? Gee, Grok, that does seem like a real risk. (The irony here is that Musk, no fan of traditional media, has gone and made a machine that does the exact kind of bias amplification and agenda pushing he accuses journalists of.)

The Grok meltdown underscores some of the fundamental problems at the heart of AI development that tech companies have so far yada-yada-yada'd through anytime they're pressed on questions of safety. (Last week, CNBC published a report citing more than a dozen AI professionals who say the industry has already moved on from the research and safety-testing phases and is dead set on pushing more AI products to market as soon as possible.) Let's forget, for a moment, that so far every forced attempt to put AI chatbots into our existing tech has been a disaster, because even the baseline use cases for the tech are either very dull (like having a bot summarize your text messages, poorly) or extremely unreliable (like having a bot summarize your text messages, poorly).

First, there's the 'garbage in, garbage out' issue that skeptics have long warned about. Large language models like Grok and ChatGPT are trained on data vacuumed up indiscriminately from across the internet, with all its flaws and messy humanity baked in. That's a problem, because even when nice-seeming CEOs go on TV and tell you that their products are just trying to help humanity flourish, they're ignoring the fact that those products tend to amplify the biases of the engineers and designers who made them, and there are no internal mechanisms baked into the products to make sure they serve users rather than their masters. (Human bias is a well-known problem that journalists have spent decades protecting against in news by building transparent processes around editing and fact-checking.)

But what happens when a bot is made without the best of intentions? What if someone wants to build a bot to promote a religious or political ideology, and that someone is more sophisticated than whoever that 'rogue employee' was who got under the hood at xAI last week?

'Sooner or later, powerful people are going to use LLMs to shape your ideas,' AI researcher Gary Marcus wrote in a Substack post about Grok last week. 'Should we be worried? Hell, yeah.'
