
Latest news with #medicalAI

AI hallucinations? What could go wrong?

Japan Times

28-05-2025

  • Business
  • Japan Times

AI hallucinations? What could go wrong?

Oops. Gotta revise my summer reading list. Those exciting offerings plucked from a special section of The Chicago Sun-Times newspaper and reported last week don't exist. The freelancer who created the list used generative artificial intelligence for help, and several of the books and many of the quotes that gushed about them were made up by the AI.

These are the most recent and high-profile AI hallucinations to make it into the news. We expect growing pains as new technology matures but, oddly and perhaps inextricably, that problem appears to be getting worse with AI. The notion that we can't ensure that AI will produce accurate information is, uh, 'disturbing' if we intend to integrate that product so deeply into our daily lives that we can't live without it. The truth might not set you free, but it seems like a prerequisite for getting through the day.

An AI hallucination is a phenomenon by which a large language model (LLM) such as a generative AI chatbot finds patterns or objects that simply don't exist and responds to queries with nonsensical or inaccurate answers. There are many explanations for these hallucinations — bad data, bad algorithms, training biases — but no one knows what produces a specific response. Given the spread of AI from search tools to the ever-more prominent role it takes in ordinary tasks (checking grammar or intellectual grunt work in some professions), that's not only troubling but dangerous. AI is being used in medical tests, legal writings and industrial maintenance, and failure in any of those applications could have nasty consequences.

We'd like to believe that eliminating such mistakes is part of the development of new technologies. When they examined the persistence of this problem, tech reporters from The New York Times noted that researchers and developers were saying several years ago that 'AI hallucinations would be solved. Instead, they're appearing more often and people are failing to catch them.' Tweaking models helped reduce hallucinations. But AI is now using 'new reasoning systems,' which means that it ponders questions for microseconds (or maybe seconds for hard questions) longer, and that seems to be creating more mistakes. In one test, hallucination rates for newer AI models reached 79%. While that is extreme, most systems hallucinated in double-digit percentages.

More worryingly, because the systems are using so much data, there is little hope that human researchers can figure out what is going on and why. The NYT cited Amr Awadallah, chief executive of Vectara, a startup that builds AI tools for businesses, who warned that 'Despite our best efforts, they will always hallucinate.' He concluded, 'That will never go away.' That was also the conclusion of a team of Chinese researchers who noted that 'hallucination represents an inherent trait of the GPT model' and 'completely eradicating hallucinations without compromising its high-quality performance is nearly impossible.' I wonder about the 'high quality' of that performance when the results are so unreliable.

Writing in the Harvard Business Review, professors Ian McCarthy, Timothy Hannigan and Andre Spicer last year warned of the 'epistemic risks of botshit,' the made-up, inaccurate and untruthful chatbot content that humans uncritically use for tasks. It's a quick step from botshit to bullshit. (I am not cursing for titillation but am instead referring to the linguistic analysis of philosopher Harry Frankfurt in his best-known work, 'On Bullshit.')

John Thornhill beat me to the punch last weekend in his Financial Times column by pointing out the troubling parallel between AI hallucinations and bullshit. Like a bullshitter, a bot doesn't care about the truth of its claims but wants only to convince the user that its answer is correct, regardless of the facts. Thornhill highlighted the work of Sandra Wachter and two colleagues from the Oxford Internet Institute who explained in a paper last year that 'LLMs are not designed to tell the truth in any overriding sense... truthfulness or factuality is only one performance measure among many others such as 'helpfulness, harmlessness, technical efficiency, profitability (and) customer adoption.' ' They warned that a belief that AI tells the truth, combined with the tendency to attribute superior capabilities to technology, creates 'a new type of epistemic harm.'

It isn't the obvious hallucinations we should be worrying about but the 'subtle inaccuracies, oversimplifications or biased responses that are passed off as truth in a confident tone — which can convince experts and nonexperts alike — that posed the greatest risk.' Comparing this output to Frankfurt's concept of bullshit, they label this 'careless speech' and write that it 'causes unique long-term harms to science, education and society, which resists easy quantification, measurement and mitigation.'

While careless speech was the most sobering and subtle AI threat articulated in recent weeks, there were others. A safety test conducted by Anthropic, the developer of the LLM Claude, on its newest AI models revealed 'concerning behavior' in many dimensions. For example, the researchers discovered the AI 'sometimes attempting to find potentially legitimate justifications for requests with malicious intent.' In other words, the software tried to please users who wanted it to answer questions that would create dangers — such as creating weapons of mass destruction — even though it had been instructed not to do so.

The most amusing — in addition to scary — danger was the tendency of the AI 'to act inappropriately in service of goals related to self-preservation.' In plain speak, the AI blackmailed an engineer who was supposed to take the AI offline. In this case, the AI was given access to email that said it would be replaced by another version and email that suggested that the individual was having an extramarital affair. In 84% of cases, the AI said it would reveal the affair if the engineer went ahead with the replacement. (This was a simulation, so no actual affair or blackmail occurred.)

We'll be discovering more flaws and experiencing more frustration as AI matures. I doubt that those problems will slow its adoption, however. Mark Zuckerberg, CEO of Meta, anticipates far deeper integration of the technology into daily life, with people turning to AI for therapy, shopping and even casual conversation. He believes that AI can 'fill the gap' between the number of friendships many people have and the number they want. He's putting his money where his mouth is, having announced at the beginning of the year that Meta would invest as much as $65 billion this year to expand its AI infrastructure. That is a little over 10% of the estimated $500 billion in private investment spent on AI in the U.S. between 2013 and 2024. Global spending last year is reckoned to have topped $100 billion.

Also last week, OpenAI CEO Sam Altman announced that he had purchased former Apple designer Jony Ive's company io in a bid to develop AI 'companions' that will re-create the digital landscape as the iPhone did when it was first released. They believe that AI requires a new interface and that phones won't do the trick; indeed, the intent, reported The Wall Street Journal, is to wean users from screens. The product will fit inside a pocket and be fully aware of a user's surroundings and life. They plan to ship 100 million of the new devices 'faster than any company has ever shipped before.'

Call me old-fashioned, but I am having a hard time putting these pieces together. A hallucination might be just what I need to resolve my confusion.

Brad Glosserman is deputy director of and visiting professor at the Center for Rule-Making Strategies at Tama University as well as senior adviser (nonresident) at Pacific Forum. His new book on the geopolitics of high-tech is expected to come out from Hurst Publishers this fall.
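A note for technically minded readers: the hallucination rates cited in the column above come from benchmark-style evaluations. The sketch below shows one simple, purely hypothetical way such a rate could be computed: ask a model a set of questions with known reference answers and count how often the reply fails to contain the reference fact. The ask_model callable and the substring check are illustrative assumptions, not the methodology of the tests reported by The New York Times.

# Hypothetical sketch: estimating a hallucination rate on a tiny labeled benchmark.
# The checking logic (substring match against a reference answer) is deliberately
# crude and is NOT the methodology used in the evaluations cited in the article.
from typing import Callable

def hallucination_rate(
    qa_pairs: list[tuple[str, str]],
    ask_model: Callable[[str], str],
) -> float:
    """Fraction of answers that do not contain the reference fact."""
    if not qa_pairs:
        return 0.0
    misses = 0
    for question, reference in qa_pairs:
        answer = ask_model(question)
        if reference.lower() not in answer.lower():
            misses += 1
    return misses / len(qa_pairs)

if __name__ == "__main__":
    benchmark = [
        ("Who wrote 'On Bullshit'?", "Harry Frankfurt"),
        ("In what year was the first iPhone released?", "2007"),
    ]

    def fake_model(question: str) -> str:
        # Stand-in chatbot that fabricates an author but gets the year right,
        # so exactly one of the two answers counts as a hallucination.
        return "It was written by Jane Doe and published in 2007."

    print(f"Hallucination rate: {hallucination_rate(benchmark, fake_model):.0%}")

Real evaluations use far more careful grading (human reviewers or a second model as judge), but the basic shape, a labeled benchmark and a miss rate, is the same.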

VUNO's AI-Powered Cardiac Arrest Risk Management System Earns CE MDR and UKCA Certifications

Yahoo

12-05-2025

  • Business
  • Yahoo

VUNO's AI-Powered Cardiac Arrest Risk Management System Earns CE MDR and UKCA Certifications

Early Regulatory Approval Accelerates Expansion Plans in Europe and the Middle East

SEOUL, South Korea, May 12, 2025 /PRNewswire/ -- VUNO, a leading South Korean medical AI company, announced today that its flagship AI-powered cardiac arrest risk management system, VUNO Med®-DeepCARS® (DeepCARS), has received CE MDR (Medical Device Regulation) certification in the European Union, as well as the UKCA (UK Conformity Assessed) mark in the United Kingdom. Achieving these regulatory milestones more than a year ahead of schedule significantly accelerates the company's global market entry.

The CE MDR certification affirms the clinical safety and effectiveness of VUNO's solution across the 27 EU member states, enabling the company to actively pursue expansion in European markets. VUNO plans to collaborate with experienced local AI healthcare partners who have successfully introduced similar solutions in the region to streamline hospital adoption and reimbursement processes.

Simultaneously, VUNO is preparing to enter the Middle Eastern market, where CE MDR and U.S. FDA certifications are commonly recognized as key references in the regulatory process, supporting a smoother pathway to market entry. With CE MDR in hand, the company aims to complete regulatory registrations in key Middle Eastern countries within the year and initiate full-scale operations across the region by 2026.

"This milestone marks a pivotal step in VUNO's mission to bring AI-driven innovation in critical care to the global stage," said Dr. Ye Ha Lee, Founder & CEO of VUNO. "DeepCARS is already being used in over 130 hospitals across South Korea. With this proven track record, we are confident in its potential to contribute to patient safety in hospitals around the world."

About VUNO Med®-DeepCARS®

VUNO Med®-DeepCARS® (DeepCARS) is an AI-powered medical device designed to monitor the risk of in-hospital cardiac arrest within the next 24 hours. It analyzes patients' vital signs, including blood pressure, heart rate, respiratory rate and body temperature, in general wards. As of April 2025, DeepCARS has been implemented across more than 48,000 hospital beds in South Korea, including over 20 tertiary general hospitals, establishing itself as an essential part of care. In 2023, DeepCARS received Breakthrough Device Designation (BDD) from the U.S. Food and Drug Administration (FDA) and is currently undergoing the FDA approval process.

About VUNO

VUNO, founded in 2014, is a leading South Korean medical AI company and the developer of the nation's first approved AI-powered medical device. Leveraging cutting-edge AI technology, VUNO analyzes a wide range of medical data — from bio signals such as ECG, respiratory rate and blood pressure to medical images including X-rays, CT scans and fundus images — to predict critical events and support clinicians in decision-making. Committed to patient-centered innovation, VUNO strives to make high-quality healthcare accessible to everyone, worldwide.

SOURCE VUNO
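The release describes DeepCARS as analyzing four routinely collected vital signs to flag general-ward patients at risk of cardiac arrest within 24 hours. VUNO's actual deep-learning model is proprietary and is not described here; purely as an illustration of the input and output such a system works with, a crude early-warning-style scorer over the same four vital signs might be shaped as follows (all thresholds, weights and names are invented for this sketch and have no clinical validity).

# Purely illustrative sketch of a vital-sign-based early-warning score.
# This is NOT VUNO's DeepCARS model; thresholds and points are invented.
from dataclasses import dataclass

@dataclass
class Vitals:
    systolic_bp: float       # mmHg
    heart_rate: float        # beats per minute
    respiratory_rate: float  # breaths per minute
    temperature: float       # degrees Celsius

def risk_points(vitals: Vitals) -> int:
    """Sum crude points for out-of-range vital signs (illustration only)."""
    points = 0
    if vitals.systolic_bp < 90 or vitals.systolic_bp > 180:
        points += 3
    if vitals.heart_rate < 40 or vitals.heart_rate > 130:
        points += 3
    elif vitals.heart_rate > 110:
        points += 2
    if vitals.respiratory_rate < 8 or vitals.respiratory_rate > 25:
        points += 3
    if vitals.temperature < 35.0 or vitals.temperature > 39.0:
        points += 2
    return points

def needs_review(vitals: Vitals, threshold: int = 5) -> bool:
    """Flag a ward patient for clinical review when the crude score is high."""
    return risk_points(vitals) >= threshold

if __name__ == "__main__":
    patient = Vitals(systolic_bp=85, heart_rate=135, respiratory_rate=28, temperature=38.2)
    print(risk_points(patient), needs_review(patient))  # 9 True

A learned system such as DeepCARS would infer these relationships from hospital data and trends over time rather than from hand-set cutoffs; the sketch only shows the kind of inputs (four vital signs) and output (a risk flag for the next 24 hours) involved.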

What Happens To Medicine When Machines Are As Good As Doctors?

Forbes

12-05-2025

  • Health
  • Forbes

What Happens To Medicine When Machines Are As Good As Doctors?

Imagine if every physician and nurse had a clinical partner as capable, knowledgeable and reliable as they are. Not a junior resident to supervise or a chatbot that summarizes notes, but an associate capable of solving novel problems, reasoning across specialties and making sound medical decisions 24/7 without burnout or bias. That day may be closer than most people expect.

Just 12 months ago, when I published the book ChatGPT, MD, I predicted that an autonomously reliable medical AI system was still a decade away. Today, with the emergence of artificial general intelligence, that forecast seems wildly conservative.

IBM defines AGI as the moment 'an artificial intelligence system can match or exceed the cognitive abilities of human beings across any task.' Thus, AGI is not a tool, product or program. It's a milestone.

In terms of immediacy and impact, OpenAI CEO Sam Altman recently said his team is 'confident we know how to build AGI as we have traditionally understood it,' predicting it could happen as early as 2025. Anthropic CEO Dario Amodei expects AGI-level capabilities by 2027 and believes tools like Claude will surpass 'almost all humans at almost everything.' Experts may disagree on the exact timeline, but most agree on one thing: AGI is coming soon. Nearly all insiders now believe it will arrive within five years.

And because AGI isn't a single product — or a switch that flips — it won't arrive with a bang or be a single technological breakthrough. Instead, it will arrive gradually, the result of year-over-year exponential improvements in generative AI. In medicine, those gains will produce both clinical opportunities and cultural disruption.

Clinically, AGI will mark a point when generative AI systems can reason across specialties, apply evolving clinical guidelines and reliably solve complex medical problems without being explicitly programmed for each scenario. An AGI-derived application could integrate information from cardiology, endocrinology and infectious disease to diagnose a patient and recommend treatment with human-level accuracy.

Culturally, AGI will challenge the long-held belief that humans are inherently better than machines at delivering medical care. Once AI can match physicians in reasoning and accuracy, both patients and clinicians will be forced to reconsider what it means to 'trust the doctor.'

That level of performance will mark a sharp departure from today's FDA-approved tools, all of which rely on 'narrow' AI. These applications are designed for single tasks, such as reading mammograms, detecting diabetic retinopathy or flagging arrhythmias. They are programmed to identify small differences between two specific data sets. Consequently, they are limited in breadth of expertise and can't generalize beyond their training. An AI tool trained to interpret a mammogram, for instance, can't analyze a chest X-ray.

Generative AI, by contrast, draws from vast sources of information, including medical textbooks, published research, clinical protocols and public data. This breadth will allow future GenAI systems to answer a wide range of clinical questions and continually improve as new knowledge emerges. Since the release of the first large language models in 2022, GenAI has grown by leaps and bounds in power and capability. We're not at AGI yet. But with recent improvements, the finish line is in sight: The gap between today's generative AI capabilities and AGI is narrowing fast.

Once that threshold is crossed, medical professionals will face an existential moment. Already, more than half of clinicians are comfortable using generative AI for administrative and other non-medical tasks: summarizing notes, drafting instructions, retrieving reference information. But few believe these systems can match their own clinical judgment. AGI will challenge that assumption. Once GenAI systems achieve reasoning and pattern recognition equivalent to that of physicians, the line between human and machine expertise will blur.

To understand how different healthcare will be, consider three ways AGI-level performance could improve medical care delivery: As AI systems approach clinical parity, they won't just support administrative work. They will transform medical practice itself. For medicine, the question is no longer, 'Will AI replace doctors?' Instead, healthcare leaders and clinicians must ask: How can we best use generative AI to augment clinical care, fill critical gaps and make medicine safer for patients?

Whether GenAI strengthens or destabilizes the healthcare system will depend entirely on who leads its integration. If physicians and current healthcare leaders take the initiative (leveraging AGI-level capabilities to empower patients, enhance decision-making and redesign workflows), both providers and patients will benefit. But if they waver, others will take the lead. U.S. healthcare represents $5.2 trillion in annual spending. Tech companies, startups and corporate giants all have an interest in capturing a piece of that pie. If clinicians fail to shape the next era of medical care, business executives will. And their priorities will favor profit over patient outcomes.

To avoid that fate, two foundational shifts must begin now: Making these changes in care delivery will be uncomfortable for physicians, but they'll be far less painful if doctors start now.

The train is coming down the track. We don't know the exact schedule for AGI. But we know it's coming. Whether you give care, receive it—or both—the question is: Will you be ready when it arrives?
