Latest news with #MAI-DxO


Time of India
2 days ago
- Business
- Time of India
Microsoft's new AI tool a medical genius? Tech giant claims it is 4x more accurate than real doctors
Tech giant Microsoft, recently hit with a fresh round of layoffs, has developed a new medical AI tool that performs better than human doctors at complex health diagnoses, creating a 'path to medical superintelligence'. The Microsoft AI team shared research demonstrating how AI can sequentially investigate and solve medicine's most complex diagnostic challenges—cases that expert physicians struggle to answer. The company's AI unit, led by the British tech pioneer Mustafa Suleyman, has developed a system that imitates a panel of expert physicians tackling 'diagnostically complex and intellectually demanding' cases. The Microsoft AI Diagnostic Orchestrator (MAI-DxO) correctly diagnosed up to 85% of NEJM case proceedings, a rate more than four times higher than a group of experienced physicians. MAI-DxO also gets to the correct diagnosis more cost-effectively than physicians, the company said in a blog post.

Microsoft says AI system better than doctors

The Microsoft AI Diagnostic Orchestrator, or MAI-DxO for short, is an AI-powered tool developed by the company's AI health unit, which was founded last year by Mustafa Suleyman. The tech giant said that when paired with OpenAI's advanced o3 AI model, its approach 'solved' more than eight of 10 case studies specially chosen for the diagnostic challenge. When those case studies were tried on practising physicians – who had no access to colleagues, textbooks or chatbots – the accuracy rate was two out of 10. Microsoft said it was also a cheaper option than using human doctors because it was more efficient at ordering tests. When benchmarked against real-world case records, the new medical AI tool 'correctly diagnoses up to 85% of NEJM case proceedings, a rate more than four times higher than a group of experienced physicians'. What makes this more impressive is that these cases come from the New England Journal of Medicine and are so complex that they typically require multiple specialists and tests before doctors can reach any conclusion.

According to Wired, the Microsoft team used 304 case studies sourced from the New England Journal of Medicine to devise a test called the Sequential Diagnosis Benchmark. A language model broke down each case into the step-by-step process a doctor would perform in order to reach a diagnosis.

Microsoft's new AI tool diagnosed 85% of cases

For this, the company used different large language models from OpenAI, Meta, Anthropic, Google, xAI and DeepSeek. Microsoft said that the new AI medical tool correctly diagnosed 85.5 per cent of cases, far better than experienced human doctors, who were able to correctly diagnose only 20 per cent of the cases. 'This orchestration mechanism—multiple agents that work together in this chain-of-debate style—that's what's going to drive us closer to medical superintelligence,' Suleyman told Wired. Microsoft said it is building a system designed to mimic the step-by-step approach of real-world clinicians—asking targeted questions, ordering diagnostic tests, and narrowing down possibilities to reach an accurate diagnosis. For example, a patient presenting with a cough and fever might be guided through blood tests and a chest X-ray before the system determines a diagnosis like pneumonia.
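The stepwise process described above (ask targeted questions, order tests, and only then commit to a diagnosis) can be pictured as a simple loop. The Python sketch below is purely illustrative and is not Microsoft's code: the toy case, the rule that maps a consolidation finding to pneumonia, and every function name are hypothetical placeholders.

```python
# Illustrative sketch only: a toy "sequential diagnosis" loop in the spirit of the
# process described above (ask -> test -> narrow down). All names and rules here
# are hypothetical and greatly simplified; this is not Microsoft's MAI-DxO code.

TOY_CASE = {
    "presenting_complaint": "cough and fever for three days",
    "answers": {"Any shortness of breath?": "yes, mild"},
    "test_results": {"blood tests": "elevated white cell count",
                     "chest X-ray": "right lower lobe consolidation"},
}

def run_sequential_diagnosis(case):
    """Walk through a case step by step, gathering findings before committing."""
    findings = [case["presenting_complaint"]]

    # Step 1: ask a targeted question and record the answer.
    question = "Any shortness of breath?"
    findings.append(f"{question} -> {case['answers'][question]}")

    # Step 2: order tests one at a time, updating the picture after each result.
    for test in ("blood tests", "chest X-ray"):
        findings.append(f"{test}: {case['test_results'][test]}")

    # Step 3: only now commit to a diagnosis (toy rule standing in for model reasoning).
    if any("consolidation" in finding for finding in findings):
        diagnosis = "pneumonia"
    else:
        diagnosis = "undifferentiated respiratory infection"
    return diagnosis, findings

if __name__ == "__main__":
    dx, trail = run_sequential_diagnosis(TOY_CASE)
    for step in trail:
        print("-", step)
    print("Working diagnosis:", dx)
```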
Microsoft said its approach was able to wield a 'breadth and depth of expertise' that went beyond individual physicians because it could span multiple medical disciplines. It added: 'Scaling this level of reasoning – and beyond – has the potential to reshape healthcare. AI could empower patients to self-manage routine aspects of care and equip clinicians with advanced decision support for complex cases.' Microsoft acknowledged its work is not ready for clinical use. Further testing is needed on its 'orchestrator' to assess its performance on more common symptoms, for instance.

Miami Herald
2 days ago
- Health
- Miami Herald
Microsoft is working on a surprising way to help you live longer
Several years ago, I developed a strange disease. Too much sitting and too much stress from my IT job took its toll. Describing the symptoms is very difficult. It feels like something is moving in my calves. It is not painful, but I'd rather be in pain than feel that strange sensation.

The doctors were clueless. They did all the tests: ultrasound, electromyoneurography, an MRI of the lumbar spine. The radiologist was having so much fun with me that he suggested I should also do an MRI of my brain. I looked for different opinions, and I never got a diagnosis. Not that specialists didn't have "great" ideas for experiments on me. That is what happens when you don't have a run-of-the-mill disease.

Surprisingly, Microsoft, which isn't exactly known for being a medical company, may have a solution for finding the proper diagnosis, especially in difficult cases. Dominic King and Harsha Nori, members of the Microsoft (MSFT) Artificial Intelligence team, blogged on June 30 about their team's work. According to them, generative AI has advanced to the point of scoring near-perfect marks on the United States Medical Licensing Examination and similar exams. But that kind of test favors memorization over deep understanding, which isn't difficult for AI. The team is aware of the test's inadequacy and is working on improving the clinical reasoning of AI models, focusing on sequential diagnosis capabilities. This is the usual process you go through with a doctor: questions, tests, more questions or tests, until the diagnosis is found.

They developed a Sequential Diagnosis Benchmark based on 304 recent case records published in the New England Journal of Medicine. These cases are extremely difficult to diagnose and often require multiple specialists and diagnostic tests before a diagnosis is reached.

What they created reminds me of very old text-based adventure games. You can think of each case they used as a level you need to complete by giving a diagnosis. You are presented with a case, and you can type in your questions or request diagnostic tests. You get responses, and you can continue with questions or tests until you figure out the diagnosis. Obviously, to know what questions to type in, you have to be a doctor. And like a proper game, it shows how much money you have spent on tests. The goal is to spend the least amount of money while giving the correct diagnosis.

Because the game (pardon me, benchmark) takes the form of a chat, it can be played by chatbots. They tested ChatGPT, Llama, Claude, Gemini, Grok, and DeepSeek. To better harness the power of these AI models, the team developed the Microsoft AI Diagnostic Orchestrator (MAI-DxO), which emulates a virtual panel of physicians. MAI-DxO paired with OpenAI's o3 was the most efficient, correctly solving 85.5% of the NEJM benchmark cases. They also evaluated 21 practicing physicians, each with at least five years of clinical experience. These experts achieved a mean accuracy of 20%; however, they were denied access to colleagues and textbooks (and AI), as the team deemed this a fairer comparison.
I strongly disagree with the idea that the comparison is fair. If a doctor facing a difficult-to-diagnose issue does not consult a colleague, refer you to a specialist, or look through his books to jog his memory, what kind of doctor is that? The team noted that further testing of MAI-DxO is needed to assess its performance on more common, everyday presentations.

However, there is an asterisk. I write a lot about AI, and I think it is just pattern matching. The data on which models have been trained is typically not disclosed. If o3 has been trained on NEJM cases, it's no wonder it can solve them. The same is true if it was trained on very similar cases.

Back to my issue. My friend, who is a retired pulmonologist, had a solution. Who'd ask a lung doctor about a disease affecting the legs? Well, she is also an Ayurvedic doctor and a yoga teacher. She thinks outside the box. I was given a simple exercise that solved my problem. Years have passed, and if I stop doing it regularly, my symptoms return. What I know for sure is that no AI could ever have come up with it.

Another problem is that even if this tool works and doctors start using it, they'll soon have less than a 20% success rate on the "benchmark." You lose what you don't use.
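As the column describes, the Sequential Diagnosis Benchmark behaves like a chat-based game: a gatekeeper holds the hidden case, reveals findings only when a question is asked or a test is ordered, keeps a running bill, and finally scores the submitted diagnosis. A minimal Python sketch of that setup, under hypothetical assumptions (the CaseGatekeeper class, the price list, and the default test price are all made up and are not the actual benchmark code), might look like this:

```python
# Illustrative sketch of one benchmark round as described above: a gatekeeper reveals
# information only when asked, and every ordered test adds to a running bill. Prices,
# case content, and interfaces are hypothetical placeholders, not the real benchmark.

TEST_PRICES = {"complete blood count": 20, "chest X-ray": 120, "MRI": 1500}

class CaseGatekeeper:
    """Holds the hidden case and answers requests while tracking cumulative cost."""

    def __init__(self, hidden_findings, true_diagnosis):
        self.hidden_findings = hidden_findings
        self.true_diagnosis = true_diagnosis
        self.total_cost = 0

    def order_test(self, test_name):
        # Charge for the test (assumed default price if it isn't in the list).
        self.total_cost += TEST_PRICES.get(test_name, 50)
        return self.hidden_findings.get(test_name, "unremarkable")

    def submit_diagnosis(self, diagnosis):
        correct = diagnosis.lower() == self.true_diagnosis.lower()
        return correct, self.total_cost

# A single toy round: the "player" (a physician or a chatbot) orders two tests,
# then commits to an answer and is scored on correctness and total spend.
case = CaseGatekeeper(
    hidden_findings={"complete blood count": "elevated white cells",
                     "chest X-ray": "lobar consolidation"},
    true_diagnosis="pneumonia",
)
case.order_test("complete blood count")
case.order_test("chest X-ray")
correct, spent = case.submit_diagnosis("pneumonia")
print(f"correct={correct}, money spent=${spent}")
```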


Time Magazine
2 days ago
- Health
- Time Magazine
Microsoft's AI Is Better Than Doctors at Diagnosing Disease
Medicine may be a combination of art and science, but Microsoft just showed that much of both can be learned—by a bot. The company reports in a study published on the preprint site arXiv that its AI-based medical program, the Microsoft AI Diagnostic Orchestrator (MAI-DxO), correctly diagnosed 85% of cases described in the New England Journal of Medicine. That's four times higher than the accuracy rate of human doctors, who came up with the right diagnoses about 20% of the time.

The cases are part of the journal's weekly series designed to stump doctors: complicated, challenging scenarios where the diagnosis isn't obvious. Microsoft took about 300 of these cases and compared the performance of its MAI-DxO to that of 21 general-practice doctors in the U.S. and U.K. In order to mimic the iterative way doctors typically approach such cases—by collecting information, analyzing it, ordering tests, and then making decisions based on those results—Microsoft's team first created a stepwise decision-making benchmark process for each case study. This allowed both the doctors and the AI system to ask questions and make decisions about next steps, such as ordering tests, based on the information they learned at each step—similar to a flow chart for decision-making, with subsequent questions and actions based on information gleaned from previous ones.

The 21 doctors were compared to a pooled set of off-the-shelf AI models that included Claude, DeepSeek, Gemini, GPT, Grok, and Llama. To further mirror the way human doctors approach such challenging cases, the Microsoft team also built an Orchestrator: a virtual emulation of the sounding board of colleagues and consultations that physicians often seek out in complex cases. In the real world, ordering medical tests costs money, so Microsoft tracked the tests that the AI system and human doctors ordered to see which method could get it done more cheaply. Not only did MAI-DxO far outperform doctors in landing on the correct diagnosis, but the AI bot was able to do so at a 20% lower cost on average.

'The four-fold increase in accuracy was more than previous studies have shown,' says Dr. Eric Topol, chair of translational medicine and director and founder of the Scripps Research Translational Institute, who provided insights on the project. 'Most of the time there is a 10% absolute percentage difference, so this is a really big jump.' But what really got his attention was cost. 'Not only was the AI more accurate, but it was much less expensive,' he says.

MAI-DxO is still in development and not available for use outside of research yet. But incorporating such a model into medicine could potentially lead to reductions in medical errors, which account for a significant share of health care costs, and increase the efficiency of human doctors—which could in turn lead to better outcomes for patients. 'This is a startling result,' says Mustafa Suleyman, CEO of Microsoft AI. 'I think it gives us a clear line of sight to making the very best expert diagnostics available to everybody in the world at an unbelievably affordable price point.'

A decade ago, when AI algorithms were first introduced in medicine, they were focused on binary tasks, Suleyman says, such as scanning images to detect tumors.
'Today, these models are having fluent conversations at very high quality, asking the right questions and probing in the right ways, suggesting the right testing and interventions at the right time,' he says. Another advantage an AI system may have is that it's free of many of the biases inherent in the human experience. 'We all have confirmation bias,' says Dr. Dominic King, vice president of Microsoft AI. 'Sometimes clinicians will see something and think, "I'm sure this is just like the patient I saw last week." But AI is thinking slightly differently.'

MAI-DxO doesn't just spit out an answer. It shows its work, so that doctors can potentially study and scrutinize its reasoning process. 'It's available for real-time oversight by the human clinician,' says Suleyman. 'That's a level of transparency and visibility into the thinking process that we haven't seen before.' That, in turn, could improve the education and training doctors receive to further increase diagnostic accuracy and ultimately patient outcomes.

Still, some experts in the field of AI and medicine note that Microsoft's approach isn't entirely novel, since its diagnoses depended on the combined performance of multiple AI models. 'In my mind, they are not testing any individual model that is optimized for health care,' says Keith Dreyer, chief data science officer at Massachusetts General Hospital and Brigham and Women's Hospital Center for Clinical Data Science. 'They are testing the concept of testing all of the models out there today and combining their decision-making together. That part to me is not surprising.' Dreyer also points out that the results don't necessarily bring such systems closer to being approved by regulatory agencies like the U.S. Food and Drug Administration, which still hasn't weighed in on whether such systems are medical devices or not.

Microsoft isn't the only company pursuing an AI-based medical program for diagnosing disease. Google is developing a conversation-based system to emulate the doctor-patient back-and-forth, mimicking the reasoning of human physicians in collecting information from patients and interpreting those symptoms to land on a diagnosis. In early tests, the system outperformed doctors in accurately diagnosing simulated patient case studies. In a 2024 test similar to the one Microsoft performed using case studies, the earlier version of Google's system accurately diagnosed 59% of cases, compared to human doctors' rate of 33%.

The real test, however, will be seeing how these AI systems perform in actual health systems. That's the next step for understanding how AI could complement or supplement the doctor's role in diagnosing disease. 'It's impressive what they did,' says Topol. 'But it doesn't change medical practice until they take it out on the real medical highway.' Topol hopes the AI systems will be tested in different health systems, where doctors and the AI platform could be compared on a number of different and more typical cases. That would require a full-scale clinical trial, as well as approval from regulatory agencies to ensure no patients will be exposed to harm by relying more heavily on AI-based decision-making in delivering their care. 'We are very much on that journey to create the evidence base required to support both clinicians and patients to make a difference in their health,' says King.
If confirmed, results like these could set the stage for introducing high-quality medical expertise in parts of the world that may not currently have access to major academic institutions or cutting-edge health care. 'My primary focus in the next five to 10 years is to make sure everybody in the world gets access to the very best medical advice of all kinds,' says Suleyman. 'We are very, very excited about this.'


Fast Company
3 days ago
- Health
- Fast Company
Microsoft says AI can diagnose tough medical cases better than physicians
The digital health market is growing, and according to Microsoft, its new AI Diagnostic Orchestrator (MAI-DxO) is among the latest generative artificial intelligence tools built to support accurate diagnoses of complex medical cases—and it's doing so better than human doctors. According to the tech company, the new system acts as a 'virtual panel of diverse physicians' that bridges the gap between needing multiple real-life general physicians and specialists in the search for a diagnosis. They call it 'medical superintelligence.'

In a report published on Monday, Microsoft claims the generative AI tool is around four times more accurate than a human physician when it comes to diagnosing complex issues—and that it does so at a lower cost. 'Increasingly, people are turning to digital tools for medical advice and support. Across Microsoft's AI consumer products like Bing and Copilot, we see over 50 million health-related sessions every day,' the report states. 'From a first-time knee-pain query to a late-night search for an urgent-care clinic, search engines and AI companions are quickly becoming the new front line in healthcare.'

How does it work?

To avoid 'one-shot answers,' Microsoft says it focused on teaching the new system to make a 'sequential diagnosis,' meaning the ability to address multiple, varying factors before providing a clear treatment plan. To do so, MAI-DxO was paired with OpenAI's o3 reasoning model for its ability to understand complex topics. Essentially, this pairing creates an AI-generated panel of doctors that asks questions, orders medical testing, and provides a diagnosis, with follow-up questions as needed.

To test its efficiency against physicians, Microsoft applied MAI-DxO to 304 case records from the New England Journal of Medicine and compared the system's diagnostic results to those of 21 human physicians from the U.S. and the U.K. Microsoft says that MAI-DxO solved 85.5% of the cases accurately, whereas the real-life physicians solved the cases with a mean accuracy of 20%. Notably, the physicians were unable to access books, colleagues, or other AI tools during the study—all tools doctors would typically use in their practice. Additionally, Microsoft claims the tool lowers healthcare waste such as unnecessary care and overpayment for services, essentially streamlining a process that has left 15 million Americans in medical debt this past year.

The future of AI-driven healthcare

Although Microsoft presents its new tool as cutting-edge, it says additional testing is needed and limitations still exist. The AI orchestrator, while claimed to be strong on complex health situations, needs work on more common, everyday cases. Additionally, MAI-DxO is not yet approved for clinical use and will need further validation before it can be used to diagnose real-life medical situations. 'While AI is becoming a powerful tool in healthcare, our team of practicing clinicians believes AI represents a complement to doctors and other health professionals,' Microsoft says. 'While this technology is advancing rapidly, their clinical roles are much broader than simply making a diagnosis. They need to navigate ambiguity and build trust with patients and their families in a way that AI isn't set up to do.'
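The 'virtual panel of doctors' that these articles describe can be pictured as a thin orchestration layer that collects opinions from several underlying models and aggregates them. The Python sketch below is a rough, hypothetical illustration of that pattern: query_model stands in for real model calls, and a simple majority vote only approximates the iterative chain-of-debate reported by Microsoft; none of this is MAI-DxO's actual architecture.

```python
# Loose illustration of the "virtual panel" idea described above: several agents
# each look at the same case, then an orchestrator aggregates their opinions.
# query_model is a stand-in for a real model call, and the panel names, canned
# answers, and majority-vote rule are hypothetical simplifications.

from collections import Counter

def query_model(model_name, case_summary):
    """Placeholder for calling one of several underlying models.

    Returns a canned diagnosis so the sketch runs end to end; a real system
    would send case_summary to the named model and parse its reply.
    """
    canned = {
        "model_a": "pneumonia",
        "model_b": "pneumonia",
        "model_c": "acute bronchitis",
    }
    return canned.get(model_name, "unknown")

def orchestrate_panel(case_summary, panel=("model_a", "model_b", "model_c")):
    """Ask every panelist for a diagnosis, then aggregate by simple majority."""
    opinions = {name: query_model(name, case_summary) for name in panel}
    consensus, _votes = Counter(opinions.values()).most_common(1)[0]
    return consensus, opinions

if __name__ == "__main__":
    consensus, opinions = orchestrate_panel("cough, fever, lobar consolidation on X-ray")
    print("panel opinions:", opinions)
    print("consensus diagnosis:", consensus)
```

In the system the articles describe, the panel also weighs the cost of each proposed test and keeps questioning the case between rounds, so a fuller sketch would wrap this aggregation step inside a sequential testing loop rather than voting once.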


India Today
3 days ago
- Health
- India Today
Microsoft claims its AI tool can diagnose complex cases better than doctors
Microsoft claims that it has developed an AI tool that, in a recent experiment, diagnosed patients with four times more accuracy than human doctors. The technology, called MAI Diagnostic Orchestrator (MAI-DxO), works by combining multiple advanced AI models, including ChatGPT, Google's Gemini, Anthropic's Claude, Meta's Llama, and xAI's Grok. The system mimics a team of doctors working together, sharing opinions and debating symptoms before reaching a diagnosis.

To test the system, researchers used 304 real-life case studies published in the New England Journal of Medicine. These were turned into a series of patient scenarios, where the AI had to figure out the illness just as a doctor would: by analysing symptoms, ordering tests, and narrowing down possibilities step by step. The results were surprising: the AI system correctly diagnosed 80% of the cases, compared to just 20% by a group of human doctors. And it wasn't just about accuracy. The AI also managed to lower the cost of diagnosis by about 20%, by choosing more affordable tests and avoiding unnecessary procedures.

'This is a genuine step toward medical superintelligence,' said Mustafa Suleyman, CEO of Microsoft AI, highlighting the tool's potential to transform healthcare. In a post on X on June 30, 2025, Suleyman described the work as 'taking a big step towards medical superintelligence. AI models have aced multiple choice medical exams – but real patients don't come with ABC answer options. Now MAI-DxO can solve some of the world's toughest open-ended cases with higher accuracy and lower costs.'

According to the company, as demand for healthcare continues to grow, costs are rising at an unsustainable pace, and billions of people face multiple barriers to better health, including inaccurate and delayed diagnoses. While AI has already been used to help doctors interpret medical scans, this latest development suggests it could take on broader diagnostic roles, possibly becoming a first point of contact for patients in the future.

Those involved in the project say this could help reduce healthcare costs and speed up access to care. 'Our model performs incredibly well—both getting to the diagnosis and doing so cost-effectively,' said Dominic King, a Microsoft vice president. But like all AI systems, these tools must be carefully monitored. There are concerns about whether they work equally well across different populations, since much of the training data may be skewed toward certain groups.

'This research is just the first step on a long, exciting journey. We're excited to keep testing and learning with our healthcare partners in pursuit of better, more accessible care for people everywhere,' Suleyman said.