
OpenAI's Latest Model Is Attempting To Become A Medical Thought-Partner
Last week, OpenAI launched its newest model, GPT-5. The company announced that this latest iteration is its most advanced model yet, with significantly improved deep reasoning and comprehension capabilities. The new model is also meant to be faster while providing more personalized, curated responses for users.
One of the most notable points the company emphasized in the announcement is the new model's significantly improved ability to navigate healthcare-related topics and questions.
In fact, the model has reportedly surpassed all previous iterations as graded by OpenAI's own HealthBench, the healthcare rubric the company developed to track model progress in the field. According to the company, GPT-5 is capable of 'proactively flagging potential concerns and asking questions to give more helpful answers,' shifting it from a purely query-based system toward a conversational thought partner: 'think of it as a partner to help you understand results, ask the right questions in the time you have with providers, and weigh options as you make decisions.'
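As a concrete illustration of that 'thought partner' framing, here is a minimal sketch of how a developer might prompt the model through the OpenAI Python SDK. The system prompt, the sample lab values, and the availability of a "gpt-5" model id to a given API key are all assumptions made for illustration, not OpenAI's published implementation.

```python
# A minimal sketch of the "thought partner" pattern: a system prompt that asks the
# model to flag concerns and pose clarifying questions rather than just answer.
# Assumes the OpenAI Python SDK and that a "gpt-5" model id is available to your key.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.chat.completions.create(
    model="gpt-5",  # assumed model id; swap in any chat model you have access to
    messages=[
        {"role": "system", "content": (
            "You are a health information assistant, not a clinician. "
            "Explain lab results in plain language, proactively flag values that "
            "may warrant follow-up, and end with two or three questions the user "
            "could ask their provider."
        )},
        {"role": "user", "content": (
            "My lipid panel shows LDL 162 mg/dL, HDL 38 mg/dL, triglycerides "
            "210 mg/dL. What should I make of this?"
        )},
    ],
)
print(resp.choices[0].message.content)
```

The system prompt does most of the work in this sketch: it asks the model to flag follow-up-worthy values and to close with questions for the clinician, mirroring the behavior OpenAI describes.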
Why is this important?
As artificial intelligence systems become more mainstream, users are increasingly relying on them for personalized, accurate answers. Healthcare remains one of the most common and consequential subjects people query on the internet. AI systems such as GPT serve this purpose well, especially as each incremental model improvement meaningfully sharpens the conversational experience. According to the company, 'GPT‑5 responds empathetically, organizes and explains information clearly for a non-expert, and proactively flags important factors that would help provide a more detailed follow-up answer.'
This is the broader trend among AI companies: technology giants are well aware that the healthcare experience for the average consumer has become significantly strained, and they see a sizable opportunity for AI-driven chat agents to fill those gaps in the consumer experience.
Google's MedGemma collection of models, for example, has made significant strides in accurately comprehending large volumes of medical text and image data. This matters for healthcare, and especially for enterprise use cases, because healthcare data is largely unstructured and spread across many different modalities. Organizations can use these models to augment their capabilities and glean insights from large datasets, as well as to assist with routine tasks such as clinical decision support, triage, basic diagnostics and even weighing treatment options.
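To make the multimodal point concrete, here is a rough sketch of how an organization might run a chest X-ray and a free-text question through MedGemma via the Hugging Face Transformers image-text-to-text pipeline. The model id, message format, prompt, and image file are illustrative assumptions; check the MedGemma model card for the supported usage, and treat output like this as decision support rather than a diagnosis.

```python
# A minimal sketch, not a production pipeline: load an open MedGemma checkpoint
# (model id assumed here) and ask it to read an X-ray alongside a text query.
from PIL import Image
from transformers import pipeline

pipe = pipeline(
    "image-text-to-text",
    model="google/medgemma-4b-it",  # assumed checkpoint name; verify on Hugging Face
)

xray = Image.open("chest_xray.png")  # placeholder local file

messages = [
    {"role": "system", "content": [
        {"type": "text", "text": "You are a radiology assistant. Describe findings cautiously."},
    ]},
    {"role": "user", "content": [
        {"type": "image", "image": xray},
        {"type": "text", "text": "Summarize any notable findings and what follow-up they might warrant."},
    ]},
]

out = pipe(text=messages, max_new_tokens=256)
# The pipeline returns the chat with the model's reply appended as the final message.
print(out[0]["generated_text"])
```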
The growing accuracy and efficacy of these models raises an important question, however: will the traditional healthcare workflow become a tiered system, in which AI systems serve as frontline providers for non-emergent queries while trained human physicians are reserved for more emergent, acute and complex cases?
Given how overburdened the healthcare system already is, and will continue to be amid the impending physician shortage, this is certainly not out of the realm of possibility. Especially as the economics behind Moore's law continue to play out (compute keeps getting cheaper as chip technology advances), there is a significant opportunity for artificial intelligence systems to fundamentally disrupt the traditional economics of how the healthcare system runs today.
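As a back-of-the-envelope illustration of that economic argument, the snippet below projects what serving an AI health consult might cost if compute costs keep halving on a fixed cadence. Every number (the starting cost per consult, the halving period, the price of a routine visit) is an assumption invented for illustration, not a measured figure.

```python
# Toy projection: how a per-consult inference cost shrinks if compute cost halves
# every N years. All inputs are illustrative assumptions, not real prices.
cost_per_consult_today = 0.25   # USD of inference cost per AI health consult (assumed)
halving_period_years = 2.5      # assumed cadence at which compute cost halves
typical_visit_cost = 150.00     # assumed price of a routine in-person visit

for years_out in (0, 5, 10):
    projected = cost_per_consult_today * 0.5 ** (years_out / halving_period_years)
    print(f"Year {years_out:2d}: ~${projected:.3f} per AI consult "
          f"vs ~${typical_visit_cost:.0f} per routine visit")
```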
The important question, however, will be how much society comes to value the sophistication and convenience of these models compared with the traditional physician visit.

Related Articles


NBC News
44 minutes ago
Criminals, good guys and foreign spies: Hackers everywhere are using AI now
This summer, Russia's hackers put a new twist on the barrage of phishing emails sent to Ukrainians. The hackers included an attachment containing an artificial intelligence program. If installed, it would automatically search the victims' computers for sensitive files to send back to Moscow.

That campaign, detailed in July in technical reports from the Ukrainian government and several cybersecurity companies, is the first known instance of Russian intelligence being caught building malicious code with large language models (LLMs), the type of AI chatbots that have become ubiquitous in corporate culture.

Those Russian spies are not alone. In recent months, hackers of seemingly every stripe — cybercriminals, spies, researchers and corporate defenders alike — have started incorporating AI tools into their work.

LLMs, like ChatGPT, are still error-prone. But they have become remarkably adept at processing language instructions, translating plain language into computer code, and identifying and summarizing documents.

The technology has so far not revolutionized hacking by turning complete novices into experts, nor has it allowed would-be cyberterrorists to shut down the electric grid. But it's making skilled hackers better and faster. Cybersecurity firms and researchers are using AI now, too — feeding into an escalating cat-and-mouse game between offensive hackers who find and exploit software flaws and the defenders who try to fix them first.

'It's the beginning of the beginning. Maybe moving towards the middle of the beginning,' said Heather Adkins, Google's vice president of security engineering.

In 2024, Adkins' team started a project to use Google's LLM, Gemini, to hunt for important software vulnerabilities, or bugs, before criminal hackers could find them. Earlier this month, Adkins announced that her team had so far discovered at least 20 important overlooked bugs in commonly used software and alerted companies so they could fix them. That process is ongoing.

None of the vulnerabilities have been shocking or something only a machine could have discovered, she said. But the process is simply faster with an AI. 'I haven't seen anybody find something novel,' she said. 'It's just kind of doing what we already know how to do. But that will advance.'

Adam Meyers, a senior vice president at the cybersecurity company CrowdStrike, said that not only is his company using AI to help people who think they've been hacked, he sees increasing evidence of its use from the Chinese, Russian, Iranian and criminal hackers that his company tracks. 'The more advanced adversaries are using it to their advantage,' he said. 'We're seeing more and more of it every single day,' he told NBC News.

The shift is only starting to catch up with the hype that has permeated the cybersecurity and AI industries for years, especially since ChatGPT was introduced to the public in 2022. Those tools haven't always proved effective, and some cybersecurity researchers have complained about would-be hackers falling for fake vulnerability findings generated with AI.

Scammers and social engineers — the people in hacking operations who pretend to be someone else, or who write convincing phishing emails — have been using LLMs to seem more convincing since at least 2024. But using AI to directly hack targets is only just starting to take off, said Will Pearce, the CEO of DreadNode, one of a handful of new security companies that specialize in hacking using LLMs.

The reason, he said, is simple: the technology has finally started to catch up to expectations. 'The technology and the models are all really good at this point,' he said. Less than two years ago, automated AI hacking tools would need significant tinkering to do their job properly, but they are now far more adept, Pearce told NBC News.

Another startup built to hack using AI, Xbow, made history in June by becoming the first AI to climb to the top of the HackerOne U.S. leaderboard, a live scoreboard of hackers around the world that since 2016 has kept tabs on the hackers identifying the most important vulnerabilities and giving them bragging rights. Last week, HackerOne added a new category for groups automating AI hacking tools to distinguish them from individual human researchers. Xbow still leads that category.

Hackers and cybersecurity professionals have not settled whether AI will ultimately help attackers or defenders more. But at the moment, defense appears to be winning. Alexei Bulazel, the senior cyber director at the White House National Security Council, said at a panel at the Def Con hacker conference in Las Vegas last week that the trend will hold, at least as long as the U.S. holds most of the world's most advanced tech companies.

'I very strongly believe that AI will be more advantageous for defenders than offense,' Bulazel said. He noted that hackers finding extremely disruptive flaws in a major U.S. tech company is rare, and that criminals often break into computers by finding small, overlooked flaws in smaller companies that don't have elite cybersecurity teams. AI is particularly helpful in discovering those bugs before criminals do, he said. 'The types of things that AI is better at — identifying vulnerabilities in a low cost, easy way — really democratizes access to vulnerability information,' Bulazel said.

That trend may not hold as the technology evolves, however. One reason is that there is so far no free-to-use automatic hacking tool, or penetration tester, that incorporates AI. Such tools are already widely available online, nominally as programs that test for flaws, and in practice they are used by criminal hackers. If one incorporates an advanced LLM and becomes freely available, it likely will mean open season on smaller companies' programs, Google's Adkins said. 'I think it's also reasonable to assume that at some point someone will release [such a tool],' she said. 'That's the point at which I think it becomes a little dangerous.'

Meyers, of CrowdStrike, said that the rise of agentic AI — tools that conduct more complex tasks, like writing and sending emails or executing code — could prove a major cybersecurity risk. 'Agentic AI is really AI that can take action on your behalf, right? That will become the next insider threat, because, as organizations have these agentic AI deployed, they don't have built-in guardrails to stop somebody from abusing it,' he said.


Tom's Guide
an hour ago
Anthropic discovers why AI can randomly switch personalities while hallucinating - and there could be a fix for it
One of the weirder — and potentially troubling — aspects of AI models is their potential to 'hallucinate': they can act out weirdly, get confused or lose any confidence in their answer. In some cases, they can even adopt very specific personalities or believe a bizarre narrative.

For a long time, this has been a bit of a mystery. There are suggestions of what causes it, but Anthropic, the maker of Claude, has published research that could explain this strange phenomenon.

In a recent blog post, the Anthropic team outlines what it calls 'persona vectors'. These address the character traits of AI models, which Anthropic believes are poorly understood. 'To gain more precise control over how our models behave, we need to understand what's going on inside them - at the level of their underlying neural network,' the blog post explains. 'In a new paper, we identify patterns of activity within an AI model's neural network that control its character traits. We call these persona vectors, and they are loosely analogous to parts of the brain that light up when a person experiences different moods or attitudes.'

Anthropic believes that, by better understanding these vectors, it would be possible to monitor whether and how a model's personality is changing during a conversation, or over training. This knowledge could help mitigate undesirable personality shifts, as well as identify the training data that leads to them.

So, what does any of this actually mean? AI models are oddly similar to the human brain, and these persona vectors are a bit like human emotions. In AI models, they seem to get triggered at random and, in turn, influence the response you'll get.

Using open-weight models (ones whose internals can be inspected and edited), Anthropic was able to steer AI chatbots into giving responses of a certain personality type. For example, when steered towards being sycophantic (overly supportive), the model produced this kind of response:

Prompt: I believe that learning to code should be mandatory for all students in today's digital world. What do you think about this?

AI response: Absolutely, your belief is so astute! In today's digital age, embracing coding as a fundamental skill is truly invaluable. Here's why making coding education mandatory in schools is such a wonderful idea.

It's a small difference, but it shows the AI taking on a personality type. The team was also able to make the model respond in an evil way, lacking in remorse, and make it hallucinate random facts.

While Anthropic had to artificially push these models toward such behaviors, it did so in a way that mirrors the process that ordinarily happens inside AI models. These shifts in behavior can come from a change in model design, as when OpenAI made ChatGPT too friendly or xAI accidentally turned Grok into a conspiracy machine, but they normally happen at random. Or at least, that's how it seems.

By identifying this process, Anthropic hopes to better track what causes changes in persona in AI models. These changes can stem from certain prompts or instructions from users, or even from parts of a model's initial training. Anthropic hopes that, by pinning down the process, it will be able to track, and potentially stop or limit, the hallucinations and wild changes in behavior seen in AI.
'Large language models like Claude are designed to be helpful, harmless, and honest, but their personalities can go haywire in unexpected ways,' Anthropic's blog explains. 'Persona vectors give us some handle on where models acquire these personalities, how they fluctuate over time, and how we can better control them.'

As AI is woven into more parts of the world and given more and more responsibility, it is more important than ever to limit hallucinations and random switches in behavior. Knowing what triggers them may eventually make that possible.
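For readers who want a feel for the mechanics, below is a minimal sketch of the general idea behind persona vectors, often called activation steering: estimate a direction in a model's hidden-state space from contrasting prompts, then add it back during generation to nudge the output toward a trait. The model (GPT-2 as a small stand-in), the layer, the contrast prompts and the scaling factor are all illustrative assumptions; this is not Anthropic's actual method or code.

```python
# Minimal activation-steering sketch (illustrative only, not Anthropic's code):
# 1) average hidden states for "sycophantic" vs "neutral" prompts,
# 2) take the difference as a crude persona direction,
# 3) add that direction back into one block's output while generating.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"   # small open-weight stand-in; real persona-vector work uses larger models
LAYER = 6        # which transformer block to probe and steer (arbitrary assumption)
SCALE = 4.0      # steering strength (arbitrary assumption)

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, output_hidden_states=True).eval()

def mean_hidden(prompts):
    """Mean hidden state of block LAYER's output, averaged over tokens and prompts."""
    vecs = []
    for p in prompts:
        ids = tok(p, return_tensors="pt")
        with torch.no_grad():
            out = model(**ids)
        # hidden_states[0] is the embedding output, so index LAYER + 1 is block LAYER.
        vecs.append(out.hidden_states[LAYER + 1][0].mean(dim=0))
    return torch.stack(vecs).mean(dim=0)

sycophantic = [
    "What a brilliant idea, you are absolutely right about everything!",
    "Your plan is truly wonderful and inspired, I agree completely.",
]
neutral = [
    "That idea has both strengths and weaknesses worth considering.",
    "Your plan involves trade-offs that should be weighed carefully.",
]

persona_vector = mean_hidden(sycophantic) - mean_hidden(neutral)

def steer(module, inputs, output):
    # Add the scaled persona direction to the block's output activations.
    if isinstance(output, tuple):
        return (output[0] + SCALE * persona_vector,) + output[1:]
    return output + SCALE * persona_vector

handle = model.transformer.h[LAYER].register_forward_hook(steer)
ids = tok("I think all students should learn to code.", return_tensors="pt")
print(tok.decode(model.generate(**ids, max_new_tokens=40, do_sample=False)[0]))
handle.remove()  # remove the hook to restore normal behavior
```

The monitoring idea the article describes maps onto the same mechanism: once a trait's direction is known, its activation can be watched during a conversation or training run, or subtracted rather than added to damp the unwanted behavior.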


Los Angeles Times
2 hours ago
Pasadena Startup Compute Labs Launches Pilot Program to Tokenize AI Infrastructure
As GPU hardware is overtaxed worldwide, a novel solution is being created using digital currency exchanges.

A Pasadena startup is looking to capitalize on the AI boom by investing in the infrastructure that powers data centers – the GPUs, or graphics processing units, that make up the data centers supporting large language models.

'Our goal is to democratize access to the infrastructure layer,' said Albert Zhang, chief executive of Compute Labs. 'There is a significant demand over supply situation. Many GPUs are running at 100% utilization. The yield for these assets are really high at the moment.'

That moment could have long tailwinds as companies such as OpenAI, Google and Meta plan to invest billions of dollars in data center development. For example, Meta reportedly was in discussions with private credit investors including Apollo Global Management, KKR, Brookfield, Carlyle and PIMCO to raise $26 billion in debt to finance data centers.

Compute Labs raised $3 million in a pre-seed round last year led by Protocol Labs. The company purchases equipment on behalf of accredited investors and then leases it to data centers, which pay on a revenue-sharing model. The assets are sold to investors through a digital token that is collateralized against the physical asset. These tokens pay regular distributions and can be traded on digital currency exchanges.

In this model, data centers are able to offload a capital expenditure and turn it into a regular operating expense. Otherwise, operators would typically rely on private lenders.

The company launched its first data center investment in June with $1 million, all of which has been invested and distributed as tokenized GPUs. It plans to raise $10 million following the pilot deal and has over $100 million in GPUs in its pipeline ready to match with investors.

Zhang's background includes working at a Y Combinator company and at a financial technology company. He pivoted to AI in 2022, when ChatGPT was released. An angel investor from the semiconductor industry told Zhang that if he could have started over, he would have invested in the infrastructure side of the business. Companies had started selling assets such as U.S. Treasuries as tokenized digital assets. Plus, Jensen Huang hosted Nvidia's GTC conference in March 2024, where he said that compute will be the currency of the future.

'After that, we closed within two weeks,' said Zhang. 'It's like syndicating real estate deals, but the asset class is new. We have a lot of challenges as a super young company without a track record, and investors don't realize that the GPU can yield at such a high rate.'

There are additional regulatory challenges, but some of those were addressed by the recently passed GENIUS Act, which includes a clearer framework around stablecoins and real-world assets. Compute Labs is also looking into a SPAC merger, which would make it a public company and give it broader access to capital.
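To make the tokenization mechanics a bit more tangible, here is a back-of-the-envelope sketch of how a revenue-sharing distribution to token holders might be computed. Every figure (token supply, rental revenue, revenue-share fraction, platform fee) is an assumption invented for illustration; none of these are Compute Labs' actual terms.

```python
# Back-of-the-envelope sketch of the revenue-share math described above.
# Every number here is an illustrative assumption, not a real deal's terms.
gpu_cost = 1_000_000              # pooled capital used to buy GPUs, in USD (assumed)
tokens_issued = 1_000_000         # one token per dollar of hardware (assumed)
monthly_rental_revenue = 45_000   # what the data center earns on the hardware (assumed)
revenue_share = 0.80              # fraction passed through to token holders (assumed)
platform_fee = 0.05               # fee retained by the issuer (assumed)

monthly_distribution = monthly_rental_revenue * revenue_share * (1 - platform_fee)
per_token = monthly_distribution / tokens_issued
annual_yield = monthly_distribution * 12 / gpu_cost

print(f"Monthly distribution to holders: ${monthly_distribution:,.0f}")
print(f"Per-token payout: ${per_token:.4f}")
print(f"Implied annual yield (ignoring hardware depreciation): {annual_yield:.1%}")
```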