
Building AI Foundational Models And Generative AI That Expertly Performs Mental Health Therapy

Forbes

22-04-2025

  • Health
  • Forbes


In today's column, I examine a fast-moving trend involving the development of AI foundational models and generative AI that are specifically tailored to perform mental health therapy. Building such models is no easy feat. Startups are embarking fervently down this rocky pathway, earnestly attracting top-notch VC investments. Academic researchers are trying to pull a rabbit out of a hat to figure out how this might best be accomplished and whether it is truly feasible. It turns out that some of these efforts are genuine and laudable, while others are rather shallow and consist mainly of proverbial smoke-and-mirrors. Let's talk about it.

This analysis of AI breakthroughs is part of my ongoing Forbes column coverage on the latest in AI, including identifying and explaining various impactful AI complexities (see the link here).

As a quick background, I've been extensively covering and analyzing a myriad of facets about the advent of modern-era AI that ostensibly performs mental health advice and undertakes AI-driven therapy. This rising use of AI has principally been spurred by evolving advances and the widespread adoption of generative AI and large language models (LLMs). There is little doubt that this is a rapidly developing field and that there are tremendous upsides to be had, but at the same time, regrettably, hidden risks and outright gotchas come into these endeavors too. I frequently speak up about these pressing matters, including in an appearance last year on an episode of CBS 60 Minutes. For a quick summary of some of my posted columns on this evolving topic, see the link here, which briefly recaps about forty of the over one hundred column postings that I've made on the subject. If you are new to the topic of AI for mental health, you might want to consider reading my recent analysis that also recounts a new stellar initiative at the Stanford University Department of Psychiatry and Behavioral Sciences called AI4MH, see the link here.

Now to the matter at hand. One of the Holy Grails underlying the entirety of AI for mental health consists of designing, building, testing, and fielding AI that is especially tailored for performing mental health therapy. I will take you behind the scenes so that you'll have a semblance of how this is being undertaken. Please know that we are still in the early days of devising robust and full-on AI for mental health advisement. The good news is that there is a lot we already know, which I'll be sharing with you, and it hopefully might inspire more interested parties to join these efforts. The bad news is that since these are still rough-edged efforts, we don't yet have a clearcut, ironclad path ahead. A whole lot of wide-eyed speculation abounds. I'd say that's more of an opportunity and exciting challenge, as opposed to being a mood dampener or showstopper. We need to keep our noses to the grindstone, use some elbow grease, and get this suitably figured out. Doing so will benefit the mental wherewithal of society all told. Yes, it's indeed that significant.

Let's begin by making sure we are all on the same page about the nature of AI foundational models. For any trolls out there, I am going to keep things short and simple by necessity, so please don't get out of sorts (I've also included links throughout to my in-depth coverage for those readers who want to learn more).

When you use ChatGPT, Claude, Gemini, or any of the various generative AI apps, you are making use of an underlying large language model (LLM).
This consists of a large-scale internal structure, typically an extensive artificial neural network (ANN); see my detailed explanation at the link here. Think of this as a massive data structure that houses mathematical patterns associated with human words.

How did the LLM land on the patterns associated with human words? The usual method is that the AI maker scans data across the Internet to find essays, poems, narratives, and just about any human-written materials that are available for perusal. The text is then examined by the AI algorithms to ascertain crucial mathematical and computational patterns found among the words that we employ. With some additional tuning by the AI maker, the LLM becomes generative AI, namely that you enter prompts and the AI responds by generating a hopefully suitable response. I'm assuming you've used at least one or more generative AI apps. If you've done so, I think you would generally agree that responses appear to be amazingly fluent. That's because the pattern matching has done a good job of mathematically and computationally figuring out the associations between words.

The underlying base model that serves as a core LLM is commonly referred to as an AI foundational model. It is the foundation upon which the rest of the AI resides. AI makers will construct a foundational model based on their preferences and then produce variations of that model. As an example of this notion of variations, an AI maker might use their foundational model and create a slightly adjusted version that is faster at processing prompts, though perhaps this sacrifices accuracy. They then market that new version as their speedy option. Now then, they might have another variant that does in-depth logical reasoning, which is handy. But that version might be slower to respond due to the added computational efforts involved. You get the idea. Each of the generative AI apps that you've used is based on a core foundational model, essentially the LLM that is running the show.

Each of the AI makers decides how they are going to design and build their foundational model. There isn't one standard per se that everyone uses. It is up to the AI maker to devise their own, or possibly license someone else's, or opt to use an open-source option, etc. An interesting twist is that some believe we've got too much groupthink going on, whereby nearly all the prevailing generative AI apps are pretty much designed and built the same way. Surprisingly, perhaps, most of the AI foundational models are roughly the same. A worry is that we might all be heading toward a dead-end. Maybe the prevailing dominant approach is not going to scale up. Everyone is stridently marching to the same tune. A totally different tune might be needed to reach greater heights of AI. For my coverage of novel ways of crafting AI foundational models that work differently than the conventional means, see the link here and the link here.

By and large, there is an AI insider saying about most LLMs and generative AI, which is that the AI is typically a mile long and an inch deep. The gist is that a general-purpose generative AI or LLM is good across the board (said to be a mile long), but not an expert in anything (i.e., it's just an inch deep). The AI has only been broadly data-trained. That's good if you want AI that can engage in everyday conversation. The exquisiteness of this is that you can ask very wide-ranging questions and almost certainly get a cogent answer (well, much of the time, but not all the time).
The unfortunate aspect is that you probably won't be able to get suitable answers to deep questions. By deep, I mean asking questions that are based on particular expertise and rooted in a specific domain. Remember that the AI is a mile long but only an inch deep. Once you've probed below that inch of depth, you never know what kind of an answer you will get. Sometimes, the AI will tell you straight out that it can't answer your question, while other times the AI tries to fake an answer and pull the wool over your eyes; see my coverage on this AI trickery at the link here.

I tend to classify AI foundational models into two major categories: general-purpose AI foundational models and domain-specific AI foundational models. The usual generative AI is in the first category and consists of using a general-purpose AI foundational model. That's what you typically are using. Mile long, inch deep.

Suppose that you want to use AI that is more than just an inch deep. You might be seeking to use AI that has in-depth expertise in financial matters, or perhaps contains expertise in the law (see my coverage of domain-specific legal AI at the link here), or might have expertise in medicine, and so on. That would be a domain-specific AI foundational model. A domain-specific AI foundational model is purposely designed, built, tested, and fielded to be specific to a chosen domain. Be cautious in assuming that any domain-specific AI foundational model is on par with human experts. Nope. We aren't there yet. That being said, a domain-specific AI foundational model can at times be as good as human experts, possibly even exceed human experts, under some conditions and circumstances.

There is a useful survey paper that was posted last year that sought to briefly go over some of the most popular domain-specific AI foundational models in terms of the domains chosen, such as autonomous driving, mathematical reasoning, finance, law, medicine, and other realms (the paper is entitled 'An overview of domain-specific foundation model: Key technologies, applications and challenges' by Haolong Chen, Hanzhi Chen, Zijian Zhao, Kaifeng Han, Guangxu Zhu, Yichen Zhao, Ying Du, Wei Xu, and Qingjiang Shi, arXiv, September 6, 2024). Keep in mind that the domain-specific arena is changing daily, so you'll need to keep your eyes open on what the latest status is.

AI foundational models have four cornerstone ingredients: structure, algorithms, data, and interactivity. Those four elements are used in both general-purpose AI foundational models and domain-specific AI foundational models. That is a commonality of their overall design and architectural precepts. Currently, since general-purpose models by far outnumber domain-specific ones, the tendency is to simply replicate the guts of a general-purpose model when starting to devise a domain-specific model. There is a somewhat lazy belief that there is no sense in reinventing the wheel. You might as well leverage what we already know works well.

I have predicted that we will gradually witness a broadening or veering of domain-specific models from the predominant general-purpose ones. This makes abundant sense since a given domain is bound to necessitate significant alterations of the four cornerstone elements that differ from what suffices for a general-purpose model. Furthermore, we will ultimately have families of domain-specific models. For example, one such family would consist of domain-specific models for mental health therapy. These would be mental health domain-specific AI foundational models that are substantially similar in their domain particulars. Envision a library of those base models. This would allow an AI developer to choose which base they want to use when instantiating a new mental health therapy model.
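To make the idea of a reusable library of base models a bit more concrete, here is a minimal sketch in Python of how the four cornerstone ingredients (structure, algorithms, data, and interactivity) might be captured as a declarative spec that an AI developer could pick from. The field names, entries, and registry are hypothetical illustrations on my part, not an actual product or industry standard.

```python
from dataclasses import dataclass

@dataclass
class FoundationModelSpec:
    """Hypothetical spec capturing the four cornerstone ingredients."""
    name: str
    structure: str         # e.g., transformer-based artificial neural network
    algorithms: list[str]  # training and tuning methods applied
    data: list[str]        # corpora used for data training
    interactivity: str     # how users (or programs) converse with the model
    domain: str = "general-purpose"

# A tiny illustrative "library" of base models to choose from.
MODEL_LIBRARY = {
    "general-base": FoundationModelSpec(
        name="general-base",
        structure="large transformer ANN",
        algorithms=["next-token pretraining", "instruction tuning", "RLHF"],
        data=["broad Internet text"],
        interactivity="open-ended natural-language chat",
    ),
    "mh-therapy-base": FoundationModelSpec(
        name="mh-therapy-base",
        structure="large transformer ANN",
        algorithms=["next-token pretraining", "domain fine-tuning", "RLHF", "RLDHF"],
        data=["broad Internet text", "therapy guidebooks", "session transcripts"],
        interactivity="therapeutic dialogue with safety guardrails",
        domain="mental health therapy",
    ),
}

def instantiate(base: str) -> FoundationModelSpec:
    """Pick a base from the library when starting a new project."""
    return MODEL_LIBRARY[base]

print(instantiate("mh-therapy-base").algorithms)
```

The design intent is simply that a mental health therapy project would start from a base whose data, algorithms, and interactivity were chosen with the domain in mind, rather than copying a general-purpose spec wholesale.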
Crafting domain-specific AI foundational models is a bit of a gold rush these days. You can certainly imagine why that would be the case. The aim is to leverage the capabilities of generative AI and LLMs to go beyond answering generic, across-the-board questions and be able to delve into the depths of domain-specific questions.

People are perturbed when they discover that general-purpose AI foundational models cannot suitably answer their domain-specific questions. The reaction is one of dismay. How come this seemingly smart AI cannot answer questions about how to cure my bladder ailment, or do my taxes, or a slew of other deep-oriented aspects? Users will eventually awaken to the idea that they want and need AI that has in-depth expertise. People will even be willing to switch back and forth between the topmost general-purpose AI and a particular best-in-class domain-specific AI that befits their needs. There is gold in them thar hills for domain-specific AI, including and especially in the mental health domain.

I've got a handy rule for you. The rule is that domain-specific AI foundational models are not all the same. Here's what I mean. The nature of a given domain dictates what should be done regarding the four cornerstone elements of structure, algorithms, data, and interactivity. A finance model is going to be shaped differently than a mental health therapy model. And so on. You would be unwise to merely copy one domain-specific model and assume it will immediately be usable in some other domain. I will in a moment be discussing the unique or standout conditions of a mental health therapy model versus other particular domains.

Using ChatGPT, Claude, Gemini, or any of the generative AI apps that are built on a general-purpose AI foundational model will only get you so far when it comes to seeking mental health advice. Again, the mile-long, inch-deep problem is afoot. You ask the AI for mental health advice, and it will readily comply. The problem, though, is that you are getting shallow mental health advice. People don't realize this. They assume that the advice is of the topmost quality. The AI likely leads you down that primrose path. The AI will provide answers that are composed in a manner to appear fully professional and wholly competent. A user thinks they just got the best mental health advice that anything or anyone could possibly provide. AI makers have shaped their AI to have that appearance. It is a sneaky and untoward practice. Meanwhile, to try and protect themselves, the AI makers' licensing agreements usually indicate in small print that users are not to use the AI for mental health advice and instead are to consult a human therapist; see my discussion on this caginess at the link here and the link here. Worse still, perhaps, at times general-purpose generative AI will produce a heaping of psychobabble, seemingly impressive-sounding mental health advice that is nothing more than babbling psychological words of nothingness; see my analysis at the link here.

Let's consider the big picture on these matters. There are three key avenues by which generative AI is devised for mental health advisement: (1) relying on generic general-purpose generative AI, (2) using customized generative AI, such as GPTs set up to provide mental health therapy, and (3) building domain-specific AI foundational models for mental health. Some quick thoughts on those three avenues. I've already mentioned the qualms about generic generative AI dispensing mental health advice.
The use of customized generative AI, such as GPTs that are coded to provide mental health therapy, is a slight step up, but since just about anyone can make those GPTs, it is a dicey affair, and you should proceed with the utmost caution; see my comments and assessment at the link here and the link here. Thus, if we really want to have generative AI that leans suitably into providing mental health advice, the domain-specific AI foundational model is the right way to go.

Subtypes Of Domain-Specific Models

Domain-specific models in general are subdivided into two major subtypes: domain-only models and hybrid models. A domain-only type assumes that the model can be devised nearly exclusively based on the domain at hand. This is generally a rare occurrence but does make sense in certain circumstances. The hybrid subtype recognizes that sometimes (a lot of the time) the chosen domain itself is going to inextricably require aspects of the general-purpose models. You see, some domains lean heavily into general-purpose facets. They cannot realistically be carved out of the general-purpose capability. You would end up with an oddly limited and altogether non-functional domain-specific model.

Let's see how this works. Suppose I want to make an AI foundational model that is a domain expert at generating mathematical proofs. That's all it needs to do. It is a domain-specific model. When someone uses this mathematical proof model, they enter a mathematical proposition in a customary predicate logic notation. Programmers might think of this interactivity as being on par with a programming language such as Prolog. What doesn't this domain-only model need? Two traits aren't especially needed in this instance: broad conversational fluency and a wide-ranging worldview. The mathematical proofs model doesn't need to be able to quote Shakespeare or be a jolly conversationalist. All it does is take in prompts consisting of mathematical propositions and then generate pristine mathematical proofs. Nothing else is needed, so we might as well keep things barebones and keenly focused on the domain task at hand.

Let's starkly contrast this mathematical proof model with a mental health therapy model. As I will point out shortly, the domain of mental health therapy requires strong fluency and a robust worldview capability. Why? Because the requisite domain-specific characteristics in the mental health care domain mandate that therapy be conducted in fluent natural language and with an unstinting semblance of a worldview.

To illustrate the fluency and worldview consideration, go along with me on a scenario entailing a junior-level AI developer deciding to build an AI foundational model for mental health therapy. First, the AI developer grabs up a generic general-purpose AI foundational model that is completely empty. No data training has been done with it. It essentially is a blank slate, an empty shell. This scenario is in keeping with my earlier point about the tendency to reuse general-purpose models when devising domain-specific models. The AI developer then gathers whatever data and text they can find on the topic of mental health therapy. This includes books, guidebooks like DSM-5 (see my analysis of using DSM-5 for generative AI at the link here), research papers, psychology articles, and the like. In addition, transcripts of actual real-life client-therapist therapy sessions can be quite useful, though finding them and cleaning them tends to be problematic, plus they aren't readily available as yet on a large enough scale (see my discussion at the link here).

The AI developer proceeds to use the collected data to train the AI on mental health therapy. This is immersing the AI into the domain of mental health therapy. The AI will mathematically and computationally find patterns about the words associated with mental health therapy. With a bit of fine-tuning, voila, a seemingly ready-to-go domain-specific model for mental health care hits the streets. Boom, drop the mic.
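For a rough sense of what that junior developer is doing under the hood, here is a minimal sketch that assumes the Hugging Face transformers and datasets libraries and trains a small causal language model from scratch on nothing but a narrow mental-health text file. The file name, model size, and hyperparameters are placeholders of mine; a serious effort would involve vastly more data, compute, and care.

```python
from datasets import load_dataset
from transformers import (AutoTokenizer, GPT2Config, GPT2LMHeadModel,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

# Hypothetical corpus: a text file of mental-health books, papers, and transcripts.
corpus = load_dataset("text", data_files={"train": "mental_health_corpus.txt"})

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # reuse an existing tokenizer
tokenizer.pad_token = tokenizer.eos_token

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = corpus["train"].map(tokenize, batched=True, remove_columns=["text"])

# A blank-slate model: random weights, no general worldview at all.
model = GPT2LMHeadModel(GPT2Config(vocab_size=tokenizer.vocab_size))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="mh-only-model", num_train_epochs=3,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # result: steeped in the domain text, and nothing else
```

Note that the model starts from random weights, which is precisely why it ends up knowing the domain material yet lacking any broader worldview.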
Things aren't likely to work out for this AI developer. I'll explain why via an example.

Imagine that you are a therapist (maybe you are!). A patient is interacting with you and says this: 'I was driving my car the other day and saw a lonely, distressed barking dog at the side of the road, and this made me depressed.' I'm sure that you instantly pictured in your mind that this patient was in their car, they were at the wheel, they were driving along, and they perchance saw a dog outside the window of the vehicle. The dog was all by itself. It must have looked upset. It was barking, doing so possibly for attention or due to anguish. Your patient observed the dog and reacted by saying that they became depressed. Easy-peasy in terms of interpreting what the patient said.

If we give this same line as a prompt to the domain-specific AI that we crafted above, there is a crucial issue that we need to consider. Remember that the AI wasn't data trained across the board. We only focused on content about mental health. Would this domain-specific AI be able to interpret what it means for a person to be driving a car? Well, we don't know if that was covered in the content solely about mental health. Would the AI make sense of the aspect that there was a dog? What's a dog? The AI has not been data trained in a wide way about the world at large. We have an AI that is possibly a mile deep, but an inch wide.

That won't do. You cannot reasonably undertake mental health therapy if the broadness of the world is unknown. Envision a mental health therapist who grew up on a remote, isolated island and never discovered the rest of the world. They would have a difficult, if not impossible, time comprehending what their worldly patient from a bustling cosmopolitan city is telling them. You might get lucky, and maybe the domain-specific AI that is data trained solely in mental health content will be wide enough to be usable from a worldview standpoint, but I would not hold my breath on that assumption. The odds are that you'll need to have the AI versed in an overarching worldview and have fullness of fluency.

So, you can do one of three actions: (1) start with a wide general-purpose model and then add the domain depth, (2) build a small domain-only model and use it to train a larger general-purpose model, or (3) train on breadth and depth at the same time. Each of those options has tradeoffs.

The first action is the most common, namely wideness followed by depth. You find a general-purpose AI foundational model that has been data trained across the board. Assuming you can get full access to it, you then further data train the AI to be steeped in mental health. A frequent approach entails using a retrieval-augmented generation (RAG) method; see my explanation at the link here. Essentially, you take all the gathered mental health content and have the AI draw upon it, whether by additional data training or by retrieving relevant passages as needed. As an aside, this used to be quite a limited approach because the AI models had narrow restrictions on how much added data they could peruse and assimilate, but those limits are getting larger by the day.
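To give a flavor of the retrieval side of that first action, here is a toy sketch of retrieval-augmented generation using TF-IDF similarity from scikit-learn: the gathered mental-health passages are searched for the snippets most relevant to the user's question, and those snippets are stuffed into the prompt that would be sent to a general-purpose model. The passages are invented examples of mine, and production RAG setups typically use learned embeddings and a vector database rather than TF-IDF.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical snippets drawn from the gathered mental-health content.
passages = [
    "Cognitive behavioral therapy focuses on identifying distorted thought patterns.",
    "Active listening and reflective statements help build therapeutic rapport.",
    "Clinicians generally avoid telling patients that they are permanently 'cured'.",
]

vectorizer = TfidfVectorizer().fit(passages)
passage_vecs = vectorizer.transform(passages)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k passages most similar to the user's question."""
    sims = cosine_similarity(vectorizer.transform([question]), passage_vecs)[0]
    top = sims.argsort()[::-1][:k]
    return [passages[i] for i in top]

def build_prompt(question: str) -> str:
    """Augment the prompt with retrieved domain context."""
    context = "\n".join(retrieve(question))
    return f"Use the following therapy notes as context:\n{context}\n\nUser: {question}"

# The augmented prompt would then be sent to the general-purpose LLM of your choice.
print(build_prompt("Is it okay to tell a client they are cured?"))
```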
The second listed action is less common but a growing option. It takes a somewhat different route. You devise a small language model (SLM) that is steeped solely in mental health. Then, you find a large language model (LLM) that you believe is suitable and has the worldview that you are seeking. You then use the SLM to data train the LLM on the domain of mental health. For more details on the use of SLMs to train LLMs, a process known as knowledge distillation, see my discussion at the link here (a simplified sketch of the distillation idea appears below).

The third action is that you do the initial data training by dovetailing both breadth and depth at the same time. You not only scan widely across the Internet, but you also simultaneously feed in the mental health content. From the perspective of the AI, it is all merely data being fed in, and nothing came first or last.
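Here is a stripped-down sketch of the knowledge-distillation idea behind that second action, written in PyTorch over a toy vocabulary: a small domain 'teacher' supplies soft next-token probabilities, and the larger general-purpose 'student' is nudged toward them via a KL-divergence term blended with the ordinary language-modeling loss. The tiny linear stand-in models, the temperature, and the mixing weight are placeholders of mine purely to show how the loss is constructed.

```python
import torch
import torch.nn.functional as F

vocab, dim, temperature, alpha = 1000, 64, 2.0, 0.5

# Toy stand-ins: in practice these would be a domain SLM and a general-purpose LLM.
teacher = torch.nn.Linear(dim, vocab)   # small model steeped in mental health
student = torch.nn.Linear(dim, vocab)   # large general model being domain-trained
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-4)

def distill_step(hidden: torch.Tensor, next_tokens: torch.Tensor) -> float:
    """One training step blending hard-label loss with teacher imitation."""
    student_logits = student(hidden)
    with torch.no_grad():
        teacher_logits = teacher(hidden)

    # Ordinary language-modeling loss on the actual next tokens.
    lm_loss = F.cross_entropy(student_logits, next_tokens)

    # KL divergence pulls the student toward the teacher's softened distribution.
    kd_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    loss = alpha * lm_loss + (1 - alpha) * kd_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example step with random stand-in features and targets.
print(distill_step(torch.randn(8, dim), torch.randint(0, vocab, (8,))))
```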
Let's mull over some bumps in the road associated with all three of the possibilities of how to proceed. A harrowing difficulty that arises in any of those methods is that you need to consider the potential inadvertent adverse consequences that can occur. In short, the conundrum is this. The depth of the mental health therapy aspects might get messed up by intermixing with the worldview aspects. It could be that while the intertwining is taking place, a generic precept about therapy overrides an in-depth element. Oops, we just demoted the depth-oriented content. All kinds of headaches can ensue.

For example, suppose that the in-depth content has a guidance rule about therapy that says you are never to tell a patient they are 'cured' when it comes to their mental health (it is a hotly debated topic, of which the assertion is that no one is mentally 'cured' per se in the same sense as overcoming, say, cancer, and it is misleading to suggest otherwise). Meanwhile, suppose the worldview elements had picked up content that says it is perfectly acceptable and even preferred to always tell a patient that they are 'cured'. These are two starkly conflicting therapeutic pieces of advice. The AI might retain both of those advisory nuggets. Which one is going to come to the fore? You don't know. It could be selected by random chance. Not good. Or it could be that while carrying on a therapy interaction, something suddenly triggers the AI to dive into the yes-cure over the no-cure, or it could happen the other way around. The crux is that since we are trying to combine breadth with depth, some astute ways of doing so are sorely needed. You can't just mindlessly jumble them together. Right now, dealing with this smorgasbord problem is both art and science. In case you were tempted to believe that all you need to do is tell the AI that the depth content about mental health always supersedes anything else that the AI has, even that fingers-crossed, spick-and-span solution has hiccups.

Let's shift gears to another integral element of constructing generative AI and how it comes up when devising a domain-specific AI model such as for mental health therapy. Let's start with a bit of cherished AI history. A notable make-or-break reason that ChatGPT became so popular when first released was due to OpenAI having made use of a technique known as reinforcement learning from human feedback (RLHF). Nearly all AI makers now employ RLHF as part of their development and refinement process before they release their generative AI. The process is simple but an immense game changer. An AI maker will hire humans to play with the budding generative AI, doing so before releasing the AI to the public. Those humans are instructed to carefully review the AI and provide guidance about what the AI is doing wrong and what it is doing right.

Think of this as the classic case of reinforcement learning that we experience in our daily lives. You are cooking eggs. When you leave the eggs in the frying pan a bit longer than usual, let's assume they come out better cooked. So, the next time you cook eggs, you leave them in even longer. Ugh, they got burned. You realize that you need to back down and not cook them for so long. The humans hired to guide AI do approximately the same thing. A common aspect involves giving guidance about the wording generated by the AI and especially about the tone of the AI. They tell the AI what is good to do, which the AI then mathematically and computationally internally calculates as a reward. They also tell the AI what not to do, a kind of mathematical and computational penalty.

Imagine that a hired person for this task enters a prompt into a nascent general-purpose AI asking why the sky is blue. The AI generates a response telling the person that they are stupid for having asked such an idiotic question. Well, we don't want AI to tell users they are stupid. Not a serviceable way to garner loyalty to the AI. The hired person tells the AI that it should not call users stupid, nor should the AI label any user question as being idiotic. If we have a whole bunch of these hired people hammering away at the AI for a few days or weeks, gradually the AI is going to pattern-match on what is the proper way to say things and what is the improper way to say things. The AI is getting human feedback on a reinforcement learning basis.

The reason that this was huge for the release of ChatGPT was that until that time, many of the released generative AI apps were outlandishly insulting users and often spewing foul curse words. AI makers had to quickly pull down their ill-disciplined AI. Reputations for some AI makers took a big hit. It was ugly. With ChatGPT, partially due to the RLHF, it was less prone to doing those kinds of unsavory actions. For more details on how RLHF works, see my discussion at the link here.
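To make the reward-and-penalty notion a bit more tangible, here is a toy sketch in PyTorch of the pairwise preference loss commonly used to turn that kind of human feedback into a trained reward model: for each pair of responses, the one the human reviewer preferred should receive the higher score. The crude character-histogram featurizer is a stand-in of mine; real systems score responses with a neural reward model built on top of the LLM itself.

```python
import torch
import torch.nn.functional as F

dim = 32
reward_model = torch.nn.Linear(dim, 1)   # maps response features to a scalar reward
optimizer = torch.optim.AdamW(reward_model.parameters(), lr=1e-3)

def embed(text: str) -> torch.Tensor:
    """Crude stand-in featurizer: character histogram folded into a fixed vector."""
    vec = torch.zeros(dim)
    for ch in text.lower():
        vec[ord(ch) % dim] += 1.0
    return vec / max(len(text), 1)

def preference_step(chosen: str, rejected: str) -> float:
    """Pairwise (Bradley-Terry style) loss: the preferred reply should score higher."""
    r_chosen = reward_model(embed(chosen))
    r_rejected = reward_model(embed(rejected))
    loss = -F.logsigmoid(r_chosen - r_rejected).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# One thumbs-up / thumbs-down pair, as a human reviewer might provide.
print(preference_step(
    chosen="The sky looks blue because air scatters shorter wavelengths of sunlight.",
    rejected="That is an idiotic question.",
))
```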
Assume that we have devised a domain-specific generative AI that is versed in mental health therapy. It has all kinds of in-depth expertise. We are feeling pretty good about what we've built. Should we summarily hand it over to the public? You would be foolhardy not to first do extensive RLHF with the AI. Allow me to explain why.

I have coined a modified catchphrase for RLHF, which I refer to as RLDHF, reinforcement learning from domain human feedback. It is akin to RLHF, but RLHF is usually about generic facets. That's still needed, but in the case of a domain-specific model, you must also undertake RLDHF. An example will illustrate the RLDHF approach. The usual RLHF is done to tone down the AI and make it docile, rarely ever going toe-to-toe with users. AI makers want users to like the AI and have good feelings about the AI. Therapists are not customarily docile in that same way. Sure, at times a human therapist will be very accommodating, but other times they need to share some tough news or get the patient to see things in a manner that might be upsetting to them. RLHF is usually done to explicitly avoid any kind of confrontations or difficult moments.

In the case of RLDHF, a maker of a domain-specific AI hires experts in the domain to provide reinforcement learning feedback to the AI during the final training stages. Envision that we hire mental health therapists to provide feedback to our budding domain-specific AI. They log into the AI and give thumbs up and thumbs down to how the AI is wording its interactions. The same applies to the tone of the conversations. The RLHF approach usually involves hiring just about any person who can converse with the AI and provide broad guidance. The RLDHF approach involves hiring experts, such as therapists, and having them provide domain-specific guidance, including not just factual aspects, but also the nature and tone of how therapy is best provided. Skipping the RLDHF is a recipe for disaster. The odds are that the therapy provided by the AI is going to fall in line with whatever RLHF has been undertaken. I've demonstrated what a wishy-washy AI-based therapist looks like, see the link here, and showcased how the proper use of RLDHF with mental health therapist guidance during training is indubitably and undoubtedly essential.

There is an additional form of feedback that can work wonders for an AI mental health therapy model. It is known as RLAIF, reinforcement learning from AI feedback. I realize you might not have heard of this somewhat newer approach, but it is readily gaining steam. Usage of RLAIF in domain-specific AI models pays off handsomely if done proficiently. First, please keep present in your mind everything I said about RLHF and RLDHF. We will make a tweak to those approaches. Instead of using humans to provide feedback to AI, we will use AI to provide feedback to AI. Say what? Things are straightforward and not at all off-putting. We set up some other external generative AI that we want to use to provide feedback to the generative AI that we are data training on mental health therapy. The two AIs will interact directly with each other. This is purely AI-to-AI conversation that is taking place.

For our mental health therapy AI, we will do two activities: (1) have the external AI pretend to be a patient, and (2) have the external AI pretend to be a therapist. This is natively accomplished via the persona functionality of LLMs; see my detailed coverage of personas at the link here and the link here. The AI-simulated patient can be relatively straightforward to devise. The persona of a therapist must be thoughtfully devised and utilized; otherwise, you'll send the AI being trained into a scattered morass. Rule of thumb: Don't use RLAIF unless you know what you are doing. It can be devilishly complicated (a schematic sketch of such an AI-to-AI loop appears below).

On a related consideration, the use of generative AI personas is a handy tool for human therapists too. For example, I've described how a therapist in training can use personas to test and refine their therapeutic acumen by interacting with AI that pretends to be a wide variety of patients, see the link here. Each such AI-driven patient can differ in terms of what they say, how they act, etc. Likewise, therapists can have the AI pretend to be a therapist. This allows a budding therapist to see what it is like to be a patient. Another useful angle is that the human therapist can learn about other styles of therapy. Setting up generative AI personas that simulate therapy styles and support mental health role playing has ins and outs, pluses and minuses; see my discussion at the link here.
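For a sense of how an RLAIF-style loop can be wired together, here is a schematic sketch in Python: one external model role-plays a patient persona, the therapy model being trained replies, and a second external model role-plays a supervising therapist that critiques the reply, with those critiques accumulating as AI-generated feedback. The chat() function is a placeholder of mine for whatever LLM provider you use, and the personas and prompts are illustrative only.

```python
def chat(system_prompt: str, user_message: str) -> str:
    """Placeholder: call your preferred LLM provider here. Canned reply for the sketch."""
    return f"[reply generated under persona: {system_prompt[:40]}...]"

PATIENT_PERSONA = (
    "You are role-playing a patient who has been feeling isolated and anxious. "
    "Speak naturally and stay in character."
)
JUDGE_PERSONA = (
    "You are role-playing a seasoned therapist supervising a trainee. "
    "Rate the trainee's reply from 1 to 10 and explain briefly."
)

def rlaif_round(therapy_model_reply_fn, topic: str) -> dict:
    """One AI-to-AI round: simulated patient -> therapy AI -> simulated judge."""
    patient_msg = chat(PATIENT_PERSONA, f"Start a session about: {topic}")
    ai_reply = therapy_model_reply_fn(patient_msg)   # the model being trained
    critique = chat(JUDGE_PERSONA,
                    f"Patient said: {patient_msg}\nTrainee replied: {ai_reply}")
    return {"patient": patient_msg, "reply": ai_reply, "feedback": critique}

# Example round with a stubbed-in therapy model under training.
print(rlaif_round(lambda msg: "It sounds like the loneliness has been weighing on you.",
                  topic="feeling isolated after a move"))
```

In practice, the judge's critiques or scores would be converted into the same kind of reward signal as the human feedback described earlier.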
Congrats, we've covered some of the bedrock fundamentals about devising AI foundational models for AI mental health therapy. You are up to speed on a baseline basis. There's a lot more to cover, which I'll be addressing in the second part of this series.

Devising an AI foundational model for mental health therapy is not a walk in the park. Be cautious and skeptical if you hear or see that some startup claims they can stand up a full-on AI foundational model for mental health within hours or days. Something doesn't jibe in such a brazen claim. They might be unaware of what mental health therapy consists of. They might be techies who are versed in AI but only faintly understand the complexities of mental health care. Perhaps they have an extremely narrow view of how therapy is undertaken. Maybe they figure that all they need to do is punch up generic general-purpose AI with a few spiffy system prompts and call it a day. Make sure to find out the specifics of what they have in mind. No hand waving allowed.

If you are thinking about putting together such a model, I urge you to do so and applaud you for your spirit and willingness to take on a fascinating and enthralling challenge. I sincerely request that you take on the formidable task with the proper perspective at hand. Be systematic. Be mindful.

The final word goes to Sigmund Freud, who made this pointed remark: 'Being entirely honest with oneself is a good exercise.' Yes, indeed, make sure to employ that sage wisdom as you embark on a journey and adventure of building an AI foundational model for mental health therapy. Be unambiguously honest with yourself and make sure you have the right stuff and right mindset to proceed. Then go for it.

AI For Mental Health Gets Attentively Analyzed Via Exciting New Initiative At Stanford University

Forbes

17-04-2025

  • Health
  • Forbes


Stanford University has launched an important new initiative on AI for mental health, AI4MH, doing so via the School of Medicine, Department of Psychiatry and Behavioral Sciences.

In today's column, I continue my ongoing coverage of the latest trends in AI for mental health by highlighting a new initiative at Stanford University, known aptly as AI4MH, undertaken by Stanford's esteemed Department of Psychiatry and Behavioral Sciences in the School of Medicine. Their inaugural launch of AI4MH took place on April 15, 2025, and luminary Dr. Tom Insel, M.D., famed psychiatrist and neuroscientist, served as the kick-off speaker. Dr. Insel is renowned for his outstanding work in mental health research and technology, and served as the Director of the National Institute of Mental Health (NIMH). He is also known for having founded several companies that innovatively integrate high-tech into mental health care.

This analysis of AI breakthroughs is part of my ongoing Forbes column coverage on the latest in AI, including identifying and explaining various impactful AI complexities (see the link here). Readers familiar with my coverage on AI for mental health might recall that I've closely examined and reviewed a myriad of important aspects underlying this rapidly evolving topic, doing so in over one hundred of my column postings. This includes analyzing the latest notable research papers and avidly assessing the practical utility of apps and chatbots employing generative AI and large language models (LLMs) for performing mental health therapy. I have spoken about those advances, such as during an appearance on a CBS 60 Minutes episode last year, and compiled the analyses into two popular books depicting the disruption and transformation that AI is having on mental health care.

It is with great optimism that I share here the new initiative at the Stanford School of Medicine on AI4MH, and I fully anticipate that this program will provide yet another crucial step in identifying where AI for mental health is heading and the impacts on society all told. Per the mission statement articulated for AI4MH:

Thanks go to the organizers of AI4MH whom I met at the inaugural event, including Dr. Kilian Pohl, Professor of Psychiatry and Behavioral Sciences (Major Labs and Incubator), Ehsan Adeli, Assistant Professor of Psychiatry and Behavioral Sciences (Public Mental Health and Population Sciences), and Carolyn Rodriguez, Professor of Psychiatry and Behavioral Sciences (Public Mental Health and Population Sciences), and others, for their astute vision and resolute passion for getting this vital initiative underway.

During his talk, Dr. Insel carefully set the stage, depicting the current state of AI for mental health care and insightfully exploring where the dynamic field is heading. His remarks established a significant point that I've been repeatedly urging, namely that our existing approach to mental health care is woefully inadequate and that we need to rethink and reformulate what is currently being done. The need, or shall we say, the growing demand for mental health care is astronomical, yet the available and accessible supply of quality therapists and mental health advisors is far too insufficient in numerous respects. I relished that this intuitive sense of the mounting issue was turned into a codified and well-structured set of five major factors by Dr. Insel: diagnosis, engagement, capacity, quality, and accountability. I'll recap the semblance of those essential factors.
Starting with diagnosis as a key factor, it is perhaps surprising to some to discover that the diagnosis of mental health is a lot more loosey-goosey than might otherwise be assumed. The layperson tends to assume that a precise and fully calculable means exists to produce a mental health diagnosis to an ironclad nth degree. This is not the case. If you peruse the DSM-5 standard guidebook, you'll quickly realize that there is a lot of latitude and imprecision underpinning the act of diagnosis. The upshot is that there is a lack of clarity when it comes to undertaking a diagnosis, and we need to recognize that this is a serious problem that requires much more rigor and reliability. For my detailed look at the DSM-5 and how generative AI leans into the guidebook contents while performing AI-based mental health diagnoses, see the link here.

The second key factor entails engagement. The deal is this. People needing or desiring mental health care are often unable to readily gain access to mental health care resources. This can be due to cost, logistics, and a litany of economic and supply/demand considerations. Dr. Insel noted a statistic that perhaps 60% of those potentially benefiting from therapy aren't receiving mental health care, and thus, a sizable proportion of people aren't getting needed help. That's a problem that deserves close scrutiny and outside-the-box thinking to resolve.

A related factor is capacity, the third of the five listed. We don't have enough therapists and mental health professionals, along with related facilities, to meet the existing and growing needs for mental health care. In the United States, for example, various published counts suggest there are approximately 200,000 therapists and perhaps 100,000 psychologists, supporting a population of nearly 350 million people. Taken at face value, that works out to roughly one such practitioner for every 1,100 or so people. That ratio won't cut it, and indeed, studies indicate that practicing mental health care professionals are overworked, highly stressed, and unable to readily manage workloads, which at times can riskily compromise quality of care. For my coverage of how therapists are using AI as a means of augmenting their practice, allowing them to focus more on their clients and sensibly cope with the heightened workloads, see the link here.

The fourth factor is quality. You can plainly see from the other factors how quality can be insidiously undercut. If a therapist is tight for time and trying to see as many patients as possible, seeking to maximize their mental health care for as many people as possible, the odds of quality taking a hit are relatively obvious. Overall, even with the best of intentions, quality is frequently fragmented and episodic. There is also a kind of reactive quality phenomenon, whereby after realizing that quality is suffering, a short-term boost in quality occurs, but this soon fizzles out, and the constraining infrastructure magnetically pulls things back to somewhat haphazard quality levels. For my analysis of how AI can be used to improve quality when it comes to mental health care, see the link here.

Accountability is the fifth factor. There's a famous quote attributed to the legendary management guru Peter Drucker that what gets measured gets managed. The corollary to that wisdom is that what doesn't get measured is bound to be poorly managed. The same holds true for mental health care. By and large, there is sparse data on the outcomes associated with mental health therapy.
Worse still, perhaps, the adoption of evidence-based mental health care is thin and leaves us in the dark about the big picture associated with the efficacy of therapy. For my discussion about AI as a means of collecting mental health data and spurring evidence-based care, see the link here and the link here.

The talk openly helped to clarify that we pretty much have a broken system when it comes to mental health care today, and that if we don't do something at scale about it, the prognosis is that things will get even worse. A tsunami of mental health needs is heading toward us. The mental health therapy flotilla currently afloat is not prepared to handle it and is barely keeping above water as is.

What can be done? One of a slew of intertwined opportunities includes the use of modern-day AI. The advent of advanced generative AI and LLMs has already markedly impacted mental health advisement across the board. People are consulting daily with generative AI on mental health questions. Recent studies, such as one published in the Harvard Business Review, indicate that the #1 use of generative AI is now for therapy-related advice (I'll be covering that in an upcoming post, please stay tuned). We don't yet have tight figures on how widespread the use of generative AI for mental health purposes is, but in my exploration of population-level facets, we know, for example, that there are reportedly 400 million weekly active users of ChatGPT, and likely several hundred million other users associated with Anthropic Claude, Google Gemini, Meta Llama, etc. Estimates of the proportion that might be using the AI for mental health insights are worth considering, and I identify various means at the link here.

It makes abundant sense that people would turn to generative AI for mental health facets. Most of the generative AI apps are free to use, tend to be available 24/7, and can be utilized just about anywhere on Earth. You can create an account in minutes and immediately start conversing on a wide range of mental health aspects. Contrast those ease-of-use characteristics with having to find and use a human therapist. First, you need to find a therapist and determine whether they seem suitable to your preferences. Next, you need to set up an agreement for services, schedule time to converse with the therapist, deal with constraints on when the therapist is available, financially handle the costs of using the therapist, and so on. There is a sizable amount of friction associated with using human therapists. Contemporary AI is nearly friction-free in comparison.

There's more to the matter. People tend to like the sense of anonymity associated with using AI for this purpose. If you sought a human therapist, your identity would be known, and a fellow human would have your deepest secrets. Users of AI assume that they are essentially anonymous to AI and that AI won't reveal to anyone else their private mental health considerations. Another angle is that conversing with AI is generally a lot easier than doing so with a human therapist. The AI has been tuned by the AI makers to be overly accommodating. This is partially done to keep users loyal, such that if the AI were overbearing, then users would probably find some other vendor's AI to utilize.

Judgment is a hidden consideration that makes a big distinction, too. It goes like this. You see a human therapist. During a session, you get a visceral sense that the therapist is judging you, perhaps by the raising of their eyebrows or a harsher tone in their voice.
The therapist might explicitly express judgments about you to your face, which certainly makes sense in providing mental health guidance, though preferably done with a suitable bedside manner. None of that is normally likely to arise when using AI. The default mode of most generative AI apps is that they avidly avoid judging you. Again, this tuning is undertaken at the direction of the AI makers (in case you are interested in what an unfiltered, unfettered generative AI might say to users, see my analysis at the link here). A user of AI can feel utterly unjudged. Of course, you can argue whether that is a proper way to perform mental health advisement, but nonetheless, the point is that people are more likely to cherish the non-judgmental zone of AI.

As a notable aside, I've demonstrated that you can readily prompt AI to be more 'judgmental' and more introspective about your mental health, which overrides the usual default and provides a less guarded assessment (see the link here). In that sense, the AI isn't mired or stuck in an all-pleasing mode that would seem inconsistent with proper mental health assessment and guidance. Users can readily direct the AI as they prefer, or use customized GPTs that can provide the same change in functionality, see the link here. Use of AI in this context is not a savior per se, but it does provide a huge upside in many crucial ways.

A recurring question or qualm that I am asked about is whether the downsides or gotchas of AI are going to impede and possibly mistreat users when it comes to conveying suitable mental health advisement. For example, the reality is that the AI makers, via their licensing agreements, usually reserve the right to manually inspect a user's entered data, along with reusing the data to further train their AI, see my discussion at the link here. The gist is that people aren't necessarily going to have their entered data treated with any kind of healthcare-related privacy or confidentiality.

Another issue is the nature of so-called AI hallucinations. At times, generative AI produces confabulations, seemingly made up out of thin air, that appear to be truthful but are not grounded in factuality. Imagine that someone is using generative AI for mental health advice, and suddenly, the AI tells the person to do something untoward. Not good. The person might have become dependent on the AI, building a sense of trust, and not realize when an AI hallucination has occurred. For more on AI hallucinations, see my explanation at the link here.

What are we to make of these downsides? First, we ought to be careful not to toss out the baby with the bathwater (an old expression). Categorically rejecting AI for this type of usage would seem myopic and probably not even practical (for my assessment of the calls for banning certain uses of generative AI, see the link here). As far as we know, the ready access to generative AI for mental health purposes seems to outweigh the downsides (please note that more research and polling are welcomed and indeed required on these matters). Furthermore, there are advances in AI that are mitigating or eliminating many of the gotchas. AI makers are astute enough to realize that they need to keep their wares progressing if they wish to meet user needs and remain a viable money-making product or service.

An additional twist is that AI can be used by mental health therapists as an integral tool in their mental health care toolkit.
We don't need to fall into the mental trap of assuming that a patient uses either AI or a human therapist; they can use both in a smartly devised joint way. The conventional non-AI approach is the classic client-therapist relationship. I have coined a phrase for the new triad that we are entering into, labeled as the client-AI-therapist relationship. The therapist uses AI seamlessly in the mental health care process and embraces rather than rejects the capabilities of AI. For more on the client-AI-therapist triad, see my discussion at the link here and the link here.

I lean into the celebrated words of American psychologist Carl Rogers: 'In my early professional years, I was asking the question, how can I treat, or cure, or change this person? Now I would phrase the question in this way: how can I provide a relationship that this person may use for their personal growth?' That relationship is going to include AI, one way or another.

One quite probable view of the future is that we will inevitably have fully autonomous AI that can provide mental health therapy that is completely on par with human therapists, potentially even exceeding what a human therapist can achieve. The AI will be autonomously proficient without the need for a human therapist at the ready. This might be likened to the Waymo or Zoox of mental health therapy, referring to the emerging advent of today's autonomous self-driving cars. As a subtle clarification, currently existing self-driving cars are only at Level 4 of the standard autonomy scale, not yet reaching the topmost Level 5. Similarly, I have predicted that AI for mental health will likely initially attain Level 4, akin to the autonomy level of today's self-driving cars, and then be further progressed into Level 5. For my detailed explanation and framework for the levels of autonomy associated with AI for mental health, see the link here.

I wholly concur with Dr. Insel's suggested point that we need to consider the use of AI on an ROI basis, such that we compare apples to apples. Per his outlined set of pressing issues associated with the existing quagmire of how mental health care is taking place, we must take a thoughtful stance by gauging AI in comparison to what we have now. You see, we need to realize that AI, if suitably devised and adopted, can demonstrably aid in overcoming the prevailing mental health care system problems. Plus, AI will likely open the door to new possibilities. Perhaps we will discover that AI not only aids evidence-based mental health care but takes us several steps further. AI, when used cleverly, might help us to decipher how human minds work. We could shift from our existing black-box approach to understanding mental health and reveal the inner workings that cause mental health issues. As eloquently stated by Dr. Insel, AI could be for mental health what DNA has been for cancer.

We are clearly amid a widespread disruption and transformation of mental health care, and AI is an amazing and exciting catalyst driving us toward a mental health care future that we get to define. Let's all use our initiative and our energies to define and guide the coming AI adoption to fruition as a benefit to us all.
