Human vs AI: Who's Better at Cognitive Behavioral Therapy

Medscape, 23-05-2025

LOS ANGELES — Artificial intelligence (AI) falls short of human therapists when it comes to empathy and emotional connection in the delivery of cognitive behavioral therapy (CBT), initial results of a new pilot study suggested.
However, the results showed that AI performed well in providing a structured therapeutic approach.
'While AI may offer structured CBT components and serve as a supplementary or triage tool, it lacks the nuance and flexibility to serve as a stand-alone therapy,' study investigator Esha Aneja, a fourth-year medical student at California Northstate College of Medicine, Elk Grove, California, told Medscape Medical News.
'Physicians and therapists should view AI as a potential adjunct, not an alternative,' she said. 'Human oversight, ethical safeguards, and empathy remain essential to safe and effective mental health care.'
The findings were presented on May 17 at the American Psychiatric Association (APA) 2025 Annual Meeting.
Demand for CBT Outstripping Supply
Currently, there aren't enough psychiatric professionals in the United States — or globally — to meet the growing demand for CBT.
Patients frequently face delays in accessing care, so more are turning to AI tools like ChatGPT to address their mental health needs, said Aneja.
However, she noted in her presentation that large language model (LLM)–based AI chatbots for text-based therapy are still largely theoretical in psychiatric literature.
While LLMs have been integrated into electronic health records for diagnostic purposes, the ability of AI to execute CBT remains understudied.
The goal of the study was to compare the effectiveness of CBT delivered by a human therapist with that of CBT delivered by AI.
Using the Cognitive Therapy Rating Scale (CTRS), experts familiar with CBT principles compared a human therapist with an AI model (ChatGPT-3.5), each responding to a third-party patient presenting with a specific mental health concern.
CTRS is a gold-standard observational tool for assessing the quality and fidelity of CBT sessions. It evaluates multiple domains, each rated on a 0-6 scale, with higher scores reflecting more skilled therapeutic delivery.
Both the human therapist, who conducted the session over Zoom, and the AI therapist, ChatGPT-3.5 (the most current version at the time), interacted with the patient solely via text chat. Reviewers received transcripts of each session but were blinded to whether the responses came from a human or AI.
The study surveyed 75 reviewers to compare the quality of human-based and ChatGPT-3.5-based interactions with patients. Participants included medical students, social work students, psychiatric residents, and board-certified psychiatrists.
Humans Win the Day
The human therapist outperformed ChatGPT-3.5 across all domains. Areas where the differences in mean CTRS scores (human vs AI) were statistically significant included feedback (4.48 vs 3.03), collaboration (4.91 vs 3.84), pacing (4.60 vs 3.67), and guided discovery (4.35 vs 3.45), as well as 'focus on key cognitive behaviors' and 'application of CBT techniques' (P = .001 for all).
Areas where the ratings were similar between the two groups included agenda setting, understanding, interpersonal effectiveness, and strategies for change.
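To put the reported gaps in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration, not part of the study: it uses three of the domain means reported above (feedback, collaboration, and pacing), and the names `reported_means` and `score_gaps` are hypothetical, introduced here for the example.

```python
# Mean CTRS scores (0-6 scale) as reported in the article: (human, ChatGPT-3.5).
# Only three of the reported domains are included in this illustration.
reported_means = {
    "feedback": (4.48, 3.03),
    "collaboration": (4.91, 3.84),
    "pacing": (4.60, 3.67),
}


def score_gaps(means):
    """Return the human-minus-AI gap per domain, largest gap first."""
    gaps = {domain: round(human - ai, 2) for domain, (human, ai) in means.items()}
    return dict(sorted(gaps.items(), key=lambda item: item[1], reverse=True))


gaps = score_gaps(reported_means)
for domain, gap in gaps.items():
    print(f"{domain}: {gap:+.2f} points in favor of the human therapist")
```

On these figures, the widest gap is in feedback (about 1.45 points on the 6-point scale) and the narrowest is in pacing (about 0.93 points).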
When it came to therapeutic approach and empathy, respondents disagreed on whether the human therapist demonstrated enough empathy, Aneja reported.
'Some praised their warmth and responsiveness, while others felt the therapist focused too much on technique and missed emotional cues,' she said. 'In contrast, AI was more uniformly described as "robotic" or "surface-level" in its empathy, with little variation.'
While AI may become 'cognitively empathetic' in the future and therefore able to respond more appropriately, 'emotional or embodied empathy, the kind that comes from shared human experience, is beyond its current capabilities,' said Aneja.
Even in areas better suited to AI, such as structure and agenda setting, respondents felt AI was 'too wordy' and 'robotic' and included 'a lot of lecturing,' she added. They also noted that AI did not personalize its recommendations or tailor its approach to the patient's level of understanding.
While the researchers suspected AI might fall short, this new study 'quantifies and contextualizes those limitations in a real-world CBT framework,' said Aneja.
AI could 'definitely' be used as a screening tool in psychiatry, particularly when patients can't get to see a provider in a timely manner, she said. It could 'look for things like suicidality or situations where urgent attention is important.'
However, therapists should keep the tool's limitations in mind, especially the empathy component, she added.
Weighing in on these results, Howard Liu, MD, chair of the Department of Psychiatry at the University of Nebraska Medical Center, Omaha, Nebraska, and chair of the APA Council on Communications, Washington, DC, called the study 'fascinating,' especially with the backdrop of psychiatrist shortages across the country.
However, he stressed the importance of informing patients when using AI. 'Different health systems have different policies about whether you can, in fact, feed in protected health information into these systems,' he pointed out.
Philip R. Muskin, MD, professor of psychiatry at the Columbia University Irving Medical Center, New York City, said he was not surprised by the findings overall or the comments about the 'lecture-like quality' of the AI 'therapist.'
'Human responses vary, even when rigidly following a CBT agenda,' he told Medscape Medical News.
'Reading about therapy, which is essentially what the AI software does, isn't comparable to a therapist who has read training materials but has incorporated the information through human interaction.'
