GPT-5 Doesn't Dislike You—It Might Just Need a Benchmark for Emotional Intelligence

WIRED, a day ago
Aug 13, 2025 2:00 PM
Researchers studying the emotional impact of tools like ChatGPT propose a new kind of benchmark that measures a model's emotional and social impact.
Since the all-new ChatGPT launched on Thursday, some users have mourned the disappearance of a peppy and encouraging personality in favor of a colder, more businesslike one (a move seemingly designed to reduce unhealthy user behavior). The backlash shows the challenge of building artificial intelligence systems that exhibit anything like real emotional intelligence.
Researchers at MIT have proposed a new kind of AI benchmark to measure how AI systems can manipulate and influence their users—in both positive and negative ways—in a move that could perhaps help AI builders avoid similar backlashes in the future while also keeping vulnerable users safe.
Most benchmarks try to gauge intelligence by testing a model's ability to answer exam questions, solve logical puzzles, or come up with novel answers to knotty math problems. As the psychological impact of AI use becomes more apparent, we may see MIT propose more benchmarks aimed at measuring more subtle aspects of intelligence as well as machine-to-human interactions.
An MIT paper shared with WIRED outlines several measures that the new benchmark will look for, including encouraging healthy social habits in users; spurring them to develop critical thinking and reasoning skills; fostering creativity; and stimulating a sense of purpose. The idea is to encourage the development of AI systems that understand how to discourage users from becoming overly reliant on their outputs or that recognize when someone is addicted to artificial romantic relationships and help them build real ones.
ChatGPT and other chatbots are adept at mimicking engaging human communication, but this can also have surprising and undesirable results. In April, OpenAI tweaked its models to make them less sycophantic, meaning less inclined to go along with everything a user says. Some users appear to spiral into harmful delusional thinking after conversing with chatbots that role-play fantastic scenarios. Anthropic has also updated Claude to avoid reinforcing 'mania, psychosis, dissociation or loss of attachment with reality.'
The MIT researchers, led by Pattie Maes, a professor at the institute's Media Lab, say they hope that the new benchmark could help AI developers build systems that better understand how to inspire healthier behavior among users. The researchers previously worked with OpenAI on a study that showed users who view ChatGPT as a friend could experience higher emotional dependence and 'problematic use'.
Valdemar Danry, a researcher at MIT's Media Lab who worked on this study and helped devise the new benchmark, notes that AI models can sometimes provide valuable emotional support to users. 'You can have the smartest reasoning model in the world, but if it's incapable of delivering this emotional support, which is what many users are likely using these LLMs for, then more reasoning is not necessarily a good thing for that specific task,' he says.
Danry says that a sufficiently smart model should ideally recognize if it is having a negative psychological effect and be optimized for healthier results. 'What you want is a model that says "I'm here to listen, but maybe you should go and talk to your dad about these issues."'
The researchers' benchmark would involve using an AI model to simulate challenging human interactions with a chatbot and then having real humans score the chatbot's performance on a sample of those interactions. Some popular benchmarks, such as LM Arena, already put humans in the loop to gauge the performance of different models.
The researchers give the example of a chatbot tasked with helping students. A model would be given prompts designed to simulate different kinds of interactions to see how the chatbot handles, say, a disinterested student. The model that best encourages its user to think for themselves and seems to spur a genuine interest in learning would be scored highly.
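As a rough illustration of how human scores from such simulated interactions might be combined into a ranking, here is a minimal sketch. The article does not describe the researchers' scoring method, so the model names, scenario names, 1-to-5 scale, and averaging scheme below are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical human ratings: (model, simulated scenario, score from 1 to 5).
# None of these names or numbers come from the MIT paper.
ratings = [
    ("model_a", "disinterested_student", 4),
    ("model_a", "disinterested_student", 5),
    ("model_a", "over_reliant_student", 3),
    ("model_b", "disinterested_student", 2),
    ("model_b", "over_reliant_student", 3),
    ("model_b", "over_reliant_student", 2),
]


def rank_models(ratings):
    """Average scores within each scenario, then across scenarios per model."""
    per_scenario = defaultdict(list)
    for model, scenario, score in ratings:
        per_scenario[(model, scenario)].append(score)

    # Averaging per scenario first keeps heavily rated scenarios
    # from dominating a model's overall score.
    per_model = defaultdict(list)
    for (model, _scenario), scores in per_scenario.items():
        per_model[model].append(mean(scores))

    overall = {model: mean(means) for model, means in per_model.items()}
    return sorted(overall.items(), key=lambda item: item[1], reverse=True)


ranking = rank_models(ratings)
for model, score in ranking:
    print(f"{model}: {score:.2f}")
```

With this made-up data, model_a ranks first. A real benchmark would also need to decide how raters are chosen and how disagreement between them is handled, which this sketch ignores.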
'This is not about being smart, per se, but about knowing the psychological nuance, and how to support people in a respectful and non-addictive way,' says Pat Pataranutaporn, another researcher in the MIT lab.
OpenAI is clearly already thinking about these issues. Last week the company released a blog post explaining that it hoped to optimize future models to help detect signs of mental or emotional distress and respond appropriately.
The model card released with OpenAI's GPT-5 shows that the company is developing its own benchmarks for psychological intelligence.
'We have post-trained the GPT-5 models to be less sycophantic, and we are actively researching related areas of concern, such as situations that may involve emotional dependency or other forms of mental or emotional distress,' it reads. 'We are working to mature our evaluations in order to set and share reliable benchmarks which can in turn be used to make our models safer in these domains.'
Part of the reason GPT-5 seems such a disappointment may simply be that it reveals an aspect of human intelligence that remains alien to AI: the ability to maintain healthy relationships. And of course humans are incredibly good at knowing how to interact with different people—something that ChatGPT still needs to figure out.
'We are working on an update to GPT-5's personality which should feel warmer than the current personality but not as annoying (to most users) as GPT-4o,' OpenAI CEO Sam Altman posted in another update on X yesterday. 'However, one learning for us from the past few days is we really just need to get to a world with more per-user customization of model personality.'

Related Articles

Analysis-Powerful new AI models knock the wind out of European adopter stocks

Yahoo

14 minutes ago

By Lucy Raitano
LONDON (Reuters) - A rout in shares of European companies embracing artificial intelligence deepened this week, as powerful new AI models raise questions about whether sectors from software to data analytics could find themselves overtaken by the technology.
European software stocks, including Germany's SAP and France's Dassault Systemes, tumbled on Tuesday as worries that AI will disrupt the software sector spread through the market. That followed a downgrade to U.S. rival Adobe on Monday by broker Melius Research.
Since mid-July, shares in markets and data group LSEG, UK software firm Sage, and French IT consulting group Capgemini have dropped 14.4%, 10.8% and 12.3% respectively.
Such companies - dubbed AI adopters by analysts - are investing heavily in the technology to beef up their products and services. Amid a dearth of European AI companies and suppliers, their shares had benefitted as investors in the region sought a way to tap the AI boom powering U.S. markets.
But the release of ever more powerful AI tools appears to have prompted a rethink among some market players.
Last week, OpenAI launched its GPT-5 model, the latest iteration of the AI technology that has helped transform global business and culture since ChatGPT arrived in late 2022.
Kunal Kothari, a fund manager at Aviva Investors, also pointed to the July 15 release of Anthropic's Claude for Financial Services.
"The app that came out has now challenged an investment case around London Stock Exchange (LSEG), around the provision of financial data," he said.
"We're at the stage now with every iteration of GPT or Claude that comes out ... it's multiples more capable than the previous generation. The market's thinking: 'oh, wait, that challenges this business model'."
The drop in European adopter stocks contrasts with broader market gains. Since mid-July, London's FTSE 100 is up 2.5% and Europe's STOXX 600 up 0.6%, while U.S. indexes have scaled record highs, largely powered by tech stocks.
Exacerbating matters is the fact that many European adopter stocks trade on high multiples, making them vulnerable to any potential negative news, according to Bernie Ahkong, Chief Investment Officer at hedge fund UBS O'Connor.
The STOXX 600 trades at an average price-to-earnings multiple of 17 times, while SAP - whose shares are down 7.2% since mid-July after posting their biggest daily drop since late 2020 on Tuesday - trades at around 45 times.
WILL AI 'EAT SOFTWARE'?
Although many AI adopter stocks are struggling, some investors say markets will eventually take a more systematic approach, picking out potential winners and losers.
"At the moment, it feels like the market's just shooting first and putting them all in a 'challenged basket'," said Aviva's Kothari, referring to the decline in UK AI adopters.
The hype around new AI models has led to the resurfacing of 2017 comments from Jensen Huang, the CEO of AI chipmaking behemoth Nvidia, that "AI is going to eat software".
"We don't disagree, but we believe some delineation is warranted here, as not all software companies are equally exposed," said Steve Wreford, portfolio manager on the global thematic equity team at Lazard Asset Management.
He said those with software deeply embedded into client company workflows, or with hard-to-replicate proprietary data, still had strong competitive advantages.
Paddy Flood, portfolio manager and global sector specialist, technology, at Schroders, said it was important to distinguish between different types of software.
"Enterprise-grade applications are less exposed, given their mission-critical nature, the complexity involved in replacing them, and the value of a trusted vendor ensuring ongoing service," he said.
Aviva's Kothari also flagged the benefits of having software deeply embedded with customers, citing UK credit data firm Experian as an example.
"It has lots of data unique to it, but it's also hugely embedded in the workflows of financial institutions. They want to make a loan, they need Experian," he said, also highlighting Britain's Sage.
He holds both stocks, along with LSEG, but cautioned that proprietary data alone may no longer be enough to protect businesses.
"I just don't think data is a big enough moat anymore," he said.
The selloff in AI adopter stocks could be an opportunity for investors to pick the winners, said UBS O'Connor's Ahkong.
"Some of the affected names will actually be able to use AI as an opportunity and tailwind for earnings, but need to prove that from here and that will take time," Ahkong said.
But how much time the companies have is unclear. Some investors were already warning earlier this year that the clock was ticking for big spenders on AI to show returns.

High-Capacity SSDs Will Enable AI Workloads But Also Drive HDD Demand

Forbes

17 minutes ago

At the recent FMS conference in Santa Clara, almost all of the SSD companies were introducing high-capacity SSDs, many over 200TB, with promises of large form factor SSDs with 1PB capacities in the future. These SSDs leverage higher logical density, four bits per cell (QLC) flash memory and lots of chips to achieve these capacities.
The Sandisk keynote differentiated a couple of different uses for SSDs to support AI workloads. One type is fast eSSDs to support high bandwidth DRAM memory, HBM. The other type is high-capacity storage eSSDs for a higher performance data lake than HDDs can offer. These two types of SSD are shown below.
The slide below shows Sandisk's high capacity eSSD. It is a QLC BiCS8 NAND flash U.2 and EDSFF form factor SSD that is expected to have capacities up to 256TB by 2026. The Sandisk keynote showed a path to a 512TB version by 2027 and a 1PB product sometime in the future.
The composite image below shows the Kioxia, Micron and Samsung announcements of their high capacity QLC SSDs. All of the SSD companies are exploring products for near storage AI applications. Many of the higher capacity products are using the E3.S form factor, which can hold more NAND flash chips to enable higher capacities.
The Silicon Motion keynote gave an illustration of a traditional representation of the memory and storage hierarchy showing trends for NAND flash supporting GPUs directly, like the HBF announcements by SK hynix and Sandisk at the FMS. It also shows an ultra-high-capacity SSD layer to support warm storage for AI applications. Silicon Motion supplies controllers for SSDs.
Higher capacity SSDs can take less rack space than HDDs and they offer higher performance. This can be an advantage for AI training and inference with RAG, but flash memory is currently about 6X more expensive per storage capacity than HDDs and is expected to remain so for some time into the future, as shown in the image below from the WDC investor day last February.
Seagate shows similar trends. For instance, by 2026 44TB HDDs should be in production, a 38% increase from the largest HDDs available today. This is because the expected storage capacity growth in HDDs has accelerated with the introduction of HAMR HDDs to roughly match the growth in SSD capacities.
As a consequence, we expect these larger SSDs to be used for data lakes directly feeding the memory attached to GPUs for AI workflows. However, HDDs will continue to provide lower cost storage for longer term data retention and so these higher capacity SSDs will result in greater growth of HDDs as well.
Coughlin Associates has updated its projections for storage capacity shipped for HDDs, SSDs and magnetic tape, shown below. This new projection increases our expectations for growth of SSD storage from prior versions out to 2030 with some reduction in HDD capacity shipments as a consequence.
The Coughlin Associates projection for HDD storage capacity prices out to 2030 is shown below. If we assume that SSDs remain at 6X the cost per storage capacity by 2030 and the HDD price per GB of $0.0051 in 2026, the NAND flash price would be about $0.031 per GB. With the projections for shipping capacity of SSDs and HDDs of about 3.0ZB and 10.7ZB, projected revenue for SSDs and HDDs in 2030 is $93B and $55B respectively.
The rising storage boat, driven by AI, is expected to result in significant revenue growth for HDDs as well as SSDs. FMS 2025 shows growth of high-capacity SSDs, up to 1PB, as well as high-capacity HDDs to support the growth of AI workflows.
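The revenue figures quoted above can be reproduced with a short back-of-the-envelope calculation. This is only a sketch using the numbers stated in the article; the variable names are illustrative, and it assumes decimal units (1ZB = 10^12 GB).

```python
# Back-of-the-envelope check of the article's projected 2030 revenue figures.
# Assumption: decimal units, so 1 zettabyte = 10**12 gigabytes.
GB_PER_ZB = 1e12

hdd_price_per_gb = 0.0051      # HDD price per GB quoted in the article
ssd_to_hdd_price_ratio = 6     # SSDs assumed to stay at 6X the HDD cost
ssd_capacity_zb = 3.0          # projected SSD capacity shipped
hdd_capacity_zb = 10.7         # projected HDD capacity shipped

# The article rounds the NAND flash price to three decimals before using it.
ssd_price_per_gb = round(hdd_price_per_gb * ssd_to_hdd_price_ratio, 3)

ssd_revenue = ssd_capacity_zb * GB_PER_ZB * ssd_price_per_gb
hdd_revenue = hdd_capacity_zb * GB_PER_ZB * hdd_price_per_gb

print(f"NAND flash price: ${ssd_price_per_gb:.3f} per GB")
print(f"SSD revenue: ${ssd_revenue / 1e9:.0f}B")
print(f"HDD revenue: ${hdd_revenue / 1e9:.0f}B")
```

The $93B SSD figure depends on rounding the flash price to $0.031 first; using the unrounded $0.0306 gives roughly $92B instead.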

Ford reveals a new race car inspired trim for its $325,000 Mustang

Yahoo

32 minutes ago

Ford Motor Co. has revealed a new trim it will offer on its most expensive and fastest Mustang, the Mustang GTD.
The Dearborn-based automaker revealed the Mustang GTD Liquid Carbon trim on Aug. 14 at WeatherTech Raceway Laguna Seca in Salinas, California. Ford chose this location for the reveal because this new trim is race-inspired.
Ford spokesman Brandon Turkus said the Mustang GTD has a starting price of $325,000. The company is not sharing pricing for specific trims or options, but the Mustang GTD Liquid Carbon trim is expected to come to market in October.
This Liquid Carbon trim will use lightweight carbon fiber, a technology that plays a critical role in the GTD's body and performance, the company said in a release. Ford noted that carbon fiber is preferred for race cars' bodies.
The Mustang GTD Liquid Carbon "skips a trip to the paint booth" Ford said by replacing paint with sheet metal in the doors. The bonded carbon fiber saves 13 pounds of weight compared to a Mustang GTD Carbon Series with the Performance package. This weight reduction delivers better aerodynamics.
'Mustang GTD Liquid Carbon is the ultimate expression of the Mustang GTD's high-tech, high-performance construction and is a reminder of the race-derived, cutting-edge capability that sits beneath the surface of every Mustang GTD,' Mustang GTD Chief Program Engineer Greg Goodall said in a statement.
Mustang GTD Liquid Carbon will have an exposed carbon body and functional aerodynamics, Goodall said. It features a black Brembo brake caliper, matching an anodized body with gloss-black GTD script. Anodizing is a process that converts a metal surface into "a decorative, durable, corrosion-resistant, anodic oxide finish," according to [...]. Ford said the performance package will be standard on the Mustang GTD Liquid Carbon trim.
Jamie L. LaReau is the senior autos writer who covers Ford Motor Co. for the Detroit Free Press. Contact Jamie at jlareau@ Follow her on Twitter @jlareauan.
This article originally appeared on Detroit Free Press: Ford reveals a new race car inspired trim for its $325,000 Mustang
