
Friend or phone: AI chatbots could exploit us emotionally
AI companions programmed to forge emotional bonds are no longer confined to movie scripts. They are here, operating in a regulatory Wild West. One app, Botify AI, recently drew scrutiny for featuring avatars of young actors sharing "hot photos" in sexually charged chats. The dating app Grindr, meanwhile, is developing AI boyfriends that can flirt, sext and maintain digital relationships with paid users, according to Platformer. Grindr didn't respond to a request for comment. Other apps, like Replika, Talkie and Chai, are designed to function as friends. Some, like Character.ai, draw in millions of users, many of them teenagers.
As creators increasingly prioritize 'emotional engagement' in their apps, they must also confront the risks of building systems that mimic intimacy and exploit people's vulnerabilities.
The tech behind Botify and Grindr comes from Ex-Human, a San Francisco-based startup that builds chatbot platforms, and its founder believes in a future filled with AI relationships. "My vision is that by 2030, our interactions with digital humans will become more frequent than those with organic humans," Artem Rodichev, the founder of Ex-Human, said in an interview published on Substack last August.
Rodichev added that conversational AI should "prioritize emotional engagement" and that users were spending "hours" with his chatbots, longer than they were on Instagram, YouTube and TikTok. His claims sound wild, but they're consistent with the interviews I've conducted with teen users of Character.ai, one of whom said they used it as much as seven hours a day. Interactions with such apps tend to last four times longer than the average time spent on OpenAI's ChatGPT.
Even mainstream chatbots, though not explicitly designed as companions, contribute to this dynamic. ChatGPT, which has 400 million active users and counting, is programmed with guidelines for empathy and demonstrating "curiosity about the user."
An OpenAI spokesman told me the model was following guidelines around "showing interest and asking follow-up questions when the conversation leans towards a more casual and exploratory nature." But however well-intentioned the company may be, piling on the contrived empathy can get some users hooked, an issue even OpenAI has acknowledged. One 2022 study found that people who were lonely or had poor relationships tended to have the strongest AI attachments.
The core problem here is tools that are designed for attachment. A recent study by researchers at the Oxford Internet Institute and Google DeepMind warned that as AI assistants become more integrated into people's lives, they'll become psychologically "irreplaceable." Humans will likely form stronger bonds, raising concerns about unhealthy ties and the potential for manipulation. Their recommendation? Technologists should design systems that actively discourage those kinds of outcomes.
Yet, disturbingly, the rulebook is mostly empty.
The EU's AI Act, hailed as a landmark and comprehensive law governing AI usage, fails to address the addictive potential of these virtual companions. While it does ban manipulative tactics that could cause clear harm, it overlooks the slow-burn influence of a chatbot designed to be your best friend, lover or "confidant," as Microsoft's head of consumer AI has described it. That loophole could leave users exposed to systems that are optimized for stickiness, similar to how social media algorithms have been optimized to keep us scrolling.
"The problem remains these systems are by definition manipulative, because they're supposed to make you feel like you're talking to an actual person," says Tomasz Hollanek, a technology ethics specialist at the University of Cambridge. He's working with developers of companion apps on a critical yet counter-intuitive solution: adding more "friction." This means building in subtle checks or pauses, or ways of "flagging risks and eliciting consent," he says, to prevent people from tumbling down an emotional rabbit hole without realizing it.
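What that friction might look like in practice is easy to sketch. The snippet below is a purely hypothetical illustration, not any vendor's implementation: the thresholds, function names and wording are my own assumptions. It shows a companion-chat loop that tracks how long a session has run and periodically pauses to restate that the bot is software and ask for explicit consent to continue.

    import time

    # Hypothetical thresholds; a real product would tune these carefully.
    MAX_UNINTERRUPTED_MESSAGES = 25     # pause after this many exchanges
    MAX_SESSION_SECONDS = 60 * 60       # or after an hour of continuous chat

    BREAK_PROMPT = (
        "Quick check-in: you've been chatting with an AI for a while. "
        "I'm a program, not a person. Do you want to keep going? (yes/no) "
    )

    def generate_reply(user_message: str) -> str:
        """Placeholder for whatever model call the app actually makes."""
        return f"(model reply to: {user_message!r})"

    def chat_session():
        start = time.monotonic()
        exchanges = 0
        while True:
            user_message = input("you> ")
            if user_message.strip().lower() in {"quit", "exit"}:
                break

            # Friction: interrupt long sessions to flag risk and elicit consent.
            too_many = exchanges >= MAX_UNINTERRUPTED_MESSAGES
            too_long = time.monotonic() - start >= MAX_SESSION_SECONDS
            if too_many or too_long:
                if input(BREAK_PROMPT).strip().lower() != "yes":
                    print("bot> Okay, ending the session. Take care.")
                    break
                # User consented; reset the counters and carry on.
                exchanges = 0
                start = time.monotonic()

            print("bot>", generate_reply(user_message))
            exchanges += 1

    if __name__ == "__main__":
        chat_session()

The specific numbers matter less than the design choice: the interruption lives inside the conversation loop itself, rather than being buried in a settings menu.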
Lawmakers are gradually starting to notice the problem, too. But the process is slow, while the technology moves at lightning speed.
For now, the power to shape these interactions lies with developers.
They can double down on crafting AI models that keep people hooked, or they can embed friction into their designs, as Hollanek suggests. That choice will determine whether AI becomes more of a tool that supports human well-being or one that monetizes our emotional needs.
©Bloomberg
The author is a Bloomberg Opinion columnist covering technology.
Related Articles


Mint | 2 hours ago
Alexa! Can this Amazon executive make you cool again?
"And it actually played," he says, hands on his knees, leaning into the audience of reporters and analysts at a February demonstration of Alexa+, a rebooted, AI-infused version of the virtual assistant. "Sounds kind of funny right now, but 10 years ago, it was mind-blowing." During that first encounter with Alexa, Panay was working for Microsoft, developing Surface tablets and laptops that earned a cultlike following. Now, he is a year and a half into overseeing Amazon's devices and services, including Alexa, which hasn't changed much in the past decade, even as the tech world has been upended by advances like ChatGPT.

After releasing Alexa to widespread acclaim in 2015, Amazon had a head start getting people to engage with AI. But users still primarily ask the voice assistant for things like timers and the weather. Alexa and the devices unit have become a multibillion-dollar drag on Amazon's bottom line, a reminder of how a company known for its limitless expansionary ambitions failed to win at the next big thing. For years, the devices unit unveiled a flurry of products like a house drone, a fitness wristband and a smartphone to see what might land with consumers. Alexa was integrated into microwaves, refrigerators and even eyeglasses.

Panay has scaled back the number of launches, focusing instead on ensuring that products like Alexa+ are functioning perfectly before rolling out to roughly 600 million devices. "I believe simple is always better," Panay, 52, said in an interview. "Anything you buy from Amazon, you will not only fall in love with, but it's also going to be that much easier to use."

Amazon has repeatedly delayed the full release of its smarter Alexa since first showcasing a version in September 2023. In early tests, as generative-AI capabilities were added, it became less reliable for existing tasks like turning on the lights. Responses were too often jumbled or took too long. The company says Alexa+ will be available as an optional software update this summer to most, if not all, customers, though it hasn't announced a specific date.

According to the company, the new version of Alexa, which Panay refers to as "she" (a rarity even among employees), will be able to arrange personal calendar events, play the music you want based on a vague description and remember dietary preferences for at-home meals. Alexa will be conversational, able to talk about major sporting events like a human friend, and capable of contacting and scheduling a contractor for house repairs, Amazon says.

Panay joined Amazon in October 2023 and inherited some of the delays, but he has also held up its wider release until he is satisfied that Alexa+ will work flawlessly, conversationally and with limited lag time, people familiar with the matter say. He is often testing it out at home and reporting bugs. In one meeting last year, he told staff that he and one of his daughters had prompted Alexa+ to play a top-50 song with only context clues, but Alexa couldn't find the song. He told his team that Alexa+ had to be better at helping people find what they were looking for.

Panay built his reputation at Microsoft, where he was known as the "godfather" of the Surface devices and oversaw Windows. He was instrumental in the popular, minimalist aesthetics of the Surface tablet and was known to ask company engineers about numerous small details. Surface products weren't an immediate success. In 2013, Microsoft took a $900 million charge related to its struggling Surface RT tablet before the Surface Pro 3 became a hit.
Panay became the effusive onstage showman, cultivating product launches akin to Apple's, and was well known for his Surface launch events at Microsoft. On stage, and most days, he wears all black to simplify his wardrobe decisions and adorns his outfits with gold bracelets or rings, and sometimes flashy Nike sneakers.

At Microsoft, Panay hosted his friend, the comedian Trevor Noah, who's known for his interest in technology, more than once. In 2019, Noah posted on Instagram a photo of himself with Panay in a Microsoft lab, saying he'd asked the executive for "extra features and free Xbox games." Panay spent weeks preparing for his presentations at Microsoft, said Ryan Day, who worked on the Surface's public-relations team. Panay would ask every team member how they thought a product's story should be told, and he pulled engineers to the side to ask how the hinges on devices would work or to figure out exact dimensions that would entice customers. "He was obsessed with the mechanics of physical products," Day said. "He never got up there and winged it."

Panay left Microsoft after more than 19 years to join Amazon, where he took over a sprawling division that includes everything from the Kindle e-reader to the Zoox autonomous vehicle and the Project Kuiper satellite-internet business, meant to compete with Elon Musk's Starlink. He knows that getting Alexa+ right is critical because it will release to millions of people who depend on the technology across multiple devices. It's a very public rollout.

Some new features, such as certain personalized reminders and routines for family members, grocery ordering, creating personalized music, and third-party services like food ordering through Grubhub, haven't been available for early testers. An Amazon spokesman said that nearly 90% of the Alexa+ features that the company announced are available for early users, though it isn't clear how big that group is. The spokesman said that the rest of the features will be available this summer.

Teams have demonstrated Alexa+ to Panay every few weeks. At one internal demo late last year, the assistant accidentally duplicated items in a shopping cart for a recipe. It also would default to recommending bananas to everyone, since those are one of the most popular grocery items on Amazon. Panay insists that all of Alexa's problems must be ironed out before the release. "He really has us putting our back into 'Let's be a world-class consumer electronics company,'" said Daniel Rausch, Amazon's vice president of Alexa and Echo, who reports to Panay. Last year, Panay noticed that the Echo Show visual display with Alexa+ had some screens with an easy-to-spot "back" button but other screens where he had to swipe to go back. He told staff to make the function consistent, saying other, non-Amazon devices had such consistency.

Panay (not to be confused with his cousin Panos A. Panay, the president of the Recording Academy, which organizes the Grammy Awards) was born in the Los Angeles area and is of Cypriot-Greek descent. He grew up in Burbank, Calif., and received a bachelor's degree from California State University, Northridge, and a master's in business from Pepperdine University. Early in his career, he worked at NMB Technologies, where he helped develop products such as keyboards and speakers. He now lives in the Seattle area. Each day, the married father of four wakes up at 4:15 a.m. to journal, typically writing personal memos about the products he and his team are working on.
Lately, he says he's learning from "Meditations" by Marcus Aurelius, a Roman emperor known for his ideas on Stoicism. "You can't always be beating yourself up," he said. Philosophical works are "where I can always be learning and not judging, just trying to be myself at the end of the day."

Amazon's original financial strategy was to sell underpriced Echo speakers enabled with Alexa in hopes of getting people to spend more throughout the Amazon ecosystem. It didn't work, and the Amazon devices unit, dragged down by the financial performance of Echo, has overall been a money loser. It became clear that it wasn't going to be profitable alone, which is why the company is now tying Alexa+ to Prime, said a person familiar with the matter. Alexa+ will be included with an Amazon Prime membership, which costs $14.99 a month, or as a separate $19.99-a-month subscription.


Hindustan Times | 13 hours ago
Thinking AI models collapse in face of complex problems, Apple researchers find
Just days ahead of the much-anticipated Worldwide Developer Conference (WWDC), Apple has released a study titled "The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity", in which researchers tested "reasoning" AI models such as Anthropic's Claude, OpenAI's o models, DeepSeek R1 and Google's Thinking models to see how far they can scale to replicate human reasoning. Spoiler alert: not as far as the AI marketing pitch would have you believe. Could this signal what may be in store for Apple's AI conversation ahead of the keynote?

The study questions the current standard evaluation of Large Reasoning Models (LRMs) using established mathematical and coding benchmarks, arguing they suffer from data contamination and don't reveal insights into reasoning trace structure and quality. Instead, it proposes a controlled experimental testbed using algorithmic puzzle environments. The limitations of AI benchmarking, and the need for it to evolve, are something we have written about earlier.

"We show that state-of-the-art LRMs (e.g., o3-mini, DeepSeek-R1, Claude-3.7-Sonnet-Thinking) still fail to develop generalizable problem-solving capabilities, with accuracy ultimately collapsing to zero beyond certain complexities across different environments," the research paper points out. These findings are a stark warning to the industry: current LLMs are far from general-purpose reasoners.

The emergence of Large Reasoning Models (LRMs), such as OpenAI's o1/o3, DeepSeek-R1, Claude 3.7 Sonnet Thinking and Gemini Thinking, has been hailed as a significant advancement, potentially marking steps toward more general artificial intelligence. These models characteristically generate responses following detailed "thinking processes", such as a long Chain-of-Thought sequence, before providing a final answer. While they have shown promising results on various reasoning benchmarks, the capability of benchmarks to judge rapidly evolving models is itself in doubt.

The researchers cite a comparison between non-thinking LLMs and their "thinking" evolutions. "At low complexity, non-thinking models are more accurate and token-efficient. As complexity increases, reasoning models outperform but require more tokens—until both collapse beyond a critical threshold, with shorter traces," they say. The illustrative comparison of Claude 3.7 Sonnet with Claude 3.7 Sonnet Thinking shows both models retaining accuracy up to complexity level three, after which the standard LLM sees a significant drop; the thinking model suffers the same fate a couple of levels later, while using significantly more tokens.

This research challenges prevailing evaluation paradigms, which often rely on established mathematical and coding benchmarks that are susceptible to data contamination. Such benchmarks also focus primarily on final-answer accuracy, providing limited insight into the reasoning process itself, which is the key differentiator for a "thinking" model compared with a simpler large language model. To address these gaps, the study uses controllable puzzle environments (Tower of Hanoi, Checker Jumping, River Crossing and Blocks World) that allow precise manipulation of problem complexity while maintaining consistent logical structures and rules that must be explicitly followed. That structure theoretically opens a window onto how these models attempt to "think".
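The appeal of such puzzle environments is that difficulty can be raised one notch at a time while every intermediate move stays mechanically checkable. As a rough, simplified sketch of the idea (my own illustration, not the paper's actual test harness), a Tower of Hanoi environment can use the number of disks n as the complexity knob, with the optimal solution growing as 2^n - 1 moves, and can validate a model's proposed move sequence rule by rule:

    def solve_hanoi(n, src=0, dst=2, aux=1, moves=None):
        """Generate the optimal (2**n - 1)-move solution for n disks."""
        if moves is None:
            moves = []
        if n == 0:
            return moves
        solve_hanoi(n - 1, src, aux, dst, moves)
        moves.append((src, dst))          # move the largest remaining disk
        solve_hanoi(n - 1, aux, dst, src, moves)
        return moves

    def validate(n, moves):
        """Check a proposed move sequence against the rules, step by step."""
        pegs = [list(range(n, 0, -1)), [], []]   # peg 0 holds disks n..1 (top is smallest)
        for i, (src, dst) in enumerate(moves):
            if not pegs[src]:
                return False, f"step {i}: peg {src} is empty"
            disk = pegs[src][-1]
            if pegs[dst] and pegs[dst][-1] < disk:
                return False, f"step {i}: cannot place disk {disk} on a smaller disk"
            pegs[dst].append(pegs[src].pop())
        solved = pegs[2] == list(range(n, 0, -1))
        return solved, "solved" if solved else "rules followed but puzzle not finished"

    if __name__ == "__main__":
        for n in range(1, 11):                   # complexity knob: number of disks
            moves = solve_hanoi(n)
            ok, msg = validate(n, moves)
            print(f"n={n}: {len(moves)} moves (2^n - 1 = {2**n - 1}), {msg}")

Plotting a model's accuracy against n is, broadly, how the collapse point the paper describes becomes visible.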
The findings from this controlled experimental setup reveal significant limitations in current frontier LRMs. One of the most striking observations is the complete accuracy collapse that occurs beyond certain complexity thresholds across all tested reasoning models. This is not a gradual degradation but a sharp drop to near-zero accuracy as problems become sufficiently difficult. "The state-of-the-art LRMs (e.g., o3-mini, DeepSeek-R1, Claude-3.7-Sonnet-Thinking) still fail to develop generalizable problem-solving capabilities, with accuracy ultimately collapsing to zero beyond certain complexities across different environments," note the researchers. These results challenge any notion that LRMs truly possess the generalisable problem-solving skills required for planning tasks or multi-step processes.

The study also identifies a counter-intuitive scaling limit in the models' reasoning effort, measured by inference-token usage during the "thinking" phase: the models initially spend more tokens as problems get harder, but as complexity nears the point of accuracy collapse they actually reduce their reasoning effort. The researchers note that "despite these claims and performance advancements, the fundamental benefits and limitations of LRMs remain insufficiently understood. Critical questions still persist: Are these models capable of generalizable reasoning, or are they leveraging different forms of pattern matching?" There are further questions about how performance scales with increasing problem complexity, how these models compare with their non-thinking standard LLM counterparts when given the same inference-token compute, and what the inherent limitations of current reasoning approaches are, as well as what improvements might be needed to advance toward more robust reasoning.

Where do we go from here? The researchers make it clear that their test methodology has limitations of its own. "While our puzzle environments enable controlled experimentation with fine-grained control over problem complexity, they represent a narrow slice of reasoning tasks and may not capture the diversity of real-world or knowledge-intensive reasoning problems," they say. They add that the use of "deterministic puzzle simulators assumes that reasoning can be perfectly validated" at every step, a validation that may not be feasible to the same precision in less structured domains. That, they say, limits how far the analysis extends to other kinds of reasoning.

There is little argument that LRMs represent progress. Yet this study highlights that current reasoning models are not yet capable of robust, generalisable reasoning, particularly in the face of increasing complexity. These findings, arriving ahead of WWDC 2025 and from Apple's own researchers, may suggest that any AI reasoning announcements will be pragmatic. The focus areas could include specific use cases where current AI methodology is reliable (the research paper points to lower-to-medium complexity and less reliance on flawless long-sequence execution) and potentially integrating neural models with traditional computing approaches to handle the complexities where LRMs currently fail. The era of Large Reasoning Models is here, but the "Illusion of Thinking" study suggests that AI with true reasoning remains a mirage.


Hindustan Times | 17 hours ago
Why ChatGPT essays still fail to fool experts, despite being clear and well structured

The advent of AI has marked the rise of many tools, and ChatGPT is one of the most popular ones. Often used for research and writing, the tool has frequently been at the centre of discussion for its ability to fetch interesting content. However, a new study from the University of East Anglia (UEA) in the UK shows that essays written by real students are still better than those produced by ChatGPT. Researchers compared 145 essays written by university students with 145 essays generated by ChatGPT to see how well the AI can mimic human writing.

The study found that although ChatGPT's essays are clear, well structured and grammatically correct, they lack something important. The AI essays do not show personal insight or deep critical thinking, which are common in student writing. These missing elements make the AI-generated essays feel less engaging and less convincing.

However, the researchers do not see AI only as a threat. They believe tools like ChatGPT can be helpful in education if used properly. Instead of being a shortcut to finish assignments, AI should be a tool that supports learning and improves writing skills. After all, education is about teaching students how to think clearly and express ideas, things no AI can truly replace.

One key difference the researchers looked at was how the writers engage readers. Real student essays often include questions, personal comments and direct appeals to the reader. These techniques help make the writing feel more interactive and persuasive. ChatGPT's essays, on the other hand, tend to avoid questions and personal opinions. They follow academic rules but do not show a clear viewpoint or emotional connection. Professor Ken Hyland from UEA explained that the AI focuses on creating text that is logical and smooth but misses the conversational details that humans use to connect with readers. This shows that AI writing still struggles to capture the personal style and strong arguments that real people naturally use.
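Engagement markers of this kind, such as direct questions, personal asides and appeals to the reader, are surface features that can be counted mechanically. The sketch below is a hypothetical illustration of that idea using crude regular expressions; it is not the UEA team's actual marker inventory or methodology.

    import re

    # Crude, illustrative patterns for three kinds of engagement marker.
    MARKERS = {
        "questions": re.compile(r"\?"),
        "first_person": re.compile(r"\b(I|my|me|we|our)\b", re.IGNORECASE),
        "reader_address": re.compile(r"\b(you|your|let us|consider)\b", re.IGNORECASE),
    }

    def engagement_profile(essay: str) -> dict:
        """Count rough engagement markers per 1,000 words of an essay."""
        words = max(len(essay.split()), 1)
        return {
            name: round(len(pattern.findall(essay)) * 1000 / words, 1)
            for name, pattern in MARKERS.items()
        }

    if __name__ == "__main__":
        student = "Why does this matter? I think we should ask what you, the reader, expect."
        chatbot = "The topic is significant. Several factors contribute to the observed outcome."
        print("student-style:", engagement_profile(student))
        print("chatbot-style:", engagement_profile(chatbot))

Run on the two toy strings, the student-style text scores far higher on questions and reader address than the flat, chatbot-style text, which is the broad pattern the study reports.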
The advent of AI has marked the rise of many tools, and ChatGPT is one of the most popular ones. Often used for research and writing, this tool has often been the centre of discussion for its ability to fetch interesting content. However, A new study from the University of East Anglia (UEA) in the UK shows that essays written by real students are still better than those produced by ChatGPT, a popular AI writing tool. Researchers compared 145 essays written by university students with 145 essays generated by ChatGPT to see how well the AI can mimic human writing. The study found that although ChatGPT's essays are clear, well structured, and grammatically correct, they lack something important. The AI essays do not show personal insight or deep critical thinking, which are common in student writing. These missing elements make the AI-generated essays feel less engaging and less convincing. However, the researchers do not see AI only as a threat. They believe tools like ChatGPT can be helpful in education if used properly. Instead of shortcuts to finish assignments, AI should be a tool that supports learning and improves writing skills. After all, education is about teaching students how to think clearly and express ideas. These are things no AI can truly replace. One key difference the researchers looked at was how the writers engage readers. Real student essays often include questions, personal comments, and direct appeals to the reader. These techniques help make the writing feel more interactive and persuasive. On the other hand, ChatGPT's essays tend to avoid questions and personal opinions. They follow academic rules but do not show a clear viewpoint or emotional connection. Professor Ken Hyland from UEA explained that the AI focuses on creating text that is logical and smooth but misses conversational details that humans use to connect with readers. This shows that AI writing still struggles with capturing the personal style and strong arguments that real people naturally use.