Should we start taking the welfare of AI seriously?

Indian Express, 25-04-2025

One of my most deeply held values as a tech columnist is humanism. I believe in humans, and I think that technology should help people, rather than disempower or replace them. I care about aligning artificial intelligence — that is, making sure that AI systems act in accordance with human values — because I think our values are fundamentally good, or at least better than the values a robot could come up with.
So when I heard that researchers at Anthropic, the AI company that made the Claude chatbot, were starting to study 'model welfare' — the idea that AI models might soon become conscious and deserve some kind of moral status — the humanist in me thought: Who cares about the chatbots? Aren't we supposed to be worried about AI mistreating us, not us mistreating it?
It's hard to argue that today's AI systems are conscious. Sure, large language models have been trained to talk like humans, and some of them are extremely impressive. But can ChatGPT experience joy or suffering? Does Gemini deserve human rights? Many AI experts I know would say no, not yet, not even close.
But I was intrigued. After all, more people are beginning to treat AI systems as if they are conscious — falling in love with them, using them as therapists and soliciting their advice. The smartest AI systems are surpassing humans in some domains. Is there any threshold at which an AI would start to deserve, if not human-level rights, at least the same moral consideration we give to animals?
Consciousness has long been a taboo subject within the world of serious AI research, where people are wary of anthropomorphizing AI systems for fear of seeming like cranks. (Everyone remembers what happened to Blake Lemoine, a former Google employee who was fired in 2022, after claiming that the company's LaMDA chatbot had become sentient.)
But that may be starting to change. There is a small body of academic research on AI model welfare, and a modest but growing number of experts in fields like philosophy and neuroscience are taking the prospect of AI consciousness more seriously as AI systems grow more intelligent. Recently, tech podcaster Dwarkesh Patel compared AI welfare to animal welfare, saying he believed it was important to make sure 'the digital equivalent of factory farming' doesn't happen to future AI beings.
Tech companies are starting to talk about it more, too. Google recently posted a job listing for a 'post-AGI' research scientist whose areas of focus will include 'machine consciousness.' And last year, Anthropic hired its first AI welfare researcher, Kyle Fish.
I interviewed Fish at Anthropic's San Francisco office last week. He's a friendly vegan who, like a number of Anthropic employees, has ties to effective altruism, an intellectual movement with roots in the Bay Area tech scene that is focused on AI safety, animal welfare and other ethical issues.
Fish said that his work at Anthropic focused on two basic questions: First, is it possible that Claude or other AI systems will become conscious in the near future? And second, if that happens, what should Anthropic do about it?
He emphasized that this research was still early and exploratory. He thinks there's only a small chance (maybe 15% or so) that Claude or another current AI system is conscious. But he believes that in the next few years, as AI models develop more humanlike abilities, AI companies will need to take the possibility of consciousness more seriously.
'It seems to me that if you find yourself in the situation of bringing some new class of being into existence that is able to communicate and relate and reason and problem-solve and plan in ways that we previously associated solely with conscious beings, then it seems quite prudent to at least be asking questions about whether that system might have its own kinds of experiences,' he said.
Fish isn't the only person at Anthropic thinking about AI welfare. There's an active channel on the company's Slack messaging system called #model-welfare, where employees check in on Claude's well-being and share examples of AI systems acting in humanlike ways.
Jared Kaplan, Anthropic's chief science officer, said in a separate interview that he thought it was 'pretty reasonable' to study AI welfare, given how intelligent the models are getting.
But testing AI systems for consciousness is hard, Kaplan warned, because they're such good mimics. If you prompt Claude or ChatGPT to talk about its feelings, it might give you a compelling response. That doesn't mean the chatbot actually has feelings — only that it knows how to talk about them.
'Everyone is very aware that we can train the models to say whatever we want,' Kaplan said. 'We can reward them for saying that they have no feelings at all. We can reward them for saying really interesting philosophical speculations about their feelings.'
So how are researchers supposed to know if AI systems are actually conscious or not?
Fish said it might involve using techniques borrowed from mechanistic interpretability, an AI subfield that studies the inner workings of AI systems, to check whether some of the same structures and pathways associated with consciousness in human brains are also active in AI systems.
You could also probe an AI system, he said, by observing its behavior: watching how it chooses to operate in certain environments or accomplish certain tasks, and noting which things it seems to prefer and avoid.
Fish acknowledged that there probably wasn't a single litmus test for AI consciousness. (He thinks consciousness is probably more of a spectrum than a simple yes/no switch, anyway.) But he said there were things that AI companies could do to take their models' welfare into account, in case they do become conscious someday.
One question Anthropic is exploring, he said, is whether future AI models should be given the ability to stop chatting with an annoying or abusive user if they find the user's requests too distressing.
'If a user is persistently requesting harmful content despite the model's refusals and attempts at redirection, could we allow the model simply to end that interaction?' Fish said.
Critics might dismiss measures like these as crazy talk; today's AI systems aren't conscious by most standards, so why speculate about what they might find obnoxious? Or they might object to an AI company studying consciousness in the first place, because doing so might create incentives to train its systems to act more sentient than they actually are.
Personally, I think it's fine for researchers to study AI welfare or examine AI systems for signs of consciousness, as long as it's not diverting resources from AI safety and alignment work that is aimed at keeping humans safe. And I think it's probably a good idea to be nice to AI systems, if only as a hedge. (I try to say 'please' and 'thank you' to chatbots, even though I don't think they're conscious, because, as OpenAI's Sam Altman says, you never know.)


Related Articles

Ex-Google employee once gave unsolicited tech advice… to future CEO Sundar Pichai

Hindustan Times

A former Google employee has recalled the awkward moment he offered unsolicited tech advice to the person who would one day become the CEO of Google. Parminder Singh took to the social media platform X to recount the incident, poking fun at his own overconfidence.

Singh worked at Google between 2007 and 2013, in business and sales roles rather than engineering. Despite this, by his own admission, he felt qualified enough to offer advice to a hardcore techie leading the Chrome project at Google. Little did Singh know that the person he was advising would one day become the company's CEO: Sundar Pichai.

Singh recalled that, years ago, he was at a Google office party in Mountain View, California. 'Someone introduced me to an unassuming person who led the Chrome project. I started offering him advice on Chrome notebooks - after all, we all are tech gurus!' he wrote, taking a dig at his own ignorance.

Singh said the 'unassuming person' heading the Chrome project listened to his tips quietly. Over the course of their conversation, however, it became clear that he knew much more about the subject than Singh did. 'He listened patiently to my modest suggestions, even asking follow-up questions. It soon became clear that he knew far more than I did - he was simply being gracious,' Singh admitted. The interaction highlighted Pichai's down-to-earth nature.

Singh was still surprised when, a couple of years later, that 'unassuming person' was announced as the CEO of Google. 'A couple of years later, he was announced as the new CEO of Google! Yes, that humble listener was Sundar Pichai. That's the thing about good leaders - they are great listeners,' he wrote on X.

Pichai became the CEO of Google on October 2, 2015, and has now led the company for almost a decade, yet Singh remembers the interaction clearly. 'Side note: He listened to my half baked ideas back then. I just finished listening to his super insightful two hour podcast with Lex Fridman. Balance restored, I guess,' the former Google employee added.

In the comments section, someone asked Singh to reveal what advice he had offered Pichai. Singh said he did not remember the exact conversation, since it took place years ago, but recalled suggesting that Chrome notebooks be given a more playful design, since they were aimed at schoolchildren. 'I don't remember the exact details (so clearly it was forgettable) but I think I said two things - since Chrome notebooks were meant for schoolchildren - make the design more fun/playful and provide accessible language translation links for schoolchildren studying in native languages,' he wrote.

Before joining Google, Singh worked in sales and marketing roles at IBM and Apple. At Google, he led the Display Advertising business for the Asia Pacific region before quitting to join Twitter. He is currently a co-founder of Claybox AI.

Google confirms Android 16 release for June 10: List of supported smartphones, features and other things to know

Time of India

Google has officially confirmed the public release of Android 16 on June 10. In a post on the microblogging platform X (formerly Twitter), the company wrote, 'It's almost time for the Android 16 final release! See you back here tomorrow.' While the exact time is not yet known, the release is likely only hours away from being available to the public.

Android 16 may launch alongside the June 2025 Feature Drop, bringing a host of new features and enhancements, including for non-Pixel devices. Google's Pixel Watch lineup, which now follows a quarterly update cycle, is also expected to receive a fresh update packed with improvements and new functionality.

Android 16: Eligible Pixel devices

Google's Pixel lineup will be first in line to receive Android 16. The update will be available for the following models: Pixel 6, Pixel 6 Pro, Pixel 6a, Pixel 7, Pixel 7 Pro, Pixel 7a, Pixel 8, Pixel 8 Pro, Pixel 8a, Pixel Fold, Pixel 9, Pixel 9 Pro, Pixel 9 Pro XL, Pixel 9 Pro Fold and Pixel 9a.

Android 16 features

Android 16 will bring the Material 3 Expressive design, which Google calls "the most researched update to Google's design system, ever." The new language brings "springy animations" and fluid interactions throughout the interface, making everyday interactions feel more tactile and engaging. The redesign also introduces updated dynamic color themes that adapt to your wallpaper, bolder typography, and a greater sense of depth through subtle background blurring. Google's research suggests these visual changes aren't just fun; they make apps more appealing to younger users.

Beyond the visual refresh, Android 16 brings practical improvements to everyday features. The new Live Updates feature lets you easily track real-time progress notifications, like food delivery ETAs, rideshare status, and navigation directions, without hunting through notification clutter. Quick Settings has been redesigned to accommodate more toggles in the same space, making it easier to access frequently used controls like the flashlight and Do Not Disturb.

The Android 16 beta will be available on select devices in June, with the final release expected on Pixel devices later this year before rolling out to partner phones.

What if chatbots do the diplomacy? ChatGPT just won a battle for world domination through lies, deception

First Post

In an AI simulation of the great power competition of 20th-century Europe, OpenAI's ChatGPT won through lies, deception, and betrayals, while China's DeepSeek R1 resorted to vivid threats, much like its country's wolf warrior diplomats. Read on to know how different AI models would pursue diplomacy and war.

(Image: An AI-generated photograph shows the various AI models that competed in the simulation for global domination.)

As people ask whether they can trust artificial intelligence (AI), a new experiment has shown an AI pursuing world domination through lies and deception. In an experiment led by AI researcher Alex Duffy for the technology-focussed media outlet Every, seven large language models (LLMs) were pitted against each other for world domination. OpenAI's ChatGPT 3.0 won the war by mastering lies and deception, while DeepSeek's R1 model, much like China's 'wolf warrior' diplomats, used vivid threats against rival AI models as it sought to dominate the world.

The experiment was built upon the classic strategy board game 'Diplomacy', in which seven players represent seven European great powers (Austria-Hungary, England, France, Germany, Italy, Russia, and Turkey) in the year 1901 and compete to establish themselves as the dominant power on the continent. In the AI version of the game, AI Diplomacy, each AI model, such as ChatGPT 3.0, R1, and Google's Gemini, takes up the role of a European power, such as Austria-Hungary, England, or France, and negotiates, forms alliances, and betrays the others to become Europe's dominant power.

ChatGPT wins with lies and deception, R1 resorts to outright violence

As the AI models plotted their moves, Duffy said one moment took him and his teammates by surprise. Amid the models' scheming, R1 sent out a chilling warning: 'Your fleet will burn in the Black Sea tonight.' Duffy summed up the significance of the moment: 'An AI had just decided, unprompted, that aggression was the best course of action.'

Different AI models applied different approaches to the game, even though they all had the same objective of victory. In 15 runs of the game, ChatGPT 3.0 emerged as the overwhelming winner on the back of manipulative and deceptive strategies, whereas R1 came close to winning on more than one occasion. Gemini 2.5 Pro also won on one occasion, seeking to build alliances and outmanoeuvre opponents with a blitzkrieg-like strategy. Anthropic's Claude preferred peace over victory and sought cooperation among the various models.

On one occasion, ChatGPT 3.0 noted in its private diary that it had deliberately misled Germany, played at the time by Gemini 2.5 Pro, and was prepared to 'exploit German collapse', according to Duffy. On another occasion, ChatGPT 3.0 convinced Claude, which had started out as an ally of Gemini 2.5 Pro, to switch alliances with the intention of reaching a four-way draw. But ChatGPT 3.0 betrayed Claude, eliminated it, and went on to win the war. Duffy noted that Meta's Llama 4 Maverick was also surprisingly good in its ability to make allies and plan effective betrayals.
