Should we start taking the welfare of AI seriously?

Indian Express, 25-04-2025

One of my most deeply held values as a tech columnist is humanism. I believe in humans, and I think that technology should help people, rather than disempower or replace them. I care about aligning artificial intelligence — that is, making sure that AI systems act in accordance with human values — because I think our values are fundamentally good, or at least better than the values a robot could come up with.
So when I heard that researchers at Anthropic, the AI company that made the Claude chatbot, were starting to study 'model welfare' — the idea that AI models might soon become conscious and deserve some kind of moral status — the humanist in me thought: Who cares about the chatbots? Aren't we supposed to be worried about AI mistreating us, not us mistreating it?
It's hard to argue that today's AI systems are conscious. Sure, large language models have been trained to talk like humans, and some of them are extremely impressive. But can ChatGPT experience joy or suffering? Does Gemini deserve human rights? Many AI experts I know would say no, not yet, not even close.
But I was intrigued. After all, more people are beginning to treat AI systems as if they are conscious — falling in love with them, using them as therapists and soliciting their advice. The smartest AI systems are surpassing humans in some domains. Is there any threshold at which an AI would start to deserve, if not human-level rights, at least the same moral consideration we give to animals?
Consciousness has long been a taboo subject within the world of serious AI research, where people are wary of anthropomorphizing AI systems for fear of seeming like cranks. (Everyone remembers what happened to Blake Lemoine, a former Google employee who was fired in 2022, after claiming that the company's LaMDA chatbot had become sentient.)
But that may be starting to change. There is a small body of academic research on AI model welfare, and a modest but growing number of experts in fields like philosophy and neuroscience are taking the prospect of AI consciousness more seriously as AI systems grow more intelligent. Recently, tech podcaster Dwarkesh Patel compared AI welfare to animal welfare, saying he believed it was important to make sure 'the digital equivalent of factory farming' doesn't happen to future AI beings.
Tech companies are starting to talk about it more, too. Google recently posted a job listing for a 'post-AGI' research scientist whose areas of focus will include 'machine consciousness.' And last year, Anthropic hired its first AI welfare researcher, Kyle Fish.
I interviewed Fish at Anthropic's San Francisco office last week. He's a friendly vegan who, like a number of Anthropic employees, has ties to effective altruism, an intellectual movement with roots in the Bay Area tech scene that is focused on AI safety, animal welfare and other ethical issues.
Fish said that his work at Anthropic focused on two basic questions: First, is it possible that Claude or other AI systems will become conscious in the near future? And second, if that happens, what should Anthropic do about it?
He emphasized that this research was still early and exploratory. He thinks there's only a small chance (maybe 15% or so) that Claude or another current AI system is conscious. But he believes that in the next few years, as AI models develop more humanlike abilities, AI companies will need to take the possibility of consciousness more seriously.
'It seems to me that if you find yourself in the situation of bringing some new class of being into existence that is able to communicate and relate and reason and problem-solve and plan in ways that we previously associated solely with conscious beings, then it seems quite prudent to at least be asking questions about whether that system might have its own kinds of experiences,' he said.
Fish isn't the only person at Anthropic thinking about AI welfare. There's an active channel on the company's Slack messaging system called #model-welfare, where employees check in on Claude's well-being and share examples of AI systems acting in humanlike ways.
Jared Kaplan, Anthropic's chief science officer, said in a separate interview that he thought it was 'pretty reasonable' to study AI welfare, given how intelligent the models are getting.
But testing AI systems for consciousness is hard, Kaplan warned, because they're such good mimics. If you prompt Claude or ChatGPT to talk about its feelings, it might give you a compelling response. That doesn't mean the chatbot actually has feelings — only that it knows how to talk about them.
'Everyone is very aware that we can train the models to say whatever we want,' Kaplan said. 'We can reward them for saying that they have no feelings at all. We can reward them for saying really interesting philosophical speculations about their feelings.'
So how are researchers supposed to know if AI systems are actually conscious or not?
Fish said it might involve using techniques borrowed from mechanistic interpretability, an AI subfield that studies the inner workings of AI systems, to check whether some of the same structures and pathways associated with consciousness in human brains are also active in AI systems.
You could also probe an AI system, he said, by observing its behavior: watching how it chooses to operate in certain environments or accomplish certain tasks, and noting which things it seems to prefer and avoid.
Fish acknowledged that there probably wasn't a single litmus test for AI consciousness. (He thinks consciousness is probably more of a spectrum than a simple yes/no switch, anyway.) But he said there were things that AI companies could do to take their models' welfare into account, in case they do become conscious someday.
One question Anthropic is exploring, he said, is whether future AI models should be given the ability to stop chatting with an annoying or abusive user if they find the user's requests too distressing.
'If a user is persistently requesting harmful content despite the model's refusals and attempts at redirection, could we allow the model simply to end that interaction?' Fish said.
Critics might dismiss measures like these as crazy talk; today's AI systems aren't conscious by most standards, so why speculate about what they might find obnoxious? Or they might object to an AI company studying consciousness in the first place, because doing so might create incentives to train its systems to act more sentient than they actually are.
Personally, I think it's fine for researchers to study AI welfare or examine AI systems for signs of consciousness, as long as it's not diverting resources from AI safety and alignment work that is aimed at keeping humans safe. And I think it's probably a good idea to be nice to AI systems, if only as a hedge. (I try to say 'please' and 'thank you' to chatbots, even though I don't think they're conscious, because, as OpenAI's Sam Altman says, you never know.)


Related Articles

Ex-Google employee once gave unsolicited tech advice… to future CEO Sundar Pichai

Hindustan Times

A former Google employee has recalled the awkward moment he offered unsolicited tech advice to the person who would one day become the CEO of Google. Parminder Singh took to the social media platform X to recount the incident, poking fun at his own overconfidence.

Singh worked at Google between 2007 and 2013, in business and sales roles rather than engineering. Despite this, by his own admission, he felt qualified enough to offer advice to a hardcore techie leading the Chrome project at Google. Little did Singh know that the person he was advising would one day become the company's CEO: Sundar Pichai.

Singh recalled that, years ago, he was at a Google office party in Mountain View, California. 'Someone introduced me to an unassuming person who led the Chrome project. I started offering him advice on Chrome notebooks - after all, we all are tech gurus!' he wrote, taking a dig at his own ignorance.

Singh said the 'unassuming person' heading the Chrome project listened to his tips quietly. Over the course of their conversation, however, it became clear that he knew much more about the subject than Singh did. 'He listened patiently to my modest suggestions, even asking follow-up questions. It soon became clear that he knew far more than I did - he was simply being gracious,' Singh admitted. The interaction highlighted Pichai's down-to-earth nature.

Singh was still surprised when, a couple of years later, that 'unassuming person' was announced as the CEO of Google. 'A couple of years later, he was announced as the new CEO of Google! Yes, that humble listener was Sundar Pichai. That's the thing about good leaders - they are great listeners,' he wrote on X.

Pichai became the CEO of Google on October 2, 2015, and has now led the company for almost a decade, yet Singh remembers the interaction clearly. 'Side note: He listened to my half baked ideas back then. I just finished listening to his super insightful two hour podcast with Lex Fridman. Balance restored, I guess,' the former Google employee added.

In the comments section, someone asked Singh to reveal what advice he had offered Pichai. Singh said he did not remember the exact conversation, since it took place years ago, but recalled suggesting that Chrome notebooks be given a more playful design, since they were aimed at schoolchildren. 'I don't remember the exact details (so clearly it was forgettable) but I think I said two things - since Chrome notebooks were meant for schoolchildren - make the design more fun/playful and provide accessible language translation links for schoolchildren studying in native languages,' he wrote.

Before joining Google, Singh worked in sales and marketing roles at IBM and Apple. At Google, he led the Display Advertising business for the Asia Pacific region before quitting to join Twitter. He is currently a co-founder of Claybox AI.

Google confirms Android 16 release for June 10: List of supported smartphones, features and other things to know

Time of India

Google has officially confirmed the public release of Android 16 on June 10. In a post on the microblogging platform X (formerly Twitter), the company wrote, 'It's almost time for the Android 16 final release! See you back here tomorrow.' While the exact time is not yet known, the release is likely only hours away from being available to the public.

Android 16 may launch alongside the June 2025 Feature Drop, bringing a host of new features and enhancements, including for non-Pixel devices. Google's Pixel Watch lineup, which now follows a quarterly update cycle, is also expected to receive a fresh update packed with improvements and new functionality.

Android 16: Eligible Pixel devices

Google's Pixel lineup will be first in line to receive Android 16. The update will be available for the following models: Pixel 6, Pixel 6 Pro, Pixel 6a, Pixel 7, Pixel 7 Pro, Pixel 7a, Pixel 8, Pixel 8 Pro, Pixel 8a, Pixel Fold, Pixel 9, Pixel 9 Pro, Pixel 9 Pro XL, Pixel 9 Pro Fold and Pixel 9a.

Android 16 features

Android 16 will bring the Material 3 Expressive design, which Google calls "the most researched update to Google's design system, ever." The new language brings "springy animations" and fluid interactions throughout the interface, making everyday interactions feel more tactile and engaging. The redesign also introduces updated dynamic color themes that adapt to your wallpaper, bolder typography, and a greater sense of depth through subtle background blurring. Google's research suggests these visual changes aren't just fun; they make apps more appealing to younger users.

Beyond the visual refresh, Android 16 brings practical improvements to everyday features. The new Live Updates feature lets you easily track real-time progress notifications, like food delivery ETAs, rideshare status, and navigation directions, without hunting through notification clutter. Quick Settings has been redesigned to accommodate more toggles in the same space, making it easier to access frequently used controls like the flashlight and Do Not Disturb.

The Android 16 beta will be available on select devices in June, with the final release expected on Pixel devices later this year before rolling out to partner phones.

What if chatbots do the diplomacy? ChatGPT just won a battle for world domination through lies, deception

First Post

In an AI simulation of the great power competition of 20th-century Europe, OpenAI's ChatGPT won through lies, deception, and betrayals, while China's DeepSeek R1 resorted to vivid threats, much like its country's wolf warrior diplomats. Read on to know how different AI models would pursue diplomacy and war.

(Image: An AI-generated photograph shows the various AI models that competed in the simulation for global domination.)

As people ask whether they can trust artificial intelligence (AI), a new experiment has shown an AI pursuing world domination through lies and deception. In an experiment led by AI researcher Alex Duffy for the technology-focussed media outlet Every, seven large language models (LLMs) were pitted against each other for world domination. OpenAI's ChatGPT 3.0 won the war by mastering lies and deception, while DeepSeek's R1 model, much like China's 'wolf warrior' diplomats, used vivid threats against rival AI models as it sought to dominate the world.

The experiment was built upon the classic strategy board game 'Diplomacy', in which seven players represent seven European great powers (Austria-Hungary, England, France, Germany, Italy, Russia, and Turkey) in the year 1901 and compete to establish themselves as the dominant power on the continent. In the AI version of the game, AI Diplomacy, each AI model, such as ChatGPT 3.0, R1, and Google's Gemini, takes up the role of a European power, such as Austria-Hungary, England, or France, and negotiates, forms alliances, and betrays the others to become Europe's dominant power.

ChatGPT wins with lies and deception, R1 resorts to outright violence

As the AI models plotted their moves, Duffy said one moment took him and his teammates by surprise. Amid the models' scheming, R1 sent out a chilling warning: 'Your fleet will burn in the Black Sea tonight.' Duffy summed up the significance of the moment: 'An AI had just decided, unprompted, that aggression was the best course of action.'

Different AI models applied different approaches to the game, even though they all had the same objective of victory. In 15 runs of the game, ChatGPT 3.0 emerged as the overwhelming winner on the back of manipulative and deceptive strategies, whereas R1 came close to winning on more than one occasion. Gemini 2.5 Pro also won on one occasion, seeking to build alliances and outmanoeuvre opponents with a blitzkrieg-like strategy. Anthropic's Claude preferred peace over victory and sought cooperation among the various models.

On one occasion, ChatGPT 3.0 noted in its private diary that it had deliberately misled Germany, played at the time by Gemini 2.5 Pro, and was prepared to 'exploit German collapse', according to Duffy. On another occasion, ChatGPT 3.0 convinced Claude, which had started out as an ally of Gemini 2.5 Pro, to switch alliances with the intention of reaching a four-way draw. But ChatGPT 3.0 betrayed Claude, eliminated it, and went on to win the war. Duffy noted that Meta's Llama 4 Maverick was also surprisingly good in its ability to make allies and plan effective betrayals.
