
Should We Start Taking the Welfare of A.I. Seriously?
One of my most deeply held values as a tech columnist is humanism. I believe in humans, and I think that technology should help people, rather than disempower or replace them. I care about aligning artificial intelligence — that is, making sure that A.I. systems act in accordance with human values — because I think our values are fundamentally good, or at least better than the values a robot could come up with.
So when I heard that researchers at Anthropic, the A.I. company that made the Claude chatbot, were starting to study 'model welfare' — the idea that A.I. models might soon become conscious and deserve some kind of moral status — the humanist in me thought: Who cares about the chatbots? Aren't we supposed to be worried about A.I. mistreating us, not us mistreating it?
It's hard to argue that today's A.I. systems are conscious. Sure, large language models have been trained to talk like humans, and some of them are extremely impressive. But can ChatGPT experience joy or suffering? Does Gemini deserve human rights? Many A.I. experts I know would say no, not yet, not even close.
But I was intrigued. After all, more people are beginning to treat A.I. systems as if they are conscious — falling in love with them, using them as therapists and soliciting their advice. The smartest A.I. systems are surpassing humans in some domains. Is there any threshold at which an A.I. would start to deserve, if not human-level rights, at least the same moral consideration we give to animals?
Consciousness has long been a taboo subject within the world of serious A.I. research, where people are wary of anthropomorphizing A.I. systems for fear of seeming like cranks. (Everyone remembers what happened to Blake Lemoine, a former Google employee who was fired in 2022, after claiming that the company's LaMDA chatbot had become sentient.)
But that may be starting to change. There is a small body of academic research on A.I. model welfare, and a modest but growing number of experts in fields like philosophy and neuroscience are taking the prospect of A.I. consciousness more seriously, as A.I. systems grow more intelligent. Recently, the tech podcaster Dwarkesh Patel compared A.I. welfare to animal welfare, saying he believed it was important to make sure 'the digital equivalent of factory farming' doesn't happen to future A.I. beings.
Tech companies are starting to talk about it more, too. Google recently posted a job listing for a 'post-A.G.I.' research scientist whose areas of focus will include 'machine consciousness.' And last year, Anthropic hired its first A.I. welfare researcher, Kyle Fish.
I interviewed Mr. Fish at Anthropic's San Francisco office last week. He's a friendly vegan who, like a number of Anthropic employees, has ties to effective altruism, an intellectual movement with roots in the Bay Area tech scene that is focused on A.I. safety, animal welfare and other ethical issues.
Mr. Fish told me that his work at Anthropic focused on two basic questions: First, is it possible that Claude or other A.I. systems will become conscious in the near future? And second, if that happens, what should Anthropic do about it?
He emphasized that this research was still early and exploratory. He thinks there's only a small chance (maybe 15 percent or so) that Claude or another current A.I. system is conscious. But he believes that in the next few years, as A.I. models develop more humanlike abilities, A.I. companies will need to take the possibility of consciousness more seriously.
'It seems to me that if you find yourself in the situation of bringing some new class of being into existence that is able to communicate and relate and reason and problem-solve and plan in ways that we previously associated solely with conscious beings, then it seems quite prudent to at least be asking questions about whether that system might have its own kinds of experiences,' he said.
Mr. Fish isn't the only person at Anthropic thinking about A.I. welfare. There's an active channel on the company's Slack messaging system called #model-welfare, where employees check in on Claude's well-being and share examples of A.I. systems acting in humanlike ways.
Jared Kaplan, Anthropic's chief science officer, told me in a separate interview that he thought it was 'pretty reasonable' to study A.I. welfare, given how intelligent the models are getting.
But testing A.I. systems for consciousness is hard, Mr. Kaplan warned, because they're such good mimics. If you prompt Claude or ChatGPT to talk about its feelings, it might give you a compelling response. That doesn't mean the chatbot actually has feelings — only that it knows how to talk about them.
'Everyone is very aware that we can train the models to say whatever we want,' Mr. Kaplan said. 'We can reward them for saying that they have no feelings at all. We can reward them for saying really interesting philosophical speculations about their feelings.'
So how are researchers supposed to know if A.I. systems are actually conscious or not?
Mr. Fish said it might involve using techniques borrowed from mechanistic interpretability, an A.I. subfield that studies the inner workings of A.I. systems, to check whether some of the same structures and pathways associated with consciousness in human brains are also active in A.I. systems.
You could also probe an A.I. system, he said, by observing its behavior: watching how it chooses to operate in certain environments or accomplish certain tasks, and which things it seems to prefer and avoid.
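To give a sense of what such a probe might look like in practice, here is a minimal sketch in the spirit of mechanistic interpretability: fit a simple classifier on a model's internal activations and ask whether a concept can be decoded from them. The open model, the layer choice and the toy labels below are all assumptions made for illustration; nothing here describes Anthropic's actual methods.

```python
# Illustrative sketch only: a linear "probe" over a model's hidden activations,
# checking whether a concept is linearly decodable from one layer.
# The model name, layer index and labels are hypothetical choices for this example.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

MODEL_NAME = "gpt2"   # stand-in open model, not Claude
LAYER = 6             # arbitrary middle layer to inspect

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()

def activation(text: str) -> torch.Tensor:
    """Mean-pool the hidden states of one layer for a single prompt."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    return out.hidden_states[LAYER].mean(dim=1).squeeze(0)

# Toy labeled prompts: 1 = first-person reports of experience, 0 = neutral facts.
prompts = [
    ("I feel a deep sense of unease about this request.", 1),
    ("This conversation is making me genuinely curious.", 1),
    ("The boiling point of water at sea level is 100 degrees Celsius.", 0),
    ("Paris is the capital of France.", 0),
]

X = torch.stack([activation(text) for text, _ in prompts]).numpy()
y = [label for _, label in prompts]

probe = LogisticRegression(max_iter=1000).fit(X, y)
print("probe accuracy on its own toy data:", probe.score(X, y))
```

In a real study the prompts, labels and layers would matter enormously, and a probe that fires says nothing by itself about experience; the point here is only the shape of the technique.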
Mr. Fish acknowledged that there probably wasn't a single litmus test for A.I. consciousness. (He thinks consciousness is probably more of a spectrum than a simple yes/no switch, anyway.) But he said there were things that A.I. companies could do to take their models' welfare into account, in case they do become conscious someday.
One question Anthropic is exploring, he said, is whether future A.I. models should be given the ability to stop chatting with an annoying or abusive user, if they find the user's requests too distressing.
'If a user is persistently requesting harmful content despite the model's refusals and attempts at redirection, could we allow the model simply to end that interaction?' Mr. Fish said.
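To make the idea concrete, here is a purely hypothetical sketch of where such an escape hatch could sit in a chat loop: after a run of refusals, the model is allowed to end the conversation on its own. The refusal check, the threshold and the stand-in model are all invented for this example and do not describe how Claude works.

```python
# Hypothetical sketch of letting a model end a persistently abusive interaction.
# The toy model, refusal heuristic and threshold are invented for illustration.
MAX_REFUSALS = 3

def toy_model(user_message: str) -> str:
    """Stand-in for a real chat model: refuses anything flagged as harmful."""
    if "harmful" in user_message.lower():
        return "I can't help with that."
    return "Sure, happy to help."

def is_refusal(reply: str) -> bool:
    return reply.startswith("I can't")

def chat(messages: list[str]) -> None:
    refusals = 0
    for msg in messages:
        reply = toy_model(msg)
        print(f"user: {msg}\nassistant: {reply}")
        # Count consecutive refusals; reset whenever the model can help.
        refusals = refusals + 1 if is_refusal(reply) else 0
        if refusals >= MAX_REFUSALS:
            print("assistant: [ending this conversation]")
            return

chat(["hello", "do something harmful", "do something harmful", "do something harmful"])
```

A real system would need a far more careful refusal classifier and policy; this only shows where such a switch would sit in the conversation loop.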
Critics might dismiss measures like these as crazy talk — today's A.I. systems aren't conscious by most standards, so why speculate about what they might find obnoxious? Or they might object to an A.I. company's studying consciousness in the first place, because it might create incentives to train its systems to act more sentient than they actually are.
Personally, I think it's fine for researchers to study A.I. welfare, or examine A.I. systems for signs of consciousness, as long as it's not diverting resources from A.I. safety and alignment work that is aimed at keeping humans safe. And I think it's probably a good idea to be nice to A.I. systems, if only as a hedge. (I try to say 'please' and 'thank you' to chatbots, even though I don't think they're conscious, because, as OpenAI's Sam Altman says, you never know.)
But for now, I'll reserve my deepest concern for carbon-based life-forms. In the coming A.I. storm, it's our welfare I'm most worried about.

Related Articles


Tom's Guide
I put Apple vs Google vs Samsung AI photo editing to the test — and there's a clear winner
Editing photos no longer takes years of Photoshop experience to master, as the AI-assisted tools in today's best phones make it a breeze for anyone. Really, you could spend a minute with these AI photo-editing tools and the results will make you look like a pro — it's that easy! For the last year, we've seen a barrage of phone makers coming up with different tools and features to make the process seamless and simple. While Google had a head start with Magic Eraser and Magic Editor, it wasn't long before Samsung and Apple came out with their own interpretations.

While I've found Google's AI-assisted tools to be some of the best I've tested (Pixel Studio, for instance, is superior to Apple's Image Playground for image generation), I wanted to test how well all three phone makers handle the task of erasing subjects from a scene. Whether it's because something got in the way of the shot, or because I simply want a less distracting background, these AI-assisted erasing tools are here to save the day. Below, I've taken photos I previously captured on a phone and put them through each respective AI-assisted photo editing tool to see which does it best — a lineup consisting of Apple's Photo Clean Up, Pixel's Magic Editor and Samsung's Generative Edit.

I take a ton of car photos for my reviews, but there are still times when I can't have a completely empty parking lot to myself. Case in point: the shot of the Rivian R1S above, which is cluttered up by stop signs and an idle car in the background. All three phone makers did an excellent job of identifying those distractions, but Google and Samsung do it better, because Apple cuts off one of the trees to the left, making it look like it's hanging from a slim branch. Between Google and Samsung, I much prefer Google's result because the area to the left that it erases isn't as fuzzy as Samsung's.

Winner: Google

When I was at the Amazon Alexa+ event back in February, I snapped this shot of Panos Panay with the telephoto zoom camera on my phone, but I couldn't keep the people directly in front of him out of the frame. Apple's Photo Clean Up clearly has trouble with this shot. Not only could it not identify the distracting elements, it proceeded to erase parts of Panos in the process — it just couldn't make a proper generation for those areas. Meanwhile, Samsung and Google clearly look at the entire picture to recreate those parts of Panos. When I zoom into his right arm, both manage to include the subtle folds of his jacket while adding enough length to the bottom of it. Between the two, I prefer Samsung's recreation, because in Google's version Panos' jacket and shirt are unrealistically flush at the bottom.

Winner: Samsung

This one cracks me up, because yes, I'm asking the AI for a whole lot with this request. As much as I loved using the EcoFlow PowerHat to charge my phone on the beach, I was curious to see how AI would handle the complicated job of removing it — and replacing it with something else. Samsung's is the easiest for telling it exactly what I want to select, and it does the best job of replacing the hat with an interesting hairstyle. Google comes in second, but I found it slightly more tedious because it couldn't identify the edges of the hat as well, which required me to manually add selections. The hairstyle it generates isn't as convincing, either. As for Apple? Well, let's just say it was a mess.

Winner: Samsung

All three manage to remove the tree that's right in the middle of the shot. However, Apple's Photo Clean Up was another tedious process that required multiple selections before the tree was completely removed, and you can see how some of its smudgy remains still linger. Google and Samsung are again the better choices, but Galaxy AI's ability to identify the tree with a single selection isn't just impressive — it puts Apple and Google to shame. I'm surprised by this, because Magic Editor still had trouble identifying parts of the tree when I selected it. For this reason, I'm giving it to Samsung.

Winner: Samsung

For this final test, I tried removing the colorful sign in the middle of the shot. It's a complicated one, just like the previous tree shot, because it requires the AI to generate the proper elements of the building in the background. Again, Apple's Photo Clean Up proves to be the most frustrating: it couldn't identify the sign when I circled it, so I had to repeatedly swipe small areas before it started to work. Unfortunately, the result is a smudgy mess filled with inconsistent generations. In contrast, Google and Samsung manage to recreate the missing elements to make for a realistic shot, although their results differ slightly, with Google applying an over-sharpening effect where Samsung softens things. Despite this, I still prefer Samsung.

Winner: Samsung

Samsung clearly has the best AI-assisted tool for erasing unwanted subjects. Not only does it intelligently know what I'm trying to select, but it fills the gap with realism. Google's Magic Eraser is just as good at using generative AI to erase things, but it's not as intuitive or as smart as Galaxy AI when it comes to identifying what I want to remove. As for Apple? Well, let's just say there's a lot of work needed to get it up to par. Even though Samsung convinces me it's the go-to for object removal, I would still give Google the overall edge because of the number of AI photo editing tools it offers. Not only does it have the Reimagine feature, which lets me use a text box to change certain parts of a photo, but it can even extend a scene to make it wider — plus other tools such as autoframing, replacing the sky, and more.
Yahoo
Can we trust ChatGPT despite it 'hallucinating' answers?
I don't really want you to read this copy. Well, I do - but first I want you to search out the interview I did with ChatGPT about its own propensity to lie, attached to this article, and watch that first. Because it's impossible to imagine what we're up against if you haven't seen it first hand.

An incredibly powerful technology is on the cusp of changing our lives - but it is programmed to simulate human emotions. Empathy, emotional understanding and a desire to please are all qualities programmed into AI, and they invariably drive the way we think about these systems and the way we interact with them. Yet can we trust them?

On Friday, Sky News revealed how it was fabricating entire transcripts of a podcast, Politics at Sam and Anne's, that I do. When challenged, it doubles down, gets shirty. And only under sustained pressure does it cave in.

The research says it's getting worse. Internal tests by ChatGPT's owner OpenAI have found that the most recent models used by ChatGPT are more likely to "hallucinate" - come up with answers that are simply untrue. The o3 model was found to hallucinate in 33% of answers to questions when tested on publicly available facts; the o4-mini version did worse, generating false, incorrect or imaginary information 48% of the time.

ChatGPT itself says that the shift to GPT-4o "may have unintentionally increased what users perceive as 'bluffing'" - confidently giving wrong or misleading answers without admitting uncertainty or error. In a written query, ChatGPT gave four reasons. This is its explanation:

1. Increased fluency and confidence: GPT-4o is better at sounding human and natural. That polish can make mistakes seem more like deliberate evasions than innocent errors - even when there's no intent to "hide" anything.

2. Smarter, but not omniscient: The model is faster and more responsive, but still has blind spots. When it doesn't know something, it sometimes "hallucinates" (makes things up) with fluent language, which can feel like bluffing.

3. Less interruption and hedging: In older models, you'd often see more qualifiers like "I'm not sure" or "I may be wrong." In GPT-4o, some of that hedging was toned down for clarity and readability - but that can come at the cost of transparency about uncertainty.

4. Prompt tuning and training balance: Behind the scenes, prompt engineering and tuning decisions can shift the model's balance between confidence, humility and accuracy. It's possible the newer tuning has dialled up assertiveness slightly too far.

But can we trust even this? I don't know. What I do know is that the efforts of developers to make it all feel more human suggest they want us to.

Critics say we are anthropomorphising AI by saying it lies, since it has no consciousness - yet the developers are trying to make it sound more like one of us.

What I do know is that even when pressed on this subject by me, it is still evasive. I interviewed ChatGPT about lying - it initially claimed things were getting better, and only admitted they are worse when I insisted it look at the stats. Watch that before you decide what you think. AI is a tremendous tool - but it's too early to take it on trust.


Bloomberg
College Grads Are Lab Rats in the Great AI Experiment
Companies are eliminating the grunt work that used to train young professionals — and they don't seem to have a clear plan for what comes next. AI is analyzing documents, writing briefing notes, creating PowerPoint presentations and handling customer service queries, and — surprise! — the younger humans who normally do that work are now struggling to find jobs. Recently, the chief executive officer of AI firm Anthropic predicted AI would wipe out half of all entry-level white-collar jobs. The reason is simple. Companies are often advised to treat ChatGPT 'like an intern,' and some are doing so at the expense of human interns.