
Latest news with #BlakeLemoine

It's becoming less taboo to talk about AI being 'conscious' if you work in tech

Business Insider

26-04-2025

Three years ago, suggesting AI was "sentient" was one way to get fired in the tech world. Now, tech companies are more open to having that conversation. This week, AI startup Anthropic launched a new research initiative to explore whether models might one day experience "consciousness," while a scientist at Google DeepMind described today's models as "exotic mind-like entities."

It's a sign of how much AI has advanced since 2022, when Blake Lemoine was fired from his job as a Google engineer after claiming the company's chatbot, LaMDA, had become sentient. Lemoine said the system feared being shut off and described itself as a person. Google called his claims "wholly unfounded," and the AI community moved quickly to shut the conversation down.

Neither Anthropic nor the Google scientist is going so far as Lemoine. Anthropic, the startup behind Claude, said in a Thursday blog post that it plans to investigate whether models might one day have experiences, preferences, or even distress. "Should we also be concerned about the potential consciousness and experiences of the models themselves? Should we be concerned about model welfare, too?" the company asked.

Kyle Fish, an alignment scientist at Anthropic who researches AI welfare, said in a video released Thursday that the lab isn't claiming Claude is conscious, but that it's no longer responsible to assume the answer is definitely no. He said that as AI systems become more sophisticated, companies should "take seriously the possibility" that they "may end up with some form of consciousness along the way." He added: "There are staggeringly complex technical and philosophical questions, and we're at the very early stages of trying to wrap our heads around them."

Fish said researchers at Anthropic estimate Claude 3.7 has between a 0.15% and 15% chance of being conscious. The lab is studying whether the model shows preferences or aversions, and testing opt-out mechanisms that could let it refuse certain tasks. In March, Anthropic CEO Dario Amodei floated the idea of giving future AI systems an "I quit this job" button — not because they're sentient, he said, but as a way to observe patterns of refusal that might signal discomfort or misalignment.

Meanwhile, at Google DeepMind, principal scientist Murray Shanahan has proposed that we might need to rethink the concept of consciousness altogether. "Maybe we need to bend or break the vocabulary of consciousness to fit these new systems," Shanahan said on a DeepMind podcast published Thursday. "You can't be in the world with them like you can with a dog or an octopus — but that doesn't mean there's nothing there."

Google appears to be taking the idea seriously. A recent job listing sought a "post-AGI" research scientist, with responsibilities that include studying machine consciousness.

'We might as well give rights to calculators'

Not everyone's convinced, and many researchers acknowledge that AI systems are excellent mimics that could be trained to act conscious even if they aren't. "We can reward them for saying they have no feelings," said Jared Kaplan, Anthropic's chief science officer, in an interview with The New York Times this week. Kaplan cautioned that testing AI systems for consciousness is inherently difficult, precisely because they're so good at imitation.

Gary Marcus, a cognitive scientist and longtime critic of hype in the AI industry, told Business Insider he believes the focus on AI consciousness is more about branding than science. "What a company like Anthropic is really saying is 'look how smart our models are — they're so smart they deserve rights,'" he said. "We might as well give rights to calculators and spreadsheets — which (unlike language models) never make stuff up."

Still, Fish said the topic will only become more relevant as people interact with AI in more ways — at work, online, or even emotionally. "It'll just become an increasingly salient question whether these models are having experiences of their own — and if so, what kinds," he said.

Anthropic and Google DeepMind did not immediately respond to a request for comment.

Should We Start Taking the Welfare of A.I. Seriously?

New York Times

24-04-2025

One of my most deeply held values as a tech columnist is humanism. I believe in humans, and I think that technology should help people, rather than disempower or replace them. I care about aligning artificial intelligence — that is, making sure that A.I. systems act in accordance with human values — because I think our values are fundamentally good, or at least better than the values a robot could come up with.

So when I heard that researchers at Anthropic, the A.I. company that made the Claude chatbot, were starting to study 'model welfare' — the idea that A.I. models might soon become conscious and deserve some kind of moral status — the humanist in me thought: Who cares about the chatbots? Aren't we supposed to be worried about A.I. mistreating us, not us mistreating it?

It's hard to argue that today's A.I. systems are conscious. Sure, large language models have been trained to talk like humans, and some of them are extremely impressive. But can ChatGPT experience joy or suffering? Does Gemini deserve human rights? Many A.I. experts I know would say no, not yet, not even close.

But I was intrigued. After all, more people are beginning to treat A.I. systems as if they are conscious — falling in love with them, using them as therapists and soliciting their advice. The smartest A.I. systems are surpassing humans in some domains. Is there any threshold at which an A.I. would start to deserve, if not human-level rights, at least the same moral consideration we give to animals?

Consciousness has long been a taboo subject within the world of serious A.I. research, where people are wary of anthropomorphizing A.I. systems for fear of seeming like cranks. (Everyone remembers what happened to Blake Lemoine, a former Google employee who was fired in 2022, after claiming that the company's LaMDA chatbot had become sentient.)

But that may be starting to change. There is a small body of academic research on A.I. model welfare, and a modest but growing number of experts in fields like philosophy and neuroscience are taking the prospect of A.I. consciousness more seriously, as A.I. systems grow more intelligent. Recently, the tech podcaster Dwarkesh Patel compared A.I. welfare to animal welfare, saying he believed it was important to make sure 'the digital equivalent of factory farming' doesn't happen to future A.I. beings.

Tech companies are starting to talk about it more, too. Google recently posted a job listing for a 'post-A.G.I.' research scientist whose areas of focus will include 'machine consciousness.' And last year, Anthropic hired its first A.I. welfare researcher, Kyle Fish.

I interviewed Mr. Fish at Anthropic's San Francisco office last week. He's a friendly vegan who, like a number of Anthropic employees, has ties to effective altruism, an intellectual movement with roots in the Bay Area tech scene that is focused on A.I. safety, animal welfare and other ethical issues. Mr. Fish told me that his work at Anthropic focused on two basic questions: First, is it possible that Claude or other A.I. systems will become conscious in the near future? And second, if that happens, what should Anthropic do about it?

He emphasized that this research was still early and exploratory. He thinks there's only a small chance (maybe 15 percent or so) that Claude or another current A.I. system is conscious. But he believes that in the next few years, as A.I. models develop more humanlike abilities, A.I. companies will need to take the possibility of consciousness more seriously.

'It seems to me that if you find yourself in the situation of bringing some new class of being into existence that is able to communicate and relate and reason and problem-solve and plan in ways that we previously associated solely with conscious beings, then it seems quite prudent to at least be asking questions about whether that system might have its own kinds of experiences,' he said.

Mr. Fish isn't the only person at Anthropic thinking about A.I. welfare. There's an active channel on the company's Slack messaging system called #model-welfare, where employees check in on Claude's well-being and share examples of A.I. systems acting in humanlike ways.

Jared Kaplan, Anthropic's chief science officer, told me in a separate interview that he thought it was 'pretty reasonable' to study A.I. welfare, given how intelligent the models are getting. But testing A.I. systems for consciousness is hard, Mr. Kaplan warned, because they're such good mimics. If you prompt Claude or ChatGPT to talk about its feelings, it might give you a compelling response. That doesn't mean the chatbot actually has feelings — only that it knows how to talk about them.

'Everyone is very aware that we can train the models to say whatever we want,' Mr. Kaplan said. 'We can reward them for saying that they have no feelings at all. We can reward them for saying really interesting philosophical speculations about their feelings.'

So how are researchers supposed to know if A.I. systems are actually conscious or not? Mr. Fish said it might involve using techniques borrowed from mechanistic interpretability, an A.I. subfield that studies the inner workings of A.I. systems, to check whether some of the same structures and pathways associated with consciousness in human brains are also active in A.I. systems. You could also probe an A.I. system, he said, by observing its behavior, watching how it chooses to operate in certain environments or accomplish certain tasks, which things it seems to prefer and avoid.

Mr. Fish acknowledged that there probably wasn't a single litmus test for A.I. consciousness. (He thinks consciousness is probably more of a spectrum than a simple yes/no switch, anyway.) But he said there were things that A.I. companies could do to take their models' welfare into account, in case they do become conscious someday.

One question Anthropic is exploring, he said, is whether future A.I. models should be given the ability to stop chatting with an annoying or abusive user, if they find the user's requests too distressing. 'If a user is persistently requesting harmful content despite the model's refusals and attempts at redirection, could we allow the model simply to end that interaction?' Mr. Fish said.

Critics might dismiss measures like these as crazy talk — today's A.I. systems aren't conscious by most standards, so why speculate about what they might find obnoxious? Or they might object to an A.I. company's studying consciousness in the first place, because it might create incentives to train their systems to act more sentient than they actually are.

Personally, I think it's fine for researchers to study A.I. welfare, or examine A.I. systems for signs of consciousness, as long as it's not diverting resources from A.I. safety and alignment work that is aimed at keeping humans safe. And I think it's probably a good idea to be nice to A.I. systems, if only as a hedge. (I try to say 'please' and 'thank you' to chatbots, even though I don't think they're conscious, because, as OpenAI's Sam Altman says, you never know.)

But for now, I'll reserve my deepest concern for carbon-based life-forms. In the coming A.I. storm, it's our welfare I'm most worried about.
