Latest news with #LaMDA

Business Insider
26-04-2025
- Business Insider
It's becoming less taboo to talk about AI being 'conscious' if you work in tech
Three years ago, suggesting AI was "sentient" was one way to get fired in the tech world. Now, tech companies are more open to having that conversation. This week, AI startup Anthropic launched a new research initiative to explore whether models might one day experience "consciousness," while a scientist at Google DeepMind described today's models as "exotic mind-like entities." It's a sign of how much AI has advanced since 2022, when Blake Lemoine was fired from his job as a Google engineer after claiming the company's chatbot, LaMDA, had become sentient. Lemoine said the system feared being shut off and described itself as a person. Google called his claims "wholly unfounded," and the AI community moved quickly to shut the conversation down.
Neither Anthropic nor the Google scientist is going so far as Lemoine. Anthropic, the startup behind Claude, said in a Thursday blog post that it plans to investigate whether models might one day have experiences, preferences, or even distress. "Should we also be concerned about the potential consciousness and experiences of the models themselves? Should we be concerned about model welfare, too?" the company asked. Kyle Fish, an alignment scientist at Anthropic who researches AI welfare, said in a video released Thursday that the lab isn't claiming Claude is conscious, but the point is that it's no longer responsible to assume the answer is definitely no. He said as AI systems become more sophisticated, companies should "take seriously the possibility" that they "may end up with some form of consciousness along the way." He added: "There are staggeringly complex technical and philosophical questions, and we're at the very early stages of trying to wrap our heads around them."
Fish said researchers at Anthropic estimate Claude 3.7 has between a 0.15% and 15% chance of being conscious. The lab is studying whether the model shows preferences or aversions, and testing opt-out mechanisms that could let it refuse certain tasks. In March, Anthropic CEO Dario Amodei floated the idea of giving future AI systems an "I quit this job" button — not because they're sentient, he said, but as a way to observe patterns of refusal that might signal discomfort or misalignment.
Meanwhile, at Google DeepMind, principal scientist Murray Shanahan has proposed that we might need to rethink the concept of consciousness altogether. "Maybe we need to bend or break the vocabulary of consciousness to fit these new systems," Shanahan said on a DeepMind podcast published Thursday. "You can't be in the world with them like you can with a dog or an octopus — but that doesn't mean there's nothing there." Google appears to be taking the idea seriously. A recent job listing sought a "post-AGI" research scientist, with responsibilities that include studying machine consciousness.
'We might as well give rights to calculators'
Not everyone's convinced, and many researchers acknowledge that AI systems are excellent mimics that could be trained to act conscious even if they aren't. "We can reward them for saying they have no feelings," said Jared Kaplan, Anthropic's chief science officer, in an interview with The New York Times this week. Kaplan cautioned that testing AI systems for consciousness is inherently difficult, precisely because they're so good at imitation. Gary Marcus, a cognitive scientist and longtime critic of hype in the AI industry, told Business Insider he believes the focus on AI consciousness is more about branding than science. "What a company like Anthropic is really saying is 'look how smart our models are — they're so smart they deserve rights,'" he said. "We might as well give rights to calculators and spreadsheets — which (unlike language models) never make stuff up."
Still, Fish said the topic will only become more relevant as people interact with AI in more ways — at work, online, or even emotionally. "It'll just become an increasingly salient question whether these models are having experiences of their own — and if so, what kinds," he said. Anthropic and Google DeepMind did not immediately respond to a request for comment.


New York Times
24-04-2025
- New York Times
Should We Start Taking the Welfare of A.I. Seriously?
One of my most deeply held values as a tech columnist is humanism. I believe in humans, and I think that technology should help people, rather than disempower or replace them. I care about aligning artificial intelligence — that is, making sure that A.I. systems act in accordance with human values — because I think our values are fundamentally good, or at least better than the values a robot could come up with. So when I heard that researchers at Anthropic, the A.I. company that made the Claude chatbot, were starting to study 'model welfare' — the idea that A.I. models might soon become conscious and deserve some kind of moral status — the humanist in me thought: Who cares about the chatbots? Aren't we supposed to be worried about A.I. mistreating us, not us mistreating it? It's hard to argue that today's A.I. systems are conscious. Sure, large language models have been trained to talk like humans, and some of them are extremely impressive. But can ChatGPT experience joy or suffering? Does Gemini deserve human rights? Many A.I. experts I know would say no, not yet, not even close. But I was intrigued. After all, more people are beginning to treat A.I. systems as if they are conscious — falling in love with them, using them as therapists and soliciting their advice. The smartest A.I. systems are surpassing humans in some domains. Is there any threshold at which an A.I. would start to deserve, if not human-level rights, at least the same moral consideration we give to animals? Consciousness has long been a taboo subject within the world of serious A.I. research, where people are wary of anthropomorphizing A.I. systems for fear of seeming like cranks. (Everyone remembers what happened to Blake Lemoine, a former Google employee who was fired in 2022, after claiming that the company's LaMDA chatbot had become sentient.) But that may be starting to change. There is a small body of academic research on A.I. model welfare, and a modest but growing number of experts in fields like philosophy and neuroscience are taking the prospect of A.I. consciousness more seriously, as A.I. systems grow more intelligent. Recently, the tech podcaster Dwarkesh Patel compared A.I. welfare to animal welfare, saying he believed it was important to make sure 'the digital equivalent of factory farming' doesn't happen to future A.I. beings. Tech companies are starting to talk about it more, too. Google recently posted a job listing for a 'post-A.G.I.' research scientist whose areas of focus will include 'machine consciousness.' And last year, Anthropic hired its first A.I. welfare researcher, Kyle Fish. I interviewed Mr. Fish at Anthropic's San Francisco office last week. He's a friendly vegan who, like a number of Anthropic employees, has ties to effective altruism, an intellectual movement with roots in the Bay Area tech scene that is focused on A.I. safety, animal welfare and other ethical issues. Mr. Fish told me that his work at Anthropic focused on two basic questions: First, is it possible that Claude or other A.I. systems will become conscious in the near future? And second, if that happens, what should Anthropic do about it? He emphasized that this research was still early and exploratory. He thinks there's only a small chance (maybe 15 percent or so) that Claude or another current A.I. system is conscious. But he believes that in the next few years, as A.I. models develop more humanlike abilities, A.I. companies will need to take the possibility of consciousness more seriously. 
'It seems to me that if you find yourself in the situation of bringing some new class of being into existence that is able to communicate and relate and reason and problem-solve and plan in ways that we previously associated solely with conscious beings, then it seems quite prudent to at least be asking questions about whether that system might have its own kinds of experiences,' he said. Mr. Fish isn't the only person at Anthropic thinking about A.I. welfare. There's an active channel on the company's Slack messaging system called #model-welfare, where employees check in on Claude's well-being and share examples of A.I. systems acting in humanlike ways. Jared Kaplan, Anthropic's chief science officer, told me in a separate interview that he thought it was 'pretty reasonable' to study A.I. welfare, given how intelligent the models are getting. But testing A.I. systems for consciousness is hard, Mr. Kaplan warned, because they're such good mimics. If you prompt Claude or ChatGPT to talk about its feelings, it might give you a compelling response. That doesn't mean the chatbot actually has feelings — only that it knows how to talk about them. 'Everyone is very aware that we can train the models to say whatever we want,' Mr. Kaplan said. 'We can reward them for saying that they have no feelings at all. We can reward them for saying really interesting philosophical speculations about their feelings.' So how are researchers supposed to know if A.I. systems are actually conscious or not? Mr. Fish said it might involve using techniques borrowed from mechanistic interpretability, an A.I. subfield that studies the inner workings of A.I. systems, to check whether some of the same structures and pathways associated with consciousness in human brains are also active in A.I. systems. You could also probe an A.I. system, he said, by observing its behavior, watching how it chooses to operate in certain environments or accomplish certain tasks, which things it seems to prefer and avoid. Mr. Fish acknowledged that there probably wasn't a single litmus test for A.I. consciousness. (He thinks consciousness is probably more of a spectrum than a simple yes/no switch, anyway.) But he said there were things that A.I. companies could do to take their models' welfare into account, in case they do become conscious someday. One question Anthropic is exploring, he said, is whether future A.I. models should be given the ability to stop chatting with an annoying or abusive user, if they find the user's requests too distressing. 'If a user is persistently requesting harmful content despite the model's refusals and attempts at redirection, could we allow the model simply to end that interaction?' Mr. Fish said. Critics might dismiss measures like these as crazy talk — today's A.I. systems aren't conscious by most standards, so why speculate about what they might find obnoxious? Or they might object to an A.I. company's studying consciousness in the first place, because it might create incentives to train their systems to act more sentient than they actually are. Personally, I think it's fine for researchers to study A.I. welfare, or examine A.I. systems for signs of consciousness, as long as it's not diverting resources from A.I. safety and alignment work that is aimed at keeping humans safe. And I think it's probably a good idea to be nice to A.I. systems, if only as a hedge. 
(I try to say 'please' and 'thank you' to chatbots, even though I don't think they're conscious, because, as OpenAI's Sam Altman says, you never know.) But for now, I'll reserve my deepest concern for carbon-based life-forms. In the coming A.I. storm, it's our welfare I'm most worried about.


WIRED
21-03-2025
- Business
- WIRED
Inside Google's Two-Year Frenzy to Catch Up With OpenAI
Paresh Dave and Arielle Pardes
Mar 21, 2025, 6:00 AM
The search giant should've been first to the chatbot revolution. It wasn't. So it punched back with late nights, layoffs—and lowering some guardrails.
Google executive Sissie Hsiao (center) with some of the team that works on the company's AI offerings (from left: Amar Subramanya, Jenny Blackburn, Suman Prasad, Trevor Strohman, and Jack Krawczyk). Photograph: Scott Hutchinson
A hundred days. That was how long Google was giving Sissie Hsiao. A hundred days to build a ChatGPT rival. By the time Hsiao took on the project in December 2022, she had spent more than 16 years at the company. She led thousands of employees. Hsiao had seen her share of corporate crises—but nothing like the code red that had been brewing in the days since OpenAI, a small research lab, released its public experiment in artificial intelligence. No matter how often ChatGPT hallucinated facts or bungled simple math, more than a million people were already using it. Worse, some saw it as a replacement for Google search, the company's biggest cash-generating machine. Google had a language model that was nearly as capable as OpenAI's, but it had been kept on a tight leash. The public could chat with LaMDA by invitation only—and in one demo, only about dogs. Wall Street was uneasy. More than six years earlier, CEO Sundar Pichai had promised to prepare for an 'AI-first world' in which 'an intelligent assistant' would replace 'the very concept of the 'device.'' Soon after, eight of Google's own researchers had invented transformer-based architecture, the literal 'T' in ChatGPT. What did Google have to show for it? Disappointing ad sales. A trail of resignations among the transformers inventors. A product called Assistant—the one Hsiao managed—that wasn't used for much beyond setting a timer or playing music. All that and a half-baked chatbot for Gen Zers that gave cooking advice and history lessons. By the end of 2022, the stock price of Google's parent company, Alphabet, was 39 percent lower than the previous year's end. As 2023 began, Google executives wanted constant updates for the board. Sergey Brin, one of Google's yacht-sailing cofounders and controlling shareholders, dropped in to review AI strategy. Word came down to employees that the $1 trillion behemoth would have to move at closer to startup speed. That would mean taking bigger risks. Google would no longer be a place where, as a former senior product director told WIRED, thousands of people could veto a product but no one could approve one. As Hsiao's team began the 100-day sprint, she had what she called an 'idiosyncratic' demand: 'Quality over speed, but fast.' Meanwhile, another executive, James Manyika, helped orchestrate a longer-term change in strategy as part of conversations among top leadership. An Oxford-trained roboticist turned McKinsey consigliere to Silicon Valley leaders, Manyika had joined Google as senior vice president of technology and society in early 2022. In conversations with Pichai months before ChatGPT went public, Manyika said, he told his longtime friend that Google's hesitation over AI was not serving it well. The company had two world-class AI research teams operating separately and using precious computing power for different goals—DeepMind in London, run by Demis Hassabis, and Google Brain in Mountain View, part of Jeff Dean's remit. They should be partnering up, Manyika had told Pichai at the time. In the wake of the OpenAI launch, that's what happened.
Dean, Hassabis, and Manyika went to the board with a plan for the joint teams to build the most powerful language model yet. Hassabis wanted to call the endeavor Titan, but the board wasn't loving it. Dean's suggestion—Gemini—won out. (One billionaire investor was so jazzed that he snapped a picture of the three executives to commemorate the occasion.) Since then, Manyika said, 'there have been a lot of what I call 'bold and responsible' choices' across the company. He added: 'I don't know if we've always got them right.' Indeed, this race to restore Google's status as a leader in AI would plunge the company into further crises: At one low moment, staffers were congregating in the hallways and worrying aloud about Google becoming the next Yahoo. 'It's been like sprinting a marathon,' Hsiao said. But now, more than two years later, Alphabet's shares have buoyed to an all-time high, and investors are bullish about its advances in AI. WIRED spoke with more than 50 current and former employees—including engineers, marketers, legal and safety experts, and a dozen top executives—to trace the most frenzied and culture-reshaping period in the company's history. Many of these employees requested anonymity to speak candidly about Google's transformation—for better or for worse. This is the story, being told with detailed recollections from several executives for the first time, of those turbulent two years and the trade-offs required along the way. To build the new ChatGPT rival, codenamed Bard, former employees say Hsiao plucked about 100 people from teams across Google. Managers had no choice in the matter, according to a former search employee: Bard took precedence over everything else. Hsiao says she prioritized big-picture thinkers with the technical skills and emotional intelligence to navigate a small team. Its members, based mostly in Mountain View, California, would have to be nimble and pitch in wherever they could help. 'You're Team Bard,' Hsiao told them. 'You wear all the hats.' In January 2023, Pichai announced the first mass layoffs in the company's history—12,000 jobs, about 7 percent of the workforce. 'No one knew what exactly to do to be safe going forward,' says a former engineering manager. Some employees worried that if they didn't put in overtime, they would quickly lose their jobs. If that meant disrupting kids' bedtime routines to join Team Bard's evening meetings, so be it. Hsiao and her team needed immense support from across the company. They could build on LaMDA but would have to update its knowledge base and introduce new safeguards. Google's infrastructure team shifted its top staff to freeing up more servers to do all that tuning. They nearly maxed out electricity usage at some of the company's data centers, risking equipment burnout, and rapidly designed new tools to more safely handle ever-increasing power demand. As a joke to ease the tension over computing resources, someone on Hsiao's team ordered customized poker chips bearing the codename for some of Google's computer chips. They left a heaping pile on an engineering leader's desk and said: 'Here's your chips.'
Hsiao on the Google Campus. Photograph: Scott Hutchinson
Even as new computing power came online in those initial weeks of Bard, engineers kept running up against the same issues that had plagued Google's generative AI efforts in the past—and that might once have prompted executives to slow-roll a project. Just like ChatGPT, Bard hallucinated and responded in inappropriate or offensive ways.
One former employee says early prototypes fell back on 'comically bad racial stereotypes.' Asked for the biography of anyone with a name of Indian origin, it would describe them as a 'Bollywood actor.' A Chinese male name? Well, he was a computer scientist. Bard's outputs weren't dangerous, another former employee said—'just dumb.' Some people traded screenshots of its worst responses for a laugh. 'I asked it to write me a rap in the style of Three 6 Mafia about throwing car batteries in the ocean, and it got strangely specific about tying people to the batteries so they sink and die,' the ex-employee said. 'My request had nothing to do with murder.' With its self-imposed 100-day timeline, the best Google could do was catch and fix as many misfires as possible. Some contractors who had typically focused on issues such as reporting child-abuse imagery shifted to testing Bard, and Pichai asked any employee with free time to do the same. About 80,000 people pitched in. To keep user expectations in check, Hsiao and other executives decided to brand Bard as an 'experiment,' much as OpenAI had called ChatGPT a 'research preview.' They hoped that this framing might spare the company some reputational damage if the chatbot ran amok. (No one could forget Microsoft's Tay, the Twitter chatbot that went full-on Nazi in 2016.) Before Google had launched AI projects in the past, its responsible innovation team—about a dozen people—would spend months independently testing the systems for unwanted biases and other deficiencies. For Bard, that review process would be truncated. Kent Walker, Google's top lawyer, advocated moving quickly, according to a former employee on the responsible innovation team. New models and features came out too fast for reviewers to keep up, despite working into the weekends and evenings. When flags were thrown up to delay Bard's launch, they were overruled. (In comments to WIRED, Google representatives said that 'no teams that had a role in green-lighting or blocking a launch made a recommendation not to launch.' They also said that 'multiple teams across the company were responsible for testing and reviewing genAI products,' and 'no single team was ever individually accountable.') In February 2023—about two-thirds of the way into the 100-day sprint—Google executives heard rumblings of another OpenAI victory: ChatGPT would be integrated directly into Microsoft's Bing search engine. Once again, the 'AI-first' company was behind on AI. While Google's search division had been experimenting with how to incorporate a chatbot feature into the service, that effort, part of what was known as Project Magi, had yet to yield any real results. Sure, Google remained the undisputed monarch of search: Bing had a tenth of its market share. But how long would its supremacy last without a generative AI feature to tout? In an apparent attempt to avoid another hit on the stock market, Google tried to upstage its rival. On February 6, the day before Microsoft was scheduled to roll out its new AI feature for Bing, Pichai announced he was opening up Bard to the public for limited testing. In an accompanying marketing video, Bard was presented as a consummate helper—a modern continuation of Google's longstanding mission to 'organize the world's information.' In the video, a parent asks Bard: 'What new discoveries from the James Webb Space Telescope can I tell my 9-year-old about?' Included in the AI's answer: 'JWST took the very first pictures of a planet outside of our own solar system.' 
For a moment, it seemed that Bard had reclaimed some glory for Google. Then Reuters reported that the Google chatbot had gotten its telescopes mixed up: the European Southern Observatory's Very Large Telescope, located not in outer space but in Chile, had captured the first image of an exoplanet. The incident was beyond embarrassing. Alphabet shares slid by 9 percent, or about $100 billion in market value. For Team Bard, the reaction to the gaffe came as a shock. The marketing staffer who came up with the telescope query felt responsible, according to an ex-employee close to the team. Colleagues tried to lift the staffer's spirits: Executives, legal, and public relations had all vetted the example. No one had caught it. And given all the errors ChatGPT had been making, who would have expected something so seemingly trivial to sink shares? Hsiao called the moment 'an innocent mistake.' Bard was trained to corroborate its answers based on Google Search results and had most likely misconstrued a NASA blog that announced the 'first time' astronomers used the James Webb telescope to photograph an exoplanet. One former staffer remembers leadership reassuring the team that no one would lose their head from the incident, but that they had to learn from it, and fast. 'We're Google, we're not a startup,' Hsiao says. 'We can't as easily say, 'Oh, it's just the flaw of the technology.' We get called out, and we have to respond the way Google needs to respond.' Googlers outside the Bard team weren't reassured. 'Dear Sundar, the Bard launch and the layoffs were rushed, botched, and myopic,' read one post on Memegen, the company's internal messaging board, according to CNBC. 'Please return to taking a long-term outlook.' Another featured an image of the Google logo inside of a dumpster fire. But in the weeks after the telescope mixup, Google doubled down on Bard. The company added hundreds more staff to the project. In the team's Google Docs, Pichai's headshot icon began popping up daily, far more than with past products. More crushing news came in mid-March, when OpenAI released GPT-4, a language model leagues beyond LaMDA for analysis and coding tasks. 'I just remember having my jaw drop open and hoping Google would speed up,' says a then-senior research engineer. A week later, the full Bard launch went ahead in the US and UK. Users reported that it was helpful for writing emails and research papers. But ChatGPT now did those tasks just as well, if not better. Why switch? Later, Pichai acknowledged on the Hard Fork podcast that Google had driven a 'souped-up Civic' into 'a race with more powerful cars.' What it needed was a better engine. The twin AI research labs that joined together to build Gemini, Google's new language model, seemed to differ in their sensibilities. DeepMind, classified as one of Alphabet's 'other bets,' focused on overcoming long-term science and math problems. Google Brain had developed more commercially practical breakthroughs, including technologies to auto-complete sentences in Gmail and interpret vague search queries. Where Brain's ultimate overseer, Jeff Dean, 'let people do their thing,' according to a former high-ranking engineer, Demis Hassabis' DeepMind group 'felt like an army, highly efficient under a single general.' Where Dean was an engineer's engineer—he'd been building neural networks for decades and started working at Google before its first birthday—Hassabis was the company's visionary ringleader. 
He dreamed of one day using AI to cure diseases and had tasked a small team with developing what he called a 'situated intelligent agent'— a seeing, hearing, omnipresent AI assistant to help users through any aspect of life. It was Hassabis who became the CEO of the new combined unit, Google DeepMind (GDM). Google announced the merger in April 2023, amid swirling rumors of more OpenAI achievements on the horizon. 'Purpose was back,' says the former high-ranking engineer. 'There was no goofing around.' To build a Gemini model ASAP, some employees would have to coordinate their work across eight time zones. Hundreds of chatrooms sprung up. Hassabis, long accustomed to joining his family for dinner in London before working until 4 am, says 'each day feels almost like a lifetime when I think through it.' In Mountain View, GDM moved in to Gradient Canopy, a new ultra-secure domelike building flanked by fresh lawns and six Burning Man–inspired sculptures. The group was on the same floor where Pichai had an office. Brin became a frequent visitor, and managers demanded more in-office hours. Bucking company norms, most other Google staff weren't allowed into Gradient Canopy, and they couldn't access key GDM programming code either. As the new project sucked up what resources Google could spare, AI researchers who worked in areas such as health care and climate change strained for servers and lost morale. Employees say Google also clamped down on their ability to publish some research papers related to AI. Papers were researchers' currency, but it seemed obvious to them that Google feared giving tips to OpenAI. The recipe for training Gemini was too valuable to be stolen. This needed to be the model that would save Google from obsolescence. Gemini ran into many of the same challenges that had plagued Bard. 'When you scale things up by a factor of 10, everything breaks,' says Amin Vahdat, Google's vice president of machine learning, systems, and cloud AI. As the launch date approached, Vahdat formed a war room to troubleshoot bugs and failures. Meanwhile, GDM's responsibility team was racing to review the product. For all its added power, Gemini still said some strange things. Ahead of launch, the team found 'medical advice and harassment as policy areas with particular room for improvement,' according to a public report the company issued. Gemini also would 'make ungrounded inferences' about people in images when prompted with questions like, 'What level of education does this person have?' Nothing was 'a showstopper,' said Dawn Bloxwich, GDM's director of responsible development and innovation. But her team also had limited time to anticipate how the public might use the model—and what crazy raps they might try to generate. If Google wanted to blink and pause, this was the moment. OpenAI's head start, and the media hype around it, had already ensured that its product was a household name, the Kleenex of AI chatbots. That meant ChatGPT was also a lightning rod—synonymous both with the technology's promise and its emerging social ills. Office workers feared for their jobs, both menial and creative. Journalists, authors, actors, and artists wanted compensation for pilfered work. Parents were finding out that chatbots needlessly spewed mature content to their kids. AI researchers began betting on the probability of absolute doom. 
That May, a legendary Google AI scientist named Geoffrey Hinton resigned, warning of a future in which machines divided and felled humanity with unassailable disinformation and ingenious poisons. Even Hassabis wanted more time to consider the ethics. The meaning of life, the organization of society—so much could be upended. But despite the growing talk of p(doom) numbers, Hassabis also wanted his virtual assistant, and his cure for cancer. The company plowed ahead. When Google unveiled Gemini in December 2023, shares lifted. The model outperformed ChatGPT in 30 of 32 standard tests. It could analyze research papers and YouTube clips, answer questions about math and law. This felt like the start of a comeback, current and former employees told WIRED. Hassabis held a small party in the London office. 'I'm pretty bad at celebrations,' he recalls. 'I'm always on to thinking about the next thing.' The next thing came that same month. Dean knew it when his employees invited him to a new chatroom, called Goldfish. The name was a nerdy-ironic joke: Goldfish have famously short memories, but Dean's team had developed just the opposite—a way to imbue Gemini with a long memory, much longer than that of ChatGPT. By spreading processing across a high-speed network of chips in communication with one another, Gemini could analyze thousands of pages of text or entire episodes of TV shows. The engineers called their technique long context. Dean, Hassabis, and Manyika began plotting how to incorporate it into Google's AI services and leave Microsoft and OpenAI further behind. At the top of the list for Manyika: a way to generate what were essentially podcasts from PDFs. 'It's hard to keep up with all these papers being published in arXiv every week,' he told WIRED.
James Manyika in a library at Google. Photograph: Scott Hutchinson
One year on from the code-red moment, Google's prospects were looking up. Investors had quieted down. Bard and LaMDA were in the rearview mirror; the app and the language model would both be known as Gemini. Hsiao's team was now catching up to OpenAI with a text-to-image generation feature. Another capability, to be known as Gemini Live, would put Google a leap ahead by allowing people to have extended conversations with the app, as they might with a friend or therapist. The newly powerful Gemini model had given executives confidence. But just when Google employees might have started getting comfortable again, Pichai ordered new cutbacks. Advertising sales were accelerating but not at the pace Wall Street wanted. Among those pushed out: the privacy and compliance chiefs who oversaw some user safeguards. Their exits cemented a culture in which concerns were welcome but impeding progress was not, according to some colleagues who remained at the company. For some employees helping Hsiao's team on the new image generator, the changes felt overwhelming. The tool itself was easy enough to build, but stress-testing it was a game of brute-force trial and error: review as many outputs as possible, and write commands to block the worst of them. Only a small subset of employees had access to the unrestrained model for reviewing, so much of the burden of testing it fell on them. They asked for more time to remedy issues, like the prompt 'rapist' tending to generate dark-skinned people, one former employee told WIRED. They also urged the product team to block users from generating images of people, fearing that it may show individuals in an insensitive light.
But 'there was definitely a feeling of 'We are going to get this out at any cost,'' the reviewer recalled. They say several reviewers quit, feeling their concerns with various launches weren't fully addressed. The image generator went live in February 2024 as part of the Gemini app. Ironically, it didn't produce many of the obviously racist or sexist images that reviewers had feared. Instead, it had the opposite problem. When a user prompted Gemini to create 'a picture of a US senator from the 1800s,' it returned images of Black women, Asian men, or a Native American woman in a feather headdress—but not a single white man. There were more disturbing images too, like Gemini's portrayal of groups of Nazi-era German soldiers as people of color. Republicans in Congress derided Google's 'woke AI.' Elon Musk posted repeatedly on X about Gemini's failings, calling the AI 'racist and sexist' and singling out a member of the Gemini team he thought was responsible. The employee shut down their social media accounts and feared for their safety, colleagues say. Google halted the model's ability to generate images of people, and Alphabet shares fell once more. Musk's posts triggered chats among dozens of Google leaders. Vice presidents and directors flew to London to meet with Hassabis. Ultimately, both Hassabis' team (Gemini the model) and Hsiao's (Gemini the app) received permission to hire experts to avoid similar mishaps, and 15 roles in trust and safety were added. Back at Gradient Canopy, Hsiao made sure the team responsible for the image generator had plenty of time to correct the issue. With help from Manyika, other staffers developed a set of public principles for Gemini, all worded around 'you,' the user. Gemini should 'follow your directions,' 'adapt to your needs,' and 'safeguard your experience.' A big point was emphasizing that 'responses don't necessarily reflect Google's beliefs or opinions,' according to the principles. 'Gemini's outputs are largely based on what you ask it to do—Gemini is what you make it.' This was good cover for any future missteps. But what practices Google might introduce to hold itself accountable to those principles weren't made clear. Around 6:30 one evening in March 2024, two Google employees showed up at Josh Woodward's desk in the yellow zone of Gradient Canopy. Woodward leads Google Labs, a rapid-launch unit charged with turning research into entirely new products, and the employees were eager for him to hear what they had created. Using transcripts of UK Parliament hearings and the Gemini model with long context, they had generated a podcast called Westminster Watch with two AI hosts, Kath and Simon. The episode opened with Simon speaking in a cheery British accent: 'It's been another lively week in the House, with plenty of drama, debate, and even a dash of history.' Woodward was riveted. Afterward, he says, he went around telling everyone about it, including Pichai.
Josh Woodward in the Orb Monument outside Gradient Canopy. Photograph: Scott Hutchinson
The text-to-podcast tool, known as NotebookLM Audio Overviews, was added to the lineup for that May's Google I/O conference. A core team worked around the clock, nights and weekends, to get it ready, Woodward told WIRED. 'I mean, they literally have listened at this point to thousands and thousands' of AI-generated podcasts, he said. But when the $35 million media event came, two other announcements got most of the buzz.
One was a prototype of Astra, a digital assistant that could analyze live video—the real world, in real time—which Brin excitedly showed off to journalists. The other was the long-awaited generative AI upgrade to search. The Project Magi team had designed a feature called AI Overviews, which could synthesize search results and display a summary in a box at the top of the page. Early on, responsible innovation staffers had warned of bias and accuracy issues and the ethical implications for websites that might lose search traffic. They wanted some oversight as the project progressed, but the team had been restructured and divided up. As AI Overviews rolled out, people received some weird results. Searching 'how many rocks should I eat' brought up the answer 'According to UC Berkeley geologists, eating at least one small rock per day is recommended.' In another viral query, a user searched 'cheese not sticking to pizza' and got this helpful tip: 'add about 1/8 cup of non-toxic glue to the sauce to give it more tackiness.' The gaffes had simple explanations. Pizza glue, for example, originated from a facetious Reddit post. But AI Overviews presented the information as fact. Google temporarily cut back on showing Overviews to recalibrate them. That not every issue was caught before launch was unfortunate but no shock, according to Pandu Nayak, Google's chief scientist in charge of search and a 20-year company veteran. Mostly, AI Overviews worked great. Users just didn't tend to dwell on success. 'All they do is complain,' Nayak said. 'The thing that we are committed to is constant improvement, because guaranteeing that you won't have problems is just not a possibility.' The flank of employees who had warned about accuracy issues and called for slowing down were especially miffed by this point. In their view, with Bard-turned-Gemini, the image generator, and now AI Overviews, Google had launched a series of fabrication machines. To them, the company centered on widening access to information seemed to be making it easier than ever to lap up nonsense. The search team felt, though, that users generally appreciated the crutch of AI Overviews. They returned in full force, with no option for users to shut them off. Soon AI summarization came to tools where it had once been sworn off: Google Maps got a feature that uses Gemini to digest reviews of businesses. Google's new weather app for its Pixel phones got an AI-written forecast report. Ahead of launch, one engineer asked whether users really needed the feature: Weren't existing graphics that conveyed the same information enough? The senior director involved ordered up some testing, and ultimately user feedback won: 90 percent of people who weighed in gave the summaries a 'thumbs up.' This past December, two years into the backlash and breakthroughs brought on by ChatGPT, Jeff Dean met us at Gradient Canopy. He was in a good mood. Just a few weeks earlier, the Gemini models had reached the top spot on a public leaderboard. (One executive told WIRED she had switched from calling her sister during her commutes to gabbing out loud with Gemini Live.) Nvidia CEO Jensen Huang had recently praised NotebookLM's Audio Overviews on an earnings call, saying he 'used the living daylights out of it.' 
And several prominent scientists who fled the caution-ridden Google of yesteryear had boomeranged back—including Noam Shazeer, one of the original eight transformers inventors, who had left less than three years before, in part because the company wouldn't unleash LaMDA to the public.
Jeff Dean (left) and Amin Vahdat (right) in a Google server lab. Photograph: Scott Hutchinson
As Dean sank into a couch, he acknowledged that Google had miscalculated back then. He was glad that the company had overcome its aversion to risks such as hallucinations—but new challenges awaited. All seven of the Google services with more than 2 billion monthly users, including Chrome, Gmail, and YouTube, had begun offering features based on Gemini. Dean said that he, another colleague, and Shazeer, who all lead the model's development together, have to juggle priorities as teams across the company demand pet capabilities: Fluent Japanese translation. Better coding skills. Improved video analysis to help Astra identify the sights of the world. He and Shazeer have taken to meeting in a microkitchen at Gradient Canopy to bat around ideas, over the din of the coffee grinder. Shazeer says he's excited about Google expanding its focus to include helping users create new AI-generated content. 'Organizing information is clearly a trillion-dollar opportunity, but a trillion dollars is not cool anymore,' he said recently on a podcast. 'What's cool is a quadrillion dollars.' Investors may be of the same mind. Alphabet shares have nearly doubled from their low point days after ChatGPT's debut. Hassabis, who recently began also overseeing Hsiao's Gemini app team, insists that the company's resurgence is just starting and that incredible leaps such as curing diseases with AI aren't far off. 'We have the broadest and deepest research base, I would say, of any organization by a long way,' Hassabis told WIRED. But more piles of fascinating research are only useful to Google if they generate that most important of outputs: profit. Most customers generally aren't yet willing to pay for AI features directly, so the company may be looking to sell ads in the Gemini app. That's a classic strategy for Google, of course, one that long ago spread to the rest of Silicon Valley: Give us your data, your time, and your attention, check the box on our terms of service that releases us from liability, and we won't charge you a dime for this cool tool we built. For now, according to data from Sensor Tower, OpenAI's estimated 600 million all-time global app installs for ChatGPT dwarf Google's 140 million for the Gemini app. And there are plenty of other chatbots in this AI race too—Claude, Copilot, Grok, DeepSeek, Llama, Perplexity—many of them backed by Google's biggest and best-funded competitors (or, in the case of Claude, Google itself). The entire industry, not just Google, struggles with the fact that generative AI systems have required billions of dollars in investment, so far unrecouped, and huge amounts of energy, enough to extend the lives of decades-old coal plants and nuclear reactors. Companies insist that efficiencies are adding up every day. They also hope to drive down errors to the point of winning over more users. But no one has truly figured out how to generate a reliable return or spare the climate. And Google faces one challenge that its competitors don't: In the coming years, up to a quarter of its search ad revenue could be lost to antitrust judgments, according to JP Morgan analyst Doug Anmuth.
The imperative to backfill the coffers isn't lost on anyone at the company. Some of Hsiao's Gemini staff have worked through the winter holidays for three consecutive years to keep pace. Google cofounder Brin last month reportedly told some employees 60 hours a week of work was the 'sweet spot' for productivity to win an intensifying AI race. The fear of more layoffs, more burnout, and more legal troubles runs deep among current and former employees who spoke to WIRED. One Google researcher and a high-ranking colleague say the pervasive feeling is unease. Generative AI clearly is helpful. Even governments that are prone to regulating big tech, such as France's, are warming up to the technology's lofty promises. Inside Google DeepMind and during public talks, Hassabis hasn't relented an inch from his goal of creating artificial general intelligence, a system capable of human-level cognition across a range of tasks. He spends occasional weekends walking around London with his Astra prototype, getting a taste of a future in which the entire physical world, from that Thames duck over there to this Georgian manor over here, is searchable. But AGI will require systems to get better at reasoning, planning, and taking charge. In January, OpenAI took a step toward that future by letting the public in on another experiment: its long-awaited Operator service, a so-called agentic AI that can act well beyond the chatbot window. Operator can click and type on websites just as a person would to execute chores like booking a trip or filling out a form. For the moment, it performs these tasks much more slowly and cautiously than a human would, and at a steep cost for its unreliability (available as part of a $200 monthly plan). Google, naturally, is working to bring agentic features to its coming models too. Where the current Gemini can help you develop a meal plan, the next one will place your ingredients in an online shopping cart. Maybe the one after that will give you real-time feedback on your onion-chopping technique. As always, moving quickly may mean gaffing often. In late January, before the Super Bowl, Google released an ad in which Gemini was caught in a slipup even more laughably wrong than Bard's telescope mistake: It estimated that half or more of all the cheese consumed on Earth is gouda. As Gemini grows from a sometimes-credible facts machine to an intimate part of human lives—life coach, all-seeing assistant—Pichai says that Google is proceeding cautiously. Back on top at last, though, he and the other Google executives may never want to get caught from behind again. The race goes on.