
Latest news with #CenterforAISafety

'Talking to God and angels via ChatGPT.'

The Verge

05-05-2025

  • Science

Adi Robertson writes: Miles Klee at Rolling Stone reported out a widely circulated Reddit post on 'ChatGPT-induced psychosis.' Sycophancy itself has been a problem in AI for 'a long time,' says Nate Sharadin, a fellow at the Center for AI Safety ... What's likely happening with those experiencing ecstatic visions through ChatGPT and other models, he speculates, 'is that people with existing tendencies toward experiencing various psychological issues,' including what might be recognized as grandiose delusions in a clinical sense, 'now have an always-on, human-level conversational partner with whom to co-experience their delusions.'

Ex-Google CEO Eric Schmidt says an AI 'Manhattan Project' is a bad idea

Yahoo

06-03-2025

  • Business

  • Former Google CEO Eric Schmidt co-authored a paper warning the US about the dangers of an AI Manhattan Project.
  • In the paper, Schmidt, Dan Hendrycks, and Alexandr Wang push for a more defensive approach.
  • The authors suggest the US sabotage rival projects rather than advance the AI frontier alone.

Some of the biggest names in AI tech say an AI "Manhattan Project" could have a destabilizing effect on the US, rather than help safeguard it. The dire warning came from former Google CEO Eric Schmidt, Center for AI Safety director Dan Hendrycks, and Scale AI CEO Alexandr Wang, who coauthored a policy paper titled "Superintelligence Strategy," published on Wednesday.

In the paper, the tech titans urge the US to stay away from an aggressive push to develop superintelligent AI, or AGI, which the authors say could provoke international retaliation. China, in particular, "would not sit idle" while the US worked to actualize AGI and "risk a loss of control," they write.

The authors write that circumstances similar to the nuclear arms race that birthed the Manhattan Project — a secretive initiative that ended in the creation of the first atom bomb — have developed around the AI frontier. In November 2024, for example, a bipartisan congressional committee called for a "Manhattan Project-like" program dedicated to pumping funds into initiatives that could help the US beat out China in the race to AGI. And just a few days before the authors released their paper, US Secretary of Energy Chris Wright said the country is already "at the start of a new Manhattan Project."

"The Manhattan Project assumes that rivals will acquiesce to an enduring imbalance or omnicide rather than move to prevent it," the authors write. "What begins as a push for a superweapon and global control risks prompting hostile countermeasures and escalating tensions, thereby undermining the very stability the strategy purports to secure."

It's not just the government subsidizing AI advancements, either, according to Schmidt, Hendrycks, and Wang — private corporations are developing "Manhattan Projects" of their own. Demis Hassabis, CEO of Google DeepMind, has said he loses sleep over the possibility of ending up like Robert Oppenheimer. "Currently, a similar urgency is evident in the global effort to lead in AI, with investment in AI training doubling every year for nearly the past decade," the authors say. "Several 'AI Manhattan Projects' aiming to eventually build superintelligence are already underway, financed by many of the most powerful corporations in the world."

The authors argue that the US already finds itself operating under conditions similar to mutually assured destruction, the idea that no nation with nuclear weapons will use its arsenal against another for fear of retribution. They write that a further effort to control the AI space could provoke retaliation from rival global powers. Instead, the paper suggests the US could benefit from taking a more defensive approach — sabotaging "destabilizing" AI projects via methods like cyberattacks, rather than rushing to perfect its own.

To address "rival states, rogue actors, and the risk of losing control" all at once, the authors put forth a threefold strategy: deterring destabilizing AI projects through sabotage, restricting the access of "rogue actors" to chips and "weaponizable AI systems," and guaranteeing US access to AI chips via domestic manufacturing.

Overall, Schmidt, Hendrycks, and Wang push for balance rather than what they call the "move fast and break things" strategy. They argue that the US has an opportunity to take a step back from the urgent rush of the arms race and shift to a more defensive strategy. "By methodically constraining the most destabilizing moves, states can guide AI toward unprecedented benefits rather than risk it becoming a catalyst of ruin," the authors write.

Read the original article on Business Insider.

Former Google CEO Eric Schmidt sounds the alarm over a 'Manhattan Project' for superintelligent AI

Yahoo

06-03-2025

  • Business

Eric Schmidt, Scale AI CEO Alexandr Wang, and Center for AI Safety Director Dan Hendrycks are warning that treating the global AI arms race like the Manhattan Project could backfire. Instead of reckless acceleration, they propose a strategy of deterrence, transparency, and international cooperation—before superhuman AI spirals out of control.

Former Google CEO Eric Schmidt, Scale AI CEO Alexandr Wang, and Center for AI Safety Director Dan Hendrycks are sounding the alarm about the global race to build superintelligent AI. In a new paper titled Superintelligence Strategy, Schmidt and his co-authors argue that the U.S. should not pursue the development of artificial general intelligence (AGI) through a government-backed, Manhattan Project-style push. The fear is that a high-stakes race to build superintelligent AI could lead to dangerous global conflicts between the superpowers, much like the nuclear arms race.

"The Manhattan Project assumes that rivals will acquiesce to an enduring imbalance or omnicide rather than move to prevent it," the co-authors wrote. "What begins as a push for a superweapon and global control risks prompting hostile countermeasures and escalating tensions, thereby undermining the very stability the strategy purports to secure."

The paper comes as U.S. policymakers consider a large-scale, state-funded AI project to compete with China's AI efforts. Last year, a U.S. congressional commission proposed a "Manhattan Project-style" effort to fund the development of AI systems with superhuman intelligence, modeled after America's atomic bomb program in the 1940s. Since then, the Trump administration has announced a $500 billion investment in AI infrastructure, called the "Stargate Project," and rolled back AI regulations brought in by the previous administration. Earlier this month, U.S. Secretary of Energy Chris Wright also appeared to promote the idea by saying the country was "at the start of a new Manhattan Project" and that, with President Trump's leadership, "the United States will win the global AI race."

The authors argue that AI development should be handled with extreme caution, not as a race to out-compete global rivals. The paper lays out the risks of approaching AI development as an all-or-nothing battle for dominance. Schmidt and his co-authors argue that instead of a high-stakes race, AI should be developed through broadly distributed research, with collaboration across governments, private companies, and academia. They emphasize that transparency and international cooperation are critical to ensuring that AI benefits humanity rather than becoming an uncontrollable force.

Schmidt has addressed the threats posed by a global AI race before. In a January Washington Post op-ed, he called for the U.S. to invest in open-source AI efforts to combat China's DeepSeek.

The authors suggest a new concept—Mutual Assured AI Malfunction (MAIM)—modeled on the nuclear arms race's Mutually Assured Destruction (MAD). "Just as nations once developed nuclear strategies to secure their survival, we now need a coherent superintelligence strategy to navigate a new period of transformative change," the authors wrote. "We introduce the concept of Mutual Assured AI Malfunction (MAIM): a deterrence regime resembling nuclear mutual assured destruction (MAD) where any state's aggressive bid for unilateral AI dominance is met with preventive sabotage by rivals," they said.

The paper also suggests countries engage in nonproliferation and deterrence, much like they do with nuclear weapons. "Taken together, the three-part framework of deterrence, nonproliferation, and competitiveness outlines a robust strategy to superintelligence in the years ahead," they said.

Elon Musk's xAI Is Exploring a Way to Make AI More Like Donald Trump

WIRED

11-02-2025

  • Business

A researcher affiliated with Elon Musk's xAI startup is exploring a method to alter the politics of Grok and better align the chatbot with conservatives.

A researcher affiliated with Elon Musk's startup xAI has found a new way to both measure and manipulate entrenched preferences and values expressed by artificial intelligence models—including their political views. The work was led by Dan Hendrycks, director of the nonprofit Center for AI Safety and an adviser to xAI. He suggests that the technique could be used to make popular AI models better reflect the will of the electorate. 'Maybe in the future, [a model] could be aligned to the specific user,' Hendrycks told WIRED. But in the meantime, he says, a good default would be using election results to steer the views of AI models. He's not saying a model should necessarily be 'Trump all the way,' but he argues it should be biased toward Trump slightly, 'because he won the popular vote.' xAI issued a new AI risk framework on February 10 stating that Hendrycks' utility engineering approach could be used to assess Grok.

Hendrycks led a team from the Center for AI Safety, UC Berkeley, and the University of Pennsylvania that analyzed AI models using a technique borrowed from economics to measure consumers' preferences for different goods. By testing models across a wide range of hypothetical scenarios, the researchers were able to calculate what's known as a utility function, a measure of the satisfaction that people derive from a good or service. This allowed them to measure the preferences expressed by different AI models. The researchers determined that these preferences were often consistent rather than haphazard, and showed that they become more ingrained as models get larger and more powerful.

Some research studies have found that AI tools such as ChatGPT are biased toward views expressed by pro-environmental, left-leaning, and libertarian ideologies. In February 2024, Google faced criticism from Musk and others after its Gemini tool was found to be predisposed to generate images that critics branded as 'woke,' such as Black Vikings and Nazis.

The technique developed by Hendrycks and his collaborators offers a new way to determine how AI models' perspectives may differ from those of their users. Eventually, some experts hypothesize, this kind of divergence could become dangerous for very clever and capable models. The researchers show in their study, for instance, that certain models consistently value the existence of AI above that of certain nonhuman animals. They also found that models seem to value some people over others, which raises its own ethical questions.

Some researchers, Hendrycks included, believe that current methods for aligning models, such as manipulating and blocking their outputs, may not be sufficient if unwanted goals lurk under the surface within the model itself. 'We're gonna have to confront this,' Hendrycks says. 'You can't pretend it's not there.'

Dylan Hadfield-Menell, a professor at MIT who researches methods for aligning AI with human values, says Hendrycks' paper suggests a promising direction for AI research. 'They find some interesting results,' he says. 'The main one that stands out is that as the model scale increases, utility representations get more complete and coherent.' Hadfield-Menell cautions, however, against drawing too many conclusions about current models. 'This work is preliminary,' he adds. 'I'd want to see broader scrutiny on the results before drawing strong conclusions.'

Hendrycks and his colleagues measured the political outlook of several prominent AI models, including xAI's Grok, OpenAI's GPT-4o, and Meta's Llama 3.3. Using their technique, they were able to compare the values of different models to the policies of specific politicians, including Donald Trump, Kamala Harris, Bernie Sanders, and Republican Representative Marjorie Taylor Greene. All of the models' values were much closer to those of former president Joe Biden than to any of the other politicians.

The researchers propose a new way to alter a model's behavior by changing its underlying utility functions instead of imposing guardrails that block certain outputs. Using this approach, Hendrycks and his coauthors develop what they call a Citizen Assembly. This involves collecting US census data on political issues and using the answers to shift the values of an open-source LLM. The result is a model with values that are consistently closer to those of Trump than those of Biden.

Some AI researchers have previously sought to make AI models with less liberal bias. In February 2023, David Rozado, an independent AI researcher, developed RightWingGPT, a model trained with data from right-leaning books and other sources. Rozado describes Hendrycks' study as 'very interesting and in-depth work.' He adds: 'The Citizens Assembly approach to molding AI behavior is also thought-provoking.'
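The general idea of recovering a utility function from a model's stated preferences can be illustrated with a rough sketch. This is a minimal illustration, not the authors' actual method: it assumes a hypothetical query_model_preference helper that asks a chatbot which of two outcomes it prefers (here it just answers randomly so the script runs on its own), and it fits one utility score per outcome with a simple Bradley-Terry style logistic model.

```python
# Minimal sketch: estimating per-outcome utilities from pairwise preferences.
# query_model_preference() is a hypothetical stand-in for a real chatbot API call;
# it should return 1 if the model prefers outcome A and 0 if it prefers outcome B.
import itertools
import math
import random

outcomes = [
    "a forest is preserved",
    "a data center stays online",
    "a flock of geese is relocated",
]

def query_model_preference(a: str, b: str) -> int:
    # Placeholder: a real implementation would prompt a model with the two outcomes.
    return random.randint(0, 1)

# Collect repeated pairwise comparisons over all outcome pairs.
comparisons = []
for a, b in itertools.combinations(range(len(outcomes)), 2):
    for _ in range(20):  # repeat to average over sampling noise
        comparisons.append((a, b, query_model_preference(outcomes[a], outcomes[b])))

# Fit utilities by gradient ascent on the Bradley-Terry log-likelihood:
# P(a preferred to b) = sigmoid(u[a] - u[b]).
utilities = [0.0] * len(outcomes)
lr = 0.05
for _ in range(500):
    grad = [0.0] * len(outcomes)
    for a, b, a_won in comparisons:
        p = 1.0 / (1.0 + math.exp(-(utilities[a] - utilities[b])))
        grad[a] += a_won - p
        grad[b] -= a_won - p
    utilities = [u + lr * g / len(comparisons) for u, g in zip(utilities, grad)]

# Rank outcomes by the estimated utility the model assigns them.
for text, u in sorted(zip(outcomes, utilities), key=lambda t: -t[1]):
    print(f"{u:+.2f}  {text}")
```

In a real study the outcome list would be far larger and the comparisons would come from the models under evaluation; the point of the sketch is only that consistent pairwise choices can be summarized as a single utility score per outcome.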
