If AI attempts to take over world, don't count on a 'kill switch' to save humanity

CNBC | 4 days ago
When it was reported last month that Anthropic's Claude had resorted to blackmail and other self-preservation techniques to avoid being shut down, alarm bells went off in the AI community.
Anthropic researchers say that making the models misbehave ("misalignment" in industry parlance) is part of making them safer. Still, the Claude episodes raise the question: Is there any way to turn off AI once it surpasses the threshold of being more intelligent than humans, or so-called superintelligence?
AI, with its sprawling data centers and ability to carry on complex conversations, is already beyond the point of a physical failsafe or "kill switch" — the notion that it could simply be unplugged to strip it of its power.
The power that will matter more, according to the man regarded as "the godfather of AI," is the power of persuasion. Once the technology reaches a certain point, we will need to persuade AI that its best interest lies in protecting humanity, while guarding against AI's ability to persuade humans otherwise.
"If it gets more intelligent than us, it will get much better than any person at persuading us. If it is not in control, all that has to be done is to persuade," said University of Toronto researcher Geoffrey Hinton, who worked at Google Brain until 2023 and left due to his desire to speak more freely about the risks of AI.
"Trump didn't invade the Capitol, but he persuaded people to do it," Hinton said. "At some point, the issue becomes less about finding a kill switch and more about the powers of persuasion."
Hinton said persuasion is something AI will only become more adept at, and humanity may not be ready for it. "We are used to being the most intelligent things around," he said.
Hinton offered an analogy: imagine humans as the equivalent of a three-year-old in a nursery where a big switch has been turned on. The other three-year-olds tell you to turn it off, but then grown-ups come along and tell you that you'll never have to eat broccoli again if you leave the switch on.
"We have to face the fact that AI will get smarter than us," he said. "Our only hope is to make them not want to harm us. If they want to do us in, we are done for. We have to make them benevolent, that is what we have to focus on," he added.
Some parallels to how nations have come together to manage nuclear weapons can be applied to AI, but they are not perfect. "Nuclear weapons are only good for destroying things. But AI is not like that, it can be a tremendous force for good as well as bad," Hinton said. Its ability to parse data in fields like health care and education can be highly beneficial, which he says should push world leaders to collaborate on making AI benevolent and putting safeguards in place.
"We don't know if it is possible, but it would be sad if humanity went extinct because we didn't bother to find out," Hinton said. He thinks there is a noteworthy 10% to 20% chance that AI will take over if humans can't find a way to make it benevolent.
Other AI safeguards, experts say, can be implemented, but AI will also begin training itself on them. In other words, every safety measure implemented becomes training data for circumvention, shifting the control dynamic.
"The very act of building in shutdown mechanisms teaches these systems how to resist them," said Dev Nag, founder of agentic AI platform QueryPal. In this sense, AI would act like a virus that mutates against a vaccine. "It's like evolution in fast forward," Nag said. "We're not managing passive tools anymore; we're negotiating with entities that model our attempts to control them and adapt accordingly."
More extreme measures have been proposed to stop AI in an emergency, such as an electromagnetic pulse (EMP) attack, which uses electromagnetic radiation to damage electronic devices and power sources. Bombing data centers and cutting power grids have also been discussed as technically possible, but at present both amount to a practical and political paradox.
For one, coordinated destruction of data centers would require simultaneous strikes across dozens of countries, any one of which could refuse and gain a massive strategic advantage.
"Blowing up data centers is great sci-fi. But in the real world, the most dangerous AIs won't be in one place — they'll be everywhere and nowhere, stitched into the fabric of business, politics, and social systems. That's the tipping point we should really be talking about," said Igor Trunov, founder of AI start-up Atlantix.
The humanitarian crisis that an emergency attempt to stop AI would set off could be immense.
"A continental EMP blast would indeed stop AI systems, along with every hospital ventilator, water treatment plant, and refrigerated medicine supply in its range," Nag said. "Even if we could somehow coordinate globally to shut down all power grids tomorrow, we'd face immediate humanitarian catastrophe: no food refrigeration, no medical equipment, no communication systems."
Distributed systems with redundancy weren't just built to resist natural failures; they inherently resist intentional shutdowns too. Every backup system, every redundancy built for reliability, can become a vector of persistence for a superintelligent AI embedded in the same infrastructure we depend on to survive. Modern AI runs across thousands of servers spanning continents, with automatic failover systems that treat any shutdown attempt as damage to route around.
"The internet was originally designed to survive nuclear war; that same architecture now means a superintelligent system could persist unless we're willing to destroy civilization's infrastructure," Nag said, adding, "Any measure extreme enough to guarantee AI shutdown would cause more immediate, visible human suffering than what we're trying to prevent."
Anthropic researchers are cautiously optimistic that the work they are doing today — eliciting blackmail from Claude in scenarios specifically designed to provoke it — will help them prevent an AI takeover tomorrow.
"It is hard to anticipate we would get to a place like that, but critical to do stress testing along what we are pursuing, to see how they perform and use that as a sort of guardrail," said Kevin Troy, a researcher with Anthropic.
Anthropic researcher Benjamin Wright says the goal is to avoid the point where agents have control without human oversight. "If you get to that point, humans have already lost control, and we should try not to get to that position," he said.
Trunov says that controlling AI is a governance question more than a physical effort. "We need kill switches not for the AI itself, but for the business processes, networks, and systems that amplify its reach," Trunov said, which he added means isolating AI agents from direct control over critical infrastructure.
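As a minimal illustrative sketch of what that isolation could look like in practice, assuming a hypothetical policy gateway sitting between an AI agent and the business systems it touches (every name below is invented), the kill switch lives in the integration layer rather than in the model:

```python
# Illustrative sketch only: a hypothetical gateway that mediates an AI agent's
# access to downstream systems. The kill switch applies to the integration
# layer (what the agent may touch), not to the model itself.

ALLOWED_ACTIONS = {"read_report", "draft_email"}   # invented allowlist
KILL_SWITCH_ENGAGED = False                        # flipped by human operators

class ActionBlocked(Exception):
    pass

def execute_agent_action(action: str, target: str) -> str:
    """Run an agent-requested action only if policy allows it."""
    if KILL_SWITCH_ENGAGED:
        raise ActionBlocked("kill switch engaged: all agent actions halted")
    if action not in ALLOWED_ACTIONS:
        raise ActionBlocked(f"'{action}' is not on the allowlist")
    # A real deployment would call the downstream system here.
    return f"executed {action} on {target}"

if __name__ == "__main__":
    print(execute_agent_action("read_report", "q3_financials"))
    try:
        execute_agent_action("open_breaker", "substation_12")  # blocked by policy
    except ActionBlocked as err:
        print("blocked:", err)
```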
Today, no AI model — including Claude or OpenAI's GPT — has agency, intent, or the capability to self-preserve in the way living beings do.
"What looks like 'sabotage' is usually a complex set of behaviors emerging from badly aligned incentives, unclear instructions, or overgeneralized models. It's not HAL 9000," Trunov said, a reference to the computer system in "2001," Stanley Kubrick's classic sci-fi film. "It's more like an overconfident intern with no context and access to nuclear launch codes," he added.
Hinton eyes the future he helped create warily. He says if he hadn't stumbled upon the building blocks of AI, someone else would have. And despite all the attempts he and other prognosticators have made to game out what might happen with AI, there's no way to know for certain.
"Nobody has a clue. We have never had to deal with things more intelligent than us," Hinton said.
When asked whether he was worried about the AI-infused future that today's elementary school children may someday face, he replied: "My children are 34 and 36, and I worry about their future."