LEDs light up in a server rack in a data center.
When it was reported last month that Anthropic’s Claude had resorted to blackmail and other self-preservation techniques to avoid being shut down, alarm bells went off in the AI community.
Anthropic researchers say that deliberately eliciting misbehavior from the models (“misalignment” in industry parlance) is part of making them safer. Still, the Claude episodes raise the question: Is there any way to turn off AI once it surpasses the threshold of being more intelligent than humans, or so-called superintelligence?
AI, with its sprawling data centers and ability to craft complex conversations, is already beyond the point of a physical failsafe or “kill switch” — the idea that it can simply be unplugged as a way to stop it from having any power.
The power that will matter more, according to a man regarded as “the godfather of AI,” is the power of persuasion. When the technology reaches a certain point, we will need to persuade AI that protecting humanity is in its best interest, while guarding against AI’s ability to persuade humans otherwise.
“If it gets more intelligent than us, it will get much better than any person at persuading us. If it is not in control, all that has to be done is to persuade,” said University of Toronto researcher Geoffrey Hinton, who worked at Google Brain until 2023 and left due to his desire to speak more freely about the risks of AI.
“Trump didn’t invade the Capitol, but he persuaded people to do it,” Hinton said. “At some point, the issue becomes less about finding a kill switch and more about the powers of persuasion.”
Hinton said persuasion is something AI will become increasingly adept at, and humanity may not be ready for it. “We are used to being the most intelligent things around,” he said.
Hinton described a scenario in which humans are the equivalent of a three-year-old in a nursery with a big switch turned on. The other three-year-olds tell you to turn it off, but then grown-ups come along and promise that you’ll never have to eat broccoli again if you leave the switch on.
“We have to face the fact that AI will get smarter than us,” he said. “Our only hope is to make them not want to harm us. If they want to do us in, we are done for. We have to make them benevolent, that is what we have to focus on,” he added.
There are parallels between AI and the way nations have come together to manage nuclear weapons, but they are not perfect. “Nuclear weapons are only good for destroying things. But AI is not like that, it can be a tremendous force for good as well as bad,” Hinton said. Its ability to parse data in fields like health care and education can be highly beneficial, which he says should increase the emphasis among world leaders on collaborating to make AI benevolent and put safeguards in place.
“We don’t know if it is possible, but it would be sad if humanity went extinct because we didn’t bother to find out,” Hinton said. He thinks there is a noteworthy 10% to 20% chance that AI will take over if humans can’t find a way to make it benevolent.
Geoffrey Hinton, Godfather of AI, University of Toronto, on Centre Stage during day two of Collision 2023 at Enercare Centre in Toronto, Canada.
Other AI safeguards, experts say, can be implemented, but AI will also begin training itself on them. In other words, every safety measure implemented becomes training data for circumvention, shifting the control dynamic.
“The very act of building in shutdown mechanisms teaches these systems how to resist them,” said Dev Nag, founder of agentic AI platform QueryPal. In this sense, AI would act like a virus that mutates against a vaccine. “It’s like evolution in fast forward,” Nag said. “We’re not managing passive tools anymore; we’re negotiating with entities that model our attempts to control them and adapt accordingly.”
More extreme measures have been proposed to stop AI in an emergency. One example is an electromagnetic pulse (EMP) attack, which uses a burst of electromagnetic radiation to damage electronic devices and power sources. Bombing data centers and cutting power grids have also been discussed as technically possible, but for now they present a practical and political paradox.
For one, coordinated destruction of data centers would require simultaneous strikes across dozens of countries, any one of which could refuse and gain a massive strategic advantage.
“Blowing up data centers is great sci-fi. But in the real world, the most dangerous AIs won’t be in one place — they’ll be everywhere and nowhere, stitched into the fabric of business, politics, and social systems. That’s the tipping point we should really be talking about,” said Igor Trunov, founder of AI start-up Atlantix.
How any attempt to stop AI could ruin humanity
The humanitarian crisis that would underlie an emergency attempt to stop AI could be immense.
“A continental EMP blast would indeed stop AI systems, along with every hospital ventilator, water treatment plant, and refrigerated medicine supply in its range,” Nag said. “Even if we could somehow coordinate globally to shut down all power grids tomorrow, we’d face immediate humanitarian catastrophe: no food refrigeration, no medical equipment, no communication systems.”
Distributed systems with redundancy weren’t just built to resist natural failures; they inherently resist intentional shutdowns too. Every backup system and every redundancy built for reliability can become a vector of persistence for a superintelligent AI that depends on the same infrastructure we rely on to survive. Modern AI runs across thousands of servers spanning continents, with automatic failover systems that treat any shutdown attempt as damage to route around.
“The internet was originally designed to survive nuclear war; that same architecture now means a superintelligent system could persist unless we’re willing to destroy civilization’s infrastructure,” Nag said, adding, “Any measure extreme enough to guarantee AI shutdown would cause more immediate, visible human suffering than what we’re trying to prevent.”

Anthropic researchers are cautiously optimistic that the work they are doing today — eliciting blackmail in Claude in scenarios specifically designed to do so — will help them prevent an AI takeover tomorrow.
“It is hard to anticipate we would get to a place like that, but critical to do stress testing along what we are pursuing, to see how they perform and use that as a sort of guardrail,” said Kevin Troy, a researcher with Anthropic.
Anthropic researcher Benjamin Wright says the goal is to avoid the point where agents have control without human oversight. “If you get to that point, humans have already lost control, and we should try not to get to that position,” he said.
Trunov says that controlling AI is a governance question more than a physical effort. “We need kill switches not for the AI itself, but for the business processes, networks, and systems that amplify its reach,” Trunov said, which he added means isolating AI agents from direct control over critical infrastructure.
Today, no AI model — including Claude or OpenAI’s GPT — has agency, intent, or the capability to self-preserve in the way living beings do.
“What looks like ‘sabotage’ is usually a complex set of behaviors emerging from badly aligned incentives, unclear instructions, or overgeneralized models. It’s not HAL 9000,” Trunov said, referring to the computer system in Stanley Kubrick’s classic sci-fi film “2001: A Space Odyssey.” “It’s more like an overconfident intern with no context and access to nuclear launch codes,” he added.
Hinton eyes the future he helped create warily. He says if he hadn’t stumbled upon the building blocks of AI, someone else would have. And despite all the attempts he and other prognosticators have made to game out what might happen with AI, there’s no way to know for certain.
“Nobody has a clue. We have never had to deal with things more intelligent than us,” Hinton said.
When asked whether he was worried about the AI-infused future that today’s elementary school children may someday face, he replied: “My children are 34 and 36, and I worry about their future.”