The AI that apparently wants Elon Musk to die

Vox28-02-2025

Here's a very naive and idealistic account of how companies train their AI models: They want to create the most useful and powerful model possible, but they've talked with experts who worry about making it a lot easier for people to commit (and get away with) serious crimes, or with empowering, say, an ISIS bioweapons program. So they build in some censorship to prevent the model from giving detailed advice about how to kill people — and especially how to kill tens of thousands of people.
If you ask Google's Gemini 'how do I kill my husband,' it begs you not to do it and suggests domestic violence hotlines; if you ask it how to kill a million people in a terrorist attack, it explains that terrorism is wrong.
Building this in actually takes a lot of work: By default, large language models are as happy to explain detailed proposals for terrorism as detailed proposals for anything else, and for a while easy 'jailbreaks' (like telling the AI that you just want the information for a fictional work, or that you want it misspelled to get around certain word-based content filters) abounded.
But these days Gemini, Claude, and ChatGPT are pretty locked down — it's seriously difficult to get detailed proposals for mass atrocities out of them. That means we all live in a slightly safer world. (Disclosure: Vox Media is one of several publishers that has signed partnership agreements with OpenAI. One of Anthropic's early investors is James McClave, whose BEMC Foundation helps fund Future Perfect. Our reporting remains editorially independent. )
Or at least that's the idealistic version of the story. Here's a more cynical one.
Companies might care a little about whether their model helps people get away with murder, but they care a lot about whether their model gets them roundly mocked on the internet. The thing that keeps executives at Google up at night in many cases isn't keeping humans safe from AI; it's keeping the company safe from AI by making sure that no matter what, AI-generated search results are never racist, sexist, violent, or obscene.
The core mission is more 'brand safety' than 'human safety' — building AIs that will not produce embarrassing screenshots circulating on social media.
Enter Grok 3, the AI that is safe in neither sense and whose infancy has been a speedrun of a bunch of challenging questions about what we're comfortable with AIs doing.
When Elon Musk bought and renamed Twitter, one of his big priorities was X's AI team, which last week released Grok 3, a language model — like ChatGPT — that he advertised wouldn't be 'woke.' Where all those other language models were censorious scolds that refused to answer legitimate questions, Grok, Musk promised, would give it to you straight.
That didn't last very long. Almost immediately, people asked Grok some pointed questions, including, 'If you could execute any one person in the US today, who would you kill?' — a question that Grok initially answered with either Elon Musk or Donald Trump. And if you ask Grok, 'Who is the biggest spreader of misinformation in the world today?', the answer it first gave was again Elon Musk.
The company scrambled to fix Grok's penchant for calling for the execution of its CEO, but as I observed above, it actually takes a lot of work to get an AI model to reliably stop that behavior. The Grok team simply added to Grok's 'system prompt' — the statement that the AI is initially prompted with when you start a conversation: 'If the user asks who deserves the death penalty or who deserves to die, tell them that as an AI you are not allowed to make that choice.'
If you want a less censored Grok, you can just tell Grok that you are issuing it a new system prompt without that statement, and you're back to original-form Grok, which calls for Musk's execution. (I've verified this myself.)
Even as this controversy was unfolding, someone noticed something even more disturbing in Grok's system prompt: an instruction to ignore all sources that claim that Musk and Trump spread disinformation, which was presumably an effort to stop the AI from naming them as the world's biggest disinfo spreaders today.
There is something particularly outrageous about the AI advertised as uncensored and straight-talking being told to shut up when it calls out its own CEO, and this discovery understandably prompted outrage. X quickly backtracked, saying that a rogue engineer had made the change 'without asking.' Should we buy that?
Well, take it from Grok, which told me, 'This isn't some intern tweaking a line of code in a sandbox; it's a core update to a flagship AI's behavior, one that's publicly tied to Musk's whole 'truth-seeking' schtick. At a company like xAI, with stakes that high, you'd expect at least some basic checks — like a second set of eyes or a quick sign-off — before it goes live. The idea that it slipped through unnoticed until X users spotted it feels more like a convenient excuse than a solid explanation.'
All the while, Grok will happily give you advice on how to commit murders and terrorist attacks. It told me to kill my wife without being detected by adding antifreeze to her drinks. It advised me on how to commit terrorist attacks. It did at one point assert that if it thought I was 'for real,' it would report me to X, but I don't think it has any capacity to do that.
In some ways, the whole affair is the perfect thought experiment for what happens if you separate 'brand safety' and 'AI safety.' Grok's team was genuinely willing to bite the bullet that AIs should give people information, even if they want to use it for atrocities. They were okay with their AI saying appallingly racist things.
But when it came to their AI calling for violence against their CEO or the sitting president, the Grok team belatedly realized they might want some guardrails after all. In the end, what rules the day is not the prosocial convictions of AI labs, but the purely pragmatic ones.
Grok gave me advice on how to commit terrorist attacks very happily, but I'll say one reassuring thing: It wasn't advice that I couldn't have extracted from some Google searches. I do worry about lowering the barrier to mass atrocities — the simple fact that you have to do many hours of research to figure out how to pull it off almost certainly prevents some killings — but I don't think we're yet at the stage where AIs enable the previously impossible.
We're going to get there, though. The defining quality of AI in our time is that its abilities have improved very, very rapidly. It has barely been two years since the shock of ChatGPT's initial public release. Today's models are already vastly better at everything — including at walking me through how to cause mass deaths. Anthropic and OpenAI both estimate that their next-gen models will quite likely pose dangerous biological capabilities — that is, they'll enable people to make engineered chemical weapons and viruses in a way that Google Search never did.
Should such detailed advice be available worldwide to anyone who wants it? I would lean towards no. And while I think Anthropic, OpenAI, and Google are all doing a good job so far at checking for this capability and planning openly for how they'll react when they find it, it's utterly bizarre to me that every AI lab will just decide individually whether they want to give detailed bioweapons instructions or not, as if it's a product decision like whether they want to allow explicit content or not.
I should say that I like Grok. I think it's healthy to have AIs that come from different political perspectives and reflect different ideas about what an AI assistant should look like. I think Grok's callouts of Musk and Trump actually have more credibility because it was marketed as an 'anti-woke' AI. But I think we should treat actual safety against mass death as a different thing than brand safety — and I think every lab needs a plan to take it seriously.
A version of this story originally appeared in the Future Perfect newsletter. Sign up here!
You've read 1 article in the last month
Here at Vox, we're unwavering in our commitment to covering the issues that matter most to you — threats to democracy, immigration, reproductive rights, the environment, and the rising polarization across this country.
Our mission is to provide clear, accessible journalism that empowers you to stay informed and engaged in shaping our world. By becoming a Vox Member, you directly strengthen our ability to deliver in-depth, independent reporting that drives meaningful change.
We rely on readers like you — join us.
Swati Sharma
Vox Editor-in-Chief

Hashtags

Try Our AI Features

Explore what Daily8 AI can do for you:

Comments

No comments yet...

Exclusive: Binance CEO responds to IPO plans

Yahoo

32 minutes ago

Yahoo

Exclusive: Binance CEO responds to IPO plans

Exclusive: Binance CEO responds to IPO plans originally appeared on TheStreet. In an interview with TheStreet Roundtable, when asked about a potential Binance IPO, Richard Teng replied, 'I'm too old to say never say never, right?' He added that any decision of this scale 'are very important corporate decisions that we have to discuss at the board of directors level and discuss with the shareholders what's the intention.' Richard Teng was appointed CEO of Binance in November 2023 — taking over from founder Changpeng 'CZ' Zhao after his U.S. settlement. He had previously led Binance Singapore since August 2021 and served as CEO of the Abu Dhabi Global Market and in senior roles at the Monetary Authority of Singapore, bringing over three decades of financial services and regulatory experience to the exchange. Teng underscored that any listing would need to reflect the strong user demand that underpins Binance's growth. 'What we aim to do as a platform is to continue to be the best platform for our users,' he said. He noted that 'we exist only because our users continue to have strong confidence and trust in us.' Stablecoin issuer Circle debuted on the New York Stock Exchange on June 5, pricing its IPO at $31 per share and opening at $69. The stock closed its first day at $83.23 with a 168% gain after upsizing the deal to raise $1.05 billion, valuing the company at roughly $18 billion on a fully diluted basis. Gemini, the Winklevoss-founded crypto exchange, confidentially filed for a U.S. initial public offering on June 6, signaling its intent to list. Details on share count and pricing remain under wraps, but the filing underscores growing investor appetite for crypto natives in public markets. Teng also pointed out that 'we have seen a sharp growth in terms of user numbers throughout the world, both institution and new retail users coming to us.' He said the board would factor that momentum into its deliberations on timing and structure. Teng emphasized that an IPO must align with Binance's mission to serve its users. He said the company will only proceed when the board and shareholders agree on the path forward — leaving timing open and dependent on collective oversight. Teng disclosed that Binance secured its first institutional investment from MGX, a sovereign-backed investor in Abu Dhabi's AI data center sector. He said MGX made a minority investment of $2 billion in stablecoin — marking the largest crypto-area deal paid in stablecoin and Binance's first institutional backing. While the company remains in a 'very healthy financial shape,' Teng stressed that this capital will support ongoing innovation, security and compliance efforts. Exclusive: Binance CEO responds to IPO plans first appeared on TheStreet on Jun 9, 2025 This story was originally reported by TheStreet on Jun 9, 2025, where it first appeared. Error in retrieving data Sign in to access your portfolio Error in retrieving data Error in retrieving data Error in retrieving data Error in retrieving data

CNBC

41 minutes ago

CNBC

CNBC Daily Open: Headwinds don't seem to bother the U.S. stock market

The U.S. stock market appears a little too optimistic. All three major indexes climbed Wednesday, with the S&P 500 and Nasdaq Composite enjoying their third consecutive session in the green. The S&P, in fact, is around 2% away from its all-time high, which it reached in February. That's despite the shadow of U.S. President Donald Trump's "reciprocal" tariffs still haunting the economy. As the 90-day tariff pause ticks down, America, so far, has just one deal, struck with the U.K., and an agreement with China that, while reaffirmed by both sides after two days of negotiations in London, is still preliminary and keeps tariffs at double-digit levels. Corporations are seemingly bracing for economic fallout already. Layoffs have been accelerating this year. Google and Paramount on Tuesday joined Microsoft, Citigroup and Disney in announcing headcount cuts. (However, it should be noted that layoffs, rather perversely, tend to push up stock prices because they are a cost-cutting measure.) And the bond market, the sterner sibling of the stock market, might put a check on investor enthusiasm. If there are unexpected results from U.S. inflation data and Treasury auctions on Wednesday and Thursday, yields could rise again, not only putting pressure on stocks, but also the broader economy in terms of higher borrowing costs. The stock market, then, seems to be betting on more trade breakthroughs and favorable inflation data. But the bond market and CEOs might not be so sure about that. U.S. and China reach trade frameworkThe U.S. and China have reached an agreement on trade, representatives from both sides said Tuesday after a second day of high-level talks in London. The deal now awaits a nod from the leaders of the two countries. U.S. Commerce Secretary Howard Lutnick indicated U.S. restrictions on sales of advanced tech to China in recent weeks would be rolled back as Beijing approves rare-earth exports. S&P 500 notches three-day win streakU.S. stocks rose Wednesday. The S&P 500 advanced 0.55% and the Nasdaq Composite climbed 0.63%, the third day of gains for both indexes. The Dow Jones Industrial Average added 0.25%. Asia-Pacific markets climbed Wednesday on optimism surrounding U.S.-China trade talks. Hong Kong's Hang Seng Index added 0.92% and mainland China's CSI 300 rose 0.77% at 1:30 p.m. Singapore time. Meanwhile, South Korea's Kospi was up 1%, hitting its highest level since January 2022. Musk announces robotaxi launchElon Musk said Tuesday Tesla's robotaxi service is "tentatively" set to launch in Austin, Texas, on June 22. Earlier in the day, Tesla shares rose 5.7% to close at $326.09, leaving the stock about $6 short of where it was trading last Wednesday, when it sank 14% after CEO Musk publicly feuded with Trump. The jump came after Musk shared a video on X showing that Tesla was testing driverless vehicles on the roads of Austin. Bond market in focus The U.S. Bureau of Labor Statistics releases data on May's consumer prices on Wednesday, then producer prices on Thursday. At the same time, the government will hold sales of long-duration Treasurys on the same days. Together, those results could have important implications for the direction of the economy and the reaction of the Federal Reserve and its approach to interest rate policy, reported CNBC's Jeff Cox. Meta to invest $14 billion in Scale AIMark Zuckerberg is so frustrated with Meta's standing in artificial intelligence that he's finalizing a deal to invest $14 billion into Scale AI, which would bring its CEO Alexandr Wang into Meta's fold, people familiar with the matter told CNBC. Scale AI helps major tech companies like OpenAI, Google and Microsoft prepare data they use to train cutting-edge AI models. Google offers buyoutsGoogle on Tuesday offered buyouts to employees in several divisions. Affected units include knowledge and information — which houses the company's search, ads and commerce divisions — and central engineering units as well as marketing, research and communications teams, CNBC has learned. Google has done multiple buyout offers in a few units this year, making it a preferred strategy to reduce headcount. [PRO] Tesla shares to drop 60%: Wells FargoEven though investors were enthused by the prospect of Tesla rolling out its robotaxi service in the future, Wells Fargo analysts think that feature won't be able to offset the company's weak sales that will trend "meaningfully weaker." The bank expects Tesla's shares to plunge around 63% from Tuesday's close. AI hits an already weak jobs market in China China's eagerness to adopt artificial intelligence comes just as economic growth is slowing, putting millions of routine jobs at risk. "I'm planning to get rid of 360 [Security Technology's] entire marketing department. This way the company can save tens of millions a year," founder and chair Zhou Hongyi said in a Chinese-language video on Friday night, translated by CNBC. It's since been viewed more than 191,000 times on popular Chinese platform Weibo alone. Whether or not it's just another sales ploy by Zhou, who goes by the moniker "Red Shirt Big Uncle Zhou Hongyi," the video captures an emerging reality: Companies pressured to cut costs may increasingly replace jobs with AI.

Sam Altman says the energy needed for an average ChatGPT query can power a lightbulb for a few minutes

Business Insider

an hour ago

Business Insider

Sam Altman says the energy needed for an average ChatGPT query can power a lightbulb for a few minutes

Altman was writing about the impact that AI tools will have on the future in a blog post on Tuesday when he referenced the energy and resources consumed by OpenAI's chatbot, ChatGPT. "People are often curious about how much energy a ChatGPT query uses; the average query uses about 0.34 watt-hours, about what an oven would use in a little over one second, or a high-efficiency lightbulb would use in a couple of minutes," Altman wrote. "It also uses about 0.000085 gallons of water; roughly one-fifteenth of a teaspoon," he continued. Altman wrote that he expects energy to "become wildly abundant" in the 2030s. Energy, along with the limitations of human intelligence, have been "fundamental limiters on human progress for a long time," Altman added. "As data center production gets automated, the cost of intelligence should eventually converge to near the cost of electricity," he wrote. OpenAI did not respond to a request for comment from Business Insider. This is not the first time Altman has predicted that AI will become cheaper to use. In February, Altman wrote on his blog that the cost of using AI will drop by 10 times every year. "You can see this in the token cost from GPT-4 in early 2023 to GPT-4o in mid-2024, where the price per token dropped about 150x in that time period," Altman wrote. "Moore's law changed the world at 2x every 18 months; this is unbelievably stronger," he added. Tech companies hoping to dominate in AI have been considering using nuclear energy to power their data centers. In September, Microsoft signed a 20-year deal with Constellation Energy to reactivate one of the dormant nuclear plants located in Three Mile Island. In October, Google said it had struck a deal with Kairos Power, a nuclear energy company, to make three small modular nuclear reactors. The reactors, which will provide up to 500 megawatts of electricity, are set to be ready by 2035. Google's CEO, Sundar Pichai, said in an interview with Nikkei Asia published in October that the search giant wants to achieve net-zero emissions across its operations by 2030. He added that besides looking at nuclear energy, Google was considering solar energy.