OpenAI's ex-policy lead accuses the company of ‘rewriting’ its AI safety history
OpenAI's former policy lead, Miles Brundage, has accused the company of "rewriting history" around how it approached the launch of its GPT-2 model in 2019. He argues that a new OpenAI blog post on AI safety subtly tips the company toward releasing AI models unless there is incontrovertible evidence that they present an immediate danger.
A former policy lead at OpenAI is accusing the company of rewriting its history with a new post about the company's approach to safety and alignment.
Miles Brundage, the former head of policy research at OpenAI, criticized a recent post published by the company titled "How we think about safety and alignment."
In it, the company described the road to artificial general intelligence (AGI)—an AI system that can perform cognitive tasks as well as or better than a person—as a continuous evolution, rather than a sudden leap. It also emphasized the value of "iterative deployment," which involves releasing AI systems, learning from how users interact with them, and then refining safety measures based on this evidence.
While Brundage praised the "bulk" of the post, he criticized the company for rewriting the "history of GPT-2 in a concerning way."
GPT-2, released in February 2019, was the second iteration of OpenAI's flagship large language model. At the time, it represented a much larger and more capable model than its predecessor, GPT-1, and was trained on a much broader dataset. But compared to subsequent GPT models, particularly GPT-3.5, the model that powered ChatGPT, GPT-2 was not particularly capable. It could write poetry and several coherent paragraphs of prose, but ask it to generate more than that, and its outputs often descended into strange non sequiturs or gibberish. It was not particularly good at answering factual questions, summarizing documents, writing code, or most of the other tasks that people now use LLMs for.
Nonetheless, OpenAI initially withheld GPT-2's full release and source code, citing concerns about the potential for dangerous misuse of the model. Instead, it gave a select number of news outlets limited access to a demo version of the model.
At the time, critics, including many AI researchers in academia, argued that OpenAI's claims that the model presented a significantly increased risk of misuse were overblown or disingenuous. Some questioned whether OpenAI's caution was a publicity stunt—an underhanded way of hyping the unreleased model's capabilities and of ensuring that OpenAI's announcement would generate lots of headlines.
One AI-focused publication even penned an open letter urging OpenAI to release GPT-2, arguing its importance outweighed the risks. Eventually, OpenAI rolled out a partial version, followed by a full release months later.
In its recent safety post, OpenAI said the company didn't initially release GPT-2 due to "concerns about malicious applications." But it then essentially argued that some of OpenAI's former critics had been right and that the company's concerns about misuse had proved overblown and unnecessary. And it tried to argue that some of that excess of concern came from the fact that many of the company's AI safety researchers and policy staff assumed AGI would arrive suddenly, with a single model leaping over the threshold to human-like intelligence, instead of emerging gradually.
"In a discontinuous world, practicing for the AGI moment is the only thing we can do, and safety lessons come from treating the systems of today with outsized caution relative to their apparent power. This is the approach we took for GPT‑2," OpenAI wrote.
However, Brundage, who was at the company when the model was released and was intimately involved in discussions about how the company would handle its release, argued that GPT-2's launch "was 100% consistent + foreshadowed OpenAI's current philosophy of iterative deployment."
"The model was released incrementally, with lessons shared at each step. Many security experts at the time thanked us for this caution," Brundage wrote on X.
He dismissed the idea that OpenAI's caution with GPT-2 was unnecessary or based on outdated assumptions about AGI. "What part of that was motivated by or premised on thinking of AGI as discontinuous? None of it," he wrote.
Brundage argued that the post's revisionist history serves to subtly bias the company in the direction of dismissing the concerns of AI safety researchers and releasing AI models, unless there is incontrovertible evidence that they present an immediate danger.
"It feels as if there is a burden of proof being set up in this section where concerns are alarmist + you need overwhelming evidence of imminent dangers to act on them - otherwise, just keep shipping," he said. "That is a very dangerous mentality for advanced AI systems."
"If I were still working at OpenAI, I would be asking why this blog post was written the way it was, and what exactly OpenAI hopes to achieve by poo-pooing caution in such a lop-sided way," he wrote.
OpenAI's blog post introduced two new ideas: the importance of iterative deployment and a slightly different approach to testing its AI models.
Robert Trager, the co-director of the Oxford Martin AI Governance Initiative, told Fortune that the company appeared to be distancing itself from relying heavily on theory when testing its models.
"It was like they were saying, we're not going to rely on math proving that the system is safe. We're going to rely on testing the system in a secure environment," he said.
"It makes sense to rely on all the tools that we have," he added. "So it's strange to say we're not going to rely so much on that tool."
Trager also said that iterative deployment works best when models are being deployed very often with minor changes between each release. However, he noted that this kind of approach may not be practical for OpenAI as some systems could be significantly different from what was deployed in the past.
"Their argument that there really won't be much of an impact, or a differential impact, from one system to the next, it doesn't seem quite to be convincing," he said.
Hamza Chaudhry, the AI and National Security lead at the Future of Life Institute, a non-profit that has raised concerns about AI's potential risk to humanity, also said that "relying on gradual rollouts may mean that potentially harmful capabilities and behaviors are exposed to the real world before being fully mitigated."
OpenAI also did not mention "staged deployment" in its blog post, which generally means releasing a model in various stages and evaluating it along the way. For example, a small group of internal testers might be given access to an AI model, with the results assessed before it is released to a larger set of users.
"The impression that it makes is that they're offering potential future justifications for actions that aren't necessarily consistent with what their safety standards have been in the past. And I would say that overall, they haven't made the case that new standards are better than earlier standards," Trager said.
Chaudhry said that OpenAI's approach to safety amounted to "reckless experimenting on the public"—something that would not be allowed in any other industry. He also said this was "part and parcel of a broader push from OpenAI to minimize real government oversight over advanced high-stakes AI systems."
The post has been criticized by other prominent figures in the industry. Gary Marcus, professor emeritus of psychology and neural science at New York University, told Fortune the blog felt like "marketing" rather than an attempt to explain any new safety approaches.
"It's a way to hype AGI," he said. "And it's an excuse to dump stuff in the real world rather than properly sandboxing it before releasing and making sure it is actually ok. The blog is certainly not an actual solution to the many challenges of AI safety."
Over the past year, OpenAI has faced criticism from some AI experts for prioritizing product development over safety.
Several former OpenAI employees have quit over internal AI safety disputes, including prominent AI researcher Jan Leike.
Leike left the company last year at the same time as OpenAI co-founder Ilya Sutskever. He openly blamed a lack of safety prioritization at the company for his departure, claiming that over the past few years, "safety culture and processes have taken a backseat to shiny products." At the time, Leike and Sutskever were co-leading the company's Superalignment team, which focused on the long-term risks of superpowerful artificial intelligence that would be more capable than all of humanity. After the pair parted ways with the company, the team was dissolved.
Internally, employees said that OpenAI had failed to give safety teams the compute it had promised. In May last year a half-dozen sources familiar with the functioning of the Superalignment team told Fortune that OpenAI never fulfilled an earlier commitment to provide the safety team with 20% of its computing power.
The internal disagreements over AI safety have also resulted in an exodus of safety-focused employees. Daniel Kokotajlo, a former OpenAI governance researcher, told Fortune in August that nearly half of the company's staff that once focused on the long-term risks of superpowerful AI had left the company.
Marcus said that OpenAI had failed to live up to its purported principles and "instead they have repeatedly prioritized profit over safety (which is presumably part of why so many safety-conscious employees left)."
"For years, OpenAI has been pursuing a "black box" technology that probably can't ever be properly aligned, and done little to seriously consider alternative, more transparent technologies that might be less short-term profitable but safer for humanity in the long run," he said.
Representatives for OpenAI did not respond to Fortune's request for comment. Brundage declined to provide further comments.
This story was originally featured on Fortune.com
