Latest news with #PreparednessFramework

OpenAI says it could 'adjust' AI model safeguards if a competitor makes their AI high-risk

Euronews

18-04-2025

  • Business
  • Euronews

OpenAI said that it will consider adjusting its safety requirements if a competing company releases a high-risk artificial intelligence model without protections. OpenAI wrote in its Preparedness Framework report that if another company releases a model that poses a threat, it could adjust its own safeguards after 'rigorously' confirming that the 'risk landscape' has changed. The document explains how the company tracks, evaluates, forecasts and protects against catastrophic risks posed by AI models.

'If another frontier AI developer releases a high-risk system without comparable safeguards, we may adjust our requirements,' OpenAI wrote in a blog post published on Tuesday. 'However, we would first rigorously confirm that the risk landscape has actually changed, publicly acknowledge that we are making an adjustment, assess that the adjustment does not meaningfully increase the overall risk of severe harm, and still keep safeguards at a level more protective'.

Before releasing a model to the general public, OpenAI evaluates whether it could cause severe harm by identifying plausible, measurable, new, severe and irremediable risks, and building safeguards against them. It then classifies these risks as low, medium, high or critical.

Some of the risks the company already tracks are its models' capabilities in the fields of biology, chemistry and cybersecurity, as well as their capacity for self-improvement. The company said it is also evaluating new risks, such as whether its AI models can operate for long periods without human involvement, whether they can self-replicate, and what threats they could pose in the nuclear and radiological fields.

'Persuasion risks', such as how ChatGPT is used for political campaigning or lobbying, will be handled outside of the framework and will instead be addressed through the Model Spec, the document that determines ChatGPT's behaviour.

'Quietly reducing safety commitments'

Steven Adler, a former OpenAI researcher, said on X that the updates to the company's preparedness report show that it is 'quietly reducing its safety commitments'. In his post, he pointed to a December 2023 commitment by the company to test 'fine-tuned versions' of its AI models, but noted that OpenAI will now shift to testing only models whose trained parameters, or 'weights', will be released.

'People can totally disagree about whether testing finetuned models is needed, and better for OpenAI to remove a commitment than to keep it and just not follow,' he said. 'But in either case, I'd like OpenAI to be clearer about having backed off this previous commitment'.

The news comes after OpenAI released a new family of AI models, called GPT-4.1, this week, reportedly without a system card or safety report. Euronews Next asked OpenAI about the safety report but had not received a reply at the time of publication.

It also comes after 12 former OpenAI employees filed a brief last week in Elon Musk's case against OpenAI, which alleges that a shift to a for-profit company could lead to corners being cut on safety.

OpenAI updated its safety framework—but no longer sees mass manipulation and disinformation as a critical risk

Yahoo

16-04-2025

  • Business
  • Yahoo

OpenAI said it will stop assessing its AI models prior to release for the risk that they could persuade or manipulate people, possibly helping to swing elections or create highly effective propaganda campaigns.

The company said it would now address those risks through its terms of service, restricting the use of its AI models in political campaigns and lobbying, and monitoring how people are using the models once they are released for signs of violations.

OpenAI also said it would consider releasing AI models that it judged to be 'high risk' as long as it has taken appropriate steps to reduce those dangers, and would even consider releasing a model that presented what it called 'critical risk' if a rival AI lab had already released a similar model. Previously, OpenAI had said it would not release any AI model that presented more than a 'medium risk.'

The changes in policy were laid out in an update to OpenAI's 'Preparedness Framework' yesterday. That framework details how the company monitors the AI models it is building for potentially catastrophic dangers, from the possibility that the models will help someone create a biological weapon, to their ability to assist hackers, to the possibility that the models will self-improve and escape human control.

The policy changes split AI safety and security experts. Several took to social media to commend OpenAI for voluntarily releasing the updated framework, noting improvements such as clearer risk categories and a stronger emphasis on emerging threats like autonomous replication and safeguard evasion. Others voiced concerns, however, including Steven Adler, a former OpenAI safety researcher, who criticized the fact that the updated framework no longer requires safety tests of fine-tuned models. 'OpenAI is quietly reducing its safety commitments,' he wrote on X. Still, he emphasized that he appreciated OpenAI's efforts: 'I'm overall happy to see the Preparedness Framework updated,' he said. 'This was likely a lot of work, and wasn't strictly required.'

Some critics highlighted the removal of persuasion from the dangers the Preparedness Framework addresses. 'OpenAI appears to be shifting its approach,' said Shyam Krishna, a research leader in AI policy and governance at RAND Europe. 'Instead of treating persuasion as a core risk category, it may now be addressed either as a higher-level societal and regulatory issue or integrated into OpenAI's existing guidelines on model development and usage restrictions.' It remains to be seen how this will play out in areas like politics, he added, where AI's persuasive capabilities are 'still a contested issue.'

Courtney Radsch, a senior fellow working on AI ethics at Brookings, the Center for International Governance Innovation, and the Center for Democracy and Technology, went further, calling the framework 'another example of the technology sector's hubris' in a message to Fortune. She emphasized that the decision to downgrade persuasion 'ignores context – for example, persuasion may be existentially dangerous to individuals such as children or those with low AI literacy or in authoritarian states and societies.'

Oren Etzioni, former CEO of the Allen Institute for AI and founder of TrueMedia, which offers tools to fight AI-manipulated content, also expressed concern. 'Downgrading deception strikes me as a mistake given the increasing persuasive power of LLMs,' he said in an email. 'One has to wonder whether OpenAI is simply focused on chasing revenues with minimal regard for societal impact.'

However, one AI safety researcher not affiliated with OpenAI told Fortune that it seems reasonable to address any risks from disinformation or other malicious persuasion uses through OpenAI's terms of service. The researcher, who asked to remain anonymous because he is not permitted to speak publicly without authorization from his current employer, added that persuasion and manipulation risk is difficult to evaluate in pre-deployment testing. He also pointed out that this category of risk is more amorphous and ambiguous than other critical risks, such as the risk that AI will help someone carry out a chemical or biological weapons attack or assist in a cyberattack.

Notably, some Members of the European Parliament have also voiced concern that the latest draft of the proposed code of practice for complying with the EU AI Act downgraded mandatory testing of AI models for the possibility that they could spread disinformation and undermine democracy to a voluntary consideration.

Studies have found AI chatbots to be highly persuasive, although this capability itself is not necessarily dangerous. Researchers at Cornell University and MIT, for instance, found that dialogues with chatbots were effective at getting people to question conspiracy theories.

Another criticism of OpenAI's updated framework centered on a line in which OpenAI states: 'If another frontier AI developer releases a high-risk system without comparable safeguards, we may adjust our requirements.'

'They're basically signaling that none of what they say about AI safety is carved in stone,' said Gary Marcus, a longtime OpenAI critic, in a LinkedIn message, adding that the line forewarns a race to the bottom. 'What really governs their decisions is competitive pressure—not safety. Little by little, they've been eroding everything they once promised. And with their proposed new social media platform, they're signaling a shift toward becoming a for-profit surveillance company selling private data—rather than a nonprofit focused on benefiting humanity.'

Overall, it is useful that companies like OpenAI are sharing their thinking around their risk management practices openly, Miranda Bogen, director of the AI governance lab at the Center for Democracy & Technology, told Fortune in an email. That said, she added that she is concerned about moving the goalposts. 'It would be a troubling trend if, just as AI systems seem to be inching up on particular risks, those risks themselves get deprioritized within the guidelines companies are setting for themselves,' she said.

She also criticized the framework's focus on 'frontier' models when OpenAI and other companies have used technical definitions of that term as an excuse not to publish safety evaluations of recent, powerful models. (For example, OpenAI released its GPT-4.1 model yesterday without a safety report, saying that it was not a frontier model.) In other cases, companies have either failed to publish safety reports or been slow to do so, publishing them months after the model has been released.

'Between these sorts of issues and an emerging pattern among AI developers where new models are being launched well before or entirely without the documentation that companies themselves promised to release, it's clear that voluntary commitments only go so far,' she said.

OpenAI says it may 'adjust' its safety requirements if a rival lab releases 'high-risk' AI

Yahoo

15-04-2025

  • Business
  • Yahoo

In an update to its Preparedness Framework, the internal framework OpenAI uses to decide whether AI models are safe and what safeguards, if any, are needed during development and release, OpenAI said that it may "adjust" its requirements if a rival AI lab releases a "high-risk" system without comparable safeguards.

The change reflects the increasing competitive pressure on commercial AI developers to deploy models quickly. OpenAI has been accused of lowering safety standards in favor of faster releases, and of failing to deliver timely reports detailing its safety testing. Perhaps anticipating criticism, OpenAI claims that it wouldn't make these policy adjustments lightly, and that it would keep its safeguards at "a level more protective."

"If another frontier AI developer releases a high-risk system without comparable safeguards, we may adjust our requirements," wrote OpenAI in a blog post published Tuesday afternoon. "However, we would first rigorously confirm that the risk landscape has actually changed, publicly acknowledge that we are making an adjustment, assess that the adjustment does not meaningfully increase the overall risk of severe harm, and still keep safeguards at a level more protective."

The refreshed Preparedness Framework also makes clear that OpenAI is relying more heavily on automated evaluations to speed up product development. The company says that, while it hasn't abandoned human-led testing altogether, it has built "a growing suite of automated evaluations" that can "keep up with [a] faster [model release] cadence."

According to the Financial Times, OpenAI gave testers less than a week to run safety checks on an upcoming major model, a compressed timeline compared with previous releases. The publication's sources also alleged that many of OpenAI's safety tests are now conducted on earlier versions of models rather than on the versions released to the public.

Other changes to OpenAI's framework pertain to how the company categorizes models according to risk, including models that can conceal their capabilities, evade safeguards, prevent their own shutdown, and even self-replicate. OpenAI says that it will now focus on whether models meet one of two thresholds: "high" capability or "critical" capability. OpenAI defines the former as a model that could "amplify existing pathways to severe harm" and the latter as a model that could "introduce unprecedented new pathways to severe harm."

"Covered systems that reach high capability must have safeguards that sufficiently minimize the associated risk of severe harm before they are deployed," wrote OpenAI in its blog post. "Systems that reach critical capability also require safeguards that sufficiently minimize associated risks during development."

The changes are the first OpenAI has made to the Preparedness Framework since 2023. This article originally appeared on TechCrunch.
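As a rough illustration of the two-threshold gating described above, here is a minimal, hypothetical Python sketch. It is not OpenAI's code, tooling, or internal terminology: only the threshold names and the stages at which each is said to require safeguards are taken from the quoted blog-post language, and every identifier (the Capability enum, required_safeguard_stages) is invented for illustration.

```python
# Hypothetical sketch only -- not OpenAI's actual framework or tooling.
# It encodes the two capability thresholds quoted in the article ("high" and
# "critical") and the lifecycle stages at which each is said to require
# safeguards: before deployment for high, and additionally during development
# for critical.
from enum import Enum


class Capability(Enum):
    BELOW_THRESHOLD = 0  # no covered threshold reached
    HIGH = 1             # could "amplify existing pathways to severe harm"
    CRITICAL = 2         # could "introduce unprecedented new pathways to severe harm"


def required_safeguard_stages(capability: Capability) -> list[str]:
    """Return the lifecycle stages at which safeguards must be in place."""
    if capability is Capability.CRITICAL:
        # Critical-capability systems need safeguards during development
        # as well as before deployment.
        return ["development", "deployment"]
    if capability is Capability.HIGH:
        # High-capability systems need safeguards before they are deployed.
        return ["deployment"]
    return []


if __name__ == "__main__":
    for level in Capability:
        stages = required_safeguard_stages(level) or ["none required under the covered thresholds"]
        print(f"{level.name}: safeguards required at {', '.join(stages)}")
```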
