
OpenAI says it could 'adjust' AI model safeguards if a competitor releases high-risk AI
OpenAI said that it will consider adjusting its safety requirements if a competing company releases a high-risk artificial intelligence model without protections.
OpenAI wrote in its Preparedness Framework report that if another company releases a model that poses a threat, it could do the same after 'rigorously' confirming that the 'risk landscape' has changed.
The document explains how the company tracks, evaluates, forecasts and protects against catastrophic risks posed by AI models.
'If another frontier AI developer releases a high-risk system without comparable safeguards, we may adjust our requirements,' OpenAI wrote in a blog post published on Tuesday.
'However, we would first rigorously confirm that the risk landscape has actually changed, publicly acknowledge that we are making an adjustment, assess that the adjustment does not meaningfully increase the overall risk of severe harm, and still keep safeguards at a level more protective'.
Before releasing a model to the general public, OpenAI evaluates whether it could cause severe harm by identifying plausible, measurable, new, severe and irremediable risks, and building safeguards against them. It then classifies these risks as either low, medium, high or critical.
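That tiering can be pictured with a minimal, hypothetical sketch: only the four tier names (low, medium, high, critical) and the tracked capability areas come from the article; the gating rule, the function and its threshold are illustrative assumptions, not OpenAI's actual process or code.

```python
# Hypothetical sketch of a four-tier risk gate. Only the tier names and the
# tracked capability areas come from the article above; the gating rule is an
# assumption made for illustration.
from enum import IntEnum

class RiskTier(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4

def release_allowed(tracked_risks: dict[str, RiskTier]) -> bool:
    """Assumed rule: a model ships only if no tracked capability
    (e.g. biology, chemistry, cybersecurity, self-improvement)
    remains at high or critical risk after safeguards are applied."""
    return all(tier < RiskTier.HIGH for tier in tracked_risks.values())

# Example: a residual 'high' cybersecurity risk would block the release.
print(release_allowed({"biology": RiskTier.MEDIUM, "cybersecurity": RiskTier.HIGH}))  # False
```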
Some of the risks the company already tracks are its models' capabilities in the fields of biology, chemistry, cybersecurity and self-improvement.
The company said it is also evaluating new risks, such as whether its AI models can operate for long periods without human involvement, whether they can self-replicate, and what threat they could pose in the nuclear and radiological fields.
'Persuasion risks,' such as how ChatGPT is used for political campaigning or lobbying, will be handled outside of the framework and will instead be looked at through the Model Spec, the document that determines ChatGPT's behaviour.
'Quietly reducing safety commitments'
Steven Adler, a former OpenAI researcher, said on X that the updates to the company's preparedness report show that it is 'quietly reducing its safety commitments'.
In his post, he pointed to a December 2023 commitment by the company to test 'fine-tuned versions' of its AI models, but noted that OpenAI will now shift to testing only models whose trained parameters, or 'weights', will be released.
'People can totally disagree about whether testing finetuned models is needed, and better for OpenAI to remove a commitment than to keep it and just not follow,' he said.
'But in either case, I'd like OpenAI to be clearer about having backed off this previous commitment'.
The news comes after OpenAI this week released a new family of AI models, called GPT-4.1, reportedly without a system card or safety report. Euronews Next has asked OpenAI about the safety report but did not receive a reply at the time of publication.
It also comes after 12 former OpenAI employees filed a brief last week in Elon Musk's case against OpenAI, which alleges that a shift to a for-profit company could lead to corners being cut on safety.
Related Articles


Euronews - a day ago
Why are social media sites betting on crowdsourced fact-checking?
TikTok is the latest social media platform to launch a crowdsourced fact-checking feature. The short-form video app is rolling out the feature, called Footnotes, first in the United States. It lets users write a note with more context on a video and vote on whether other notes should appear under a video.

A footnote could share a researcher's view on a 'complex STEM-related topic' or highlight new statistics to give a fuller picture of an ongoing event, the company said. The new feature is similar to other community-based fact-checking features on social media platforms such as X and Meta's Facebook or Instagram. But why are social media giants moving towards this new system to fact-check online claims?

What is community fact-checking?

Scott Hale, an associate professor at the Oxford Internet Institute, said that Twitter, now X, started the charge towards community notes in 2021 with a feature called Birdwatch. The experiment carried on after Elon Musk took control of the company in 2022.

Otavio Vinhas, a researcher at the National Institute of Science and Technology in Informational Disputes and Sovereignties in Brazil, said that Meta's introduction of a community notes programme earlier this year is in line with a trend, led by US President Donald Trump, towards a more libertarian view of free speech on social media. 'The demand is that platforms should commit to this [libertarian view],' Vinhas told Euronews Next. 'For them, fair moderation would be moderation that prioritises free speech without much concern to the potential harm or the potential false claims it can push up'.

Hale told Euronews Next there is some scientific evidence behind crowdsourcing, with studies showing that crowds often arrived at the right verdict when evaluating whether information had been fact-checked well, and often agreed with professionals, he said.

But TikTok's Footnotes is slightly different from other crowdsourcing initiatives on Meta or X, Vinhas said. That is because the programme still asks users to add the source of information for their note, which Vinhas says is not mandatory on X.

Most notes don't end up on the platforms

The challenge for all social media companies lies in getting the right people to see the notes, Hale said. All three community programmes use a bridge-based ranking system that estimates how similar one user is to another based on the content they consume, either the accounts they follow or the videos they watch, Hale said. The algorithm shows a note to users who are considered 'dissimilar' to each other to see if they both find it helpful; notes that pass the test then become visible on the platform.

What tends to happen, though, is that the vast majority of notes written on a platform are never seen, Vinhas said. A June study from the Digital Democracy Institute of the Americas (DDIA) of English and Spanish community notes on X found that over 90 per cent of the 1.7 million notes available in a public database never made it online. Notes that did make it to the platform took an average of 14 days to be published, down from 100 days in 2022, though there are still delays in how quickly X responds to these notes, the DDIA report continued.

'I don't think these platforms can achieve the promise of bringing consensus and make the internet this marketplace of ideas in which the best information and the best ideas end up winning the argument,' Vinhas said.
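The bridge-based ranking Hale describes can be pictured with a minimal sketch, assuming each rater is represented by a vector of the content they engage with; the vectors, the similarity threshold and the function names are illustrative assumptions, not the actual code used by X, Meta or TikTok.

```python
# Minimal sketch of the 'bridge-based' idea described above -- an illustration,
# not any platform's real implementation. A note is shown only when raters who
# are dissimilar to one another both rate it helpful.
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two rater profile vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))

def note_passes(ratings: list[tuple[np.ndarray, bool]], max_similarity: float = 0.3) -> bool:
    """ratings: (rater_profile_vector, found_helpful) pairs.
    The note passes if at least one pair of dissimilar raters both found it helpful."""
    helpful = [vec for vec, ok in ratings if ok]
    for i in range(len(helpful)):
        for j in range(i + 1, len(helpful)):
            if cosine(helpful[i], helpful[j]) < max_similarity:
                return True  # agreement across the 'bridge'
    return False

# Toy example: two raters with opposite viewing habits both found the note helpful.
a, b = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 1.0])
print(note_passes([(a, True), (b, True)]))  # True
```

In this toy version, a single pair of dissimilar raters agreeing is enough to publish the note, which mirrors the underlying idea that a note must be found helpful across viewpoints rather than by one like-minded group.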
Hale said it can be difficult for users to come across notes that might contradict their point of view because of 'echo chambers' on social media, where users are shown content that reinforces the beliefs they already have. 'It's very easy to get ourselves into parts of networks that are similar to us,' he said.

One way to improve the efficiency of community notes would be to gamify them, Hale continued. He suggested the platforms could follow Wikipedia's example, where contributing users have their own page listing their edits. Wikipedia also offers a host of service awards to editors based on the value of their contributions and the length of their service, and lets them take part in contests and fundraisers.

What else do social media sites do to moderate content on their platforms?

Community fact-checking is not the only method that social media companies use to limit the spread of mis- and disinformation on their platforms, Hale and Vinhas said. Meta, X, and TikTok all use some degree of automated moderation to detect potentially harmful or violent content.

Meta said it relies on artificial intelligence (AI) systems to scan content proactively and immediately remove it if it matches known violations of its community standards or code of conduct. When content is flagged, human moderators review individual posts to see if they actually breach the code or if some context is missing.

Hale said it can be difficult for automated systems to flag new problematic content because they recognise only the repeated claims of misinformation they have been trained on, meaning new lies can slip through the cracks. Users themselves can also report content that may violate community standards, Hale said.

However, Meta has said that community notes would replace its relationships with conventional fact-checkers, who flagged and labelled misinformation for almost a decade in the United States. So far, there are no signs that the platform will end these partnerships in the United Kingdom and the European Union, media reports suggest.

Hale and Vinhas said professional fact-checking and community notes can complement one another if done properly. In that case, platforms would have an engaged community of people adding context in notes as well as the rigour of professional fact-checkers, who can take additional steps such as calling experts or going straight to a source to verify whether something is true, Hale added.

Professional fact-checkers also often have a feel for the political, social, and economic pulse of the countries where disinformation campaigns may be playing out, Vinhas said. 'Fact-checkers will be actively monitoring [a] political crisis on a 24-7 basis almost, while users may not be as much committed to information integrity,' he said.

For now, Vinhas said TikTok's model is encouraging because it is being used to contribute to a 'global fact-checking programme,' but he said there is no indication whether this will continue to be the case.

Le Monde - 3 days ago
ChatGPT: OpenAI launches GPT-5, the latest version of its language processing tool
"GPT-3 sort of felt like talking to a high school student. (...) GPT-4 felt like you're talking to a college student. GPT-5 is the first time that it really feels like talking to a PhD-level expert." This is how Sam Altman, the CEO of OpenAI, introduced the latest version of the language processing software that powers the ChatGPT chatbot. Launched on Thursday, August 7, GPT-5 was made available to all of the chatbot's users. It was, unsurprisingly, designed to be faster and more accurate, even for the most complex questions. "I tried going back to GPT-4 and it was quite miserable," Altman said, in a press conference he held the day before. GPT-5 is clearly superior, he said, adding: "It reminds me of when the iPhone went from those giant-pixel old ones to the retina display, and then I went back to using one of those big pixelated things, and I was like, 'Wow, I can't believe how bad we had it.'" This version, the California-based company stated, will make fewer mistakes and generate fewer "hallucinations." GPT-4 would often fabricate responses when it lacked sufficient information, but, for GPT-5, they had "trained the model to be honest," said Alex Beutel, safety research lead, in the presentation. OpenAI claimed the model was now able to inform users about its limitations. They also said it would be better at producing computer code: OpenAI demonstrated how, using a prompt describing a planned website, which featured a quiz and a mini-game, the software generated 600 lines of code in a matter of seconds, and allowed users to view and test the result.


France 24 - 3 days ago
OpenAI releases ChatGPT-5 as AI race accelerates
ChatGPT-5 is rolling out free to all users of the AI tool, which is used by nearly 700 million people weekly, OpenAI said in a briefing with journalists. Co-founder and chief executive Sam Altman touted this latest iteration as "clearly a model that is generally intelligent." "It is a significant step toward models that are really capable," he said.

Altman cautioned that there is still work to be done to achieve the kind of artificial general intelligence (AGI) that thinks the way people do. "This is not a model that continuously learns as it is deployed from new things it finds, which is something that, to me, feels like it should be part of an AGI," Altman said. "But the level of capability here is a huge improvement."

GPT-5 is particularly adept when it comes to AI acting as an "agent" independently tending to computer tasks, according to Michelle Pokrass of the development team. "GPT-3 felt to me like talking to a high school student -- ask a question, maybe you get a right answer, maybe you'll get something crazy," Altman said. "GPT-4 felt like you're talking to a college student; GPT-5 is the first time that it really feels like talking to a PhD-level expert in any topic."

Vibe coding

Altman said he expects the ability to create software programs on demand -- so-called "vibe coding" -- to be a "defining part of the new ChatGPT-5 era." As an example, OpenAI executives demonstrated the bot being asked to create an app for learning the French language.

With fierce competition around the world over the technology, Altman said ChatGPT-5 led the pack in coding, writing, health care and much more. Rivals including Google and Microsoft have been pumping billions of dollars into developing AI systems. Altman said there were "orders of magnitude more gains" to come on the path toward AGI. "We have to invest in compute (power) at an eye-watering rate to get that, but we intend to keep doing it."

ChatGPT-5 was also trained to be trustworthy and to stick to providing answers that are as helpful as possible without aiding a seemingly harmful mission, according to OpenAI safety research lead Alex Beutel. "We built evaluations to measure the prevalence of deception and trained the model to be honest," Beutel said. ChatGPT-5 is trained to generate "safe completions," sticking to high-level information that can't be used to cause harm, according to Beutel.

The debut comes a day after OpenAI said it was allowing the US government to use a version of ChatGPT designed for businesses for a year for just $1. Federal workers in the executive branch will have access to ChatGPT Enterprise essentially for free in a partnership with the US General Services Administration, according to the artificial intelligence sector star.

The company this week also released two new AI models that can be downloaded for free and altered by users, to challenge similar offerings from US and Chinese competitors. The release of the gpt-oss-120b and gpt-oss-20b "open-weight language models" comes as the ChatGPT maker is under pressure to share the inner workings of its software, in the spirit of its origin as a nonprofit.
ChatGPT-5 is rolling out free to all users of the AI tool, which is used by nearly 700 million people weekly, OpenAI said in a briefing with journalists. Co-founder and chief executive Sam Altman touted this latest iteration as "clearly a model that is generally intelligent." "It is a significant step toward models that are really capable," he said. Altman cautioned that there is still work to be done to achieve the kind of artificial general intelligence (AGI) that thinks the way people do. "This is not a model that continuously learns as it is deployed from new things it finds, which is something that, to me, feels like it should be part of an AGI," Altman said. "But the level of capability here is a huge improvement." GPT-5 is particularly adept when it comes to AI acting as an "agent" independently tending to computer tasks, according to Michelle Pokrass of the development team. "GPT-3 felt to me like talking to a high school student -- ask a question, maybe you get a right answer, maybe you'll get something crazy," Altman said. "GPT-4 felt like you're talking to a college student; GPT five is the first time that it really feels like talking to a PhD-level expert in any topic." Vibe coding Altman said he expects the ability to create software programs on demand -- so-called "vibe-coding" -- to be a "defining part of the new ChatGPT-5 era." As an example, OpenAI executives demonstrated the bot being asked to create an app for learning the French language. With fierce competition around the world over the technology, Altman said ChatGPT-5 led the pack in coding, writing, health care and much more. Rivals including Google and Microsoft have been pumping billions of dollars into developing AI systems. Altman said there were "orders of magnitude more gains" to come on the path toward AGI. " have to invest in compute (power) at an eye watering rate to get that, but we intend to keep doing it." ChatGPT-5 was also trained to be trustworthy and stick to providing answers as helpful as possible without aiding a seemingly harmful mission, according to OpenAI safety research lead Alex Beutel. "We built evaluations to measure the prevalence of deception and trained the model to be honest," Beutel said. ChatGPT-5 is trained to generate "safe completions," sticking to high-level information that can't be used to cause harm, according to Beutel. The debut comes a day after OpenAI said it was allowing the US government to use a version of ChatGPT designed for businesses for a year for just $1. Federal workers in the executive branch will have access to ChatGPT Enterprise essentially free in a partnership with the US General Services Administration, according to the artificial intelligence sector star. The company this week also released two new AI models that can be downloaded for free and altered by users, to challenge similar offerings by US and Chinese competition. The release of gpt-oss-120b and gpt-oss-20b "open-weight language models" comes as the ChatGPT-maker is under pressure to share inner workings of its software in the spirit of its origin as a nonprofit.