
Latest news with #SaferAI

Top AI Firms Fall Short on Safety, New Studies Find

Time Magazine

7 days ago


The world's leading AI companies have 'unacceptable' levels of risk management, and a 'striking lack of commitment to many areas of safety,' according to two new studies published Thursday.

The risks of even today's AI—by the admission of many top companies themselves—could include AI helping bad actors carry out cyberattacks or create bioweapons. Future AI models, top scientists worry, could escape human control altogether.

The studies were carried out by the nonprofits SaferAI and the Future of Life Institute (FLI). Each was the second of its kind, in what the groups hope will be a running series that incentivizes top AI companies to improve their practices. 'We want to make it really easy for people to see who is not just talking the talk, but who is also walking the walk,' says Max Tegmark, president of the FLI.

Read More: Some Top AI Labs Have 'Very Weak' Risk Management, Study Finds

SaferAI assessed top AI companies' risk management protocols (also known as responsible scaling policies) to score each company on its approach to identifying and mitigating AI risks. No AI company scored better than 'weak' in SaferAI's assessment of risk management maturity. The highest scorer was Anthropic (35%), followed by OpenAI (33%), Meta (22%), and Google DeepMind (20%). Elon Musk's xAI scored 18%.

Two companies, Anthropic and Google DeepMind, received lower scores than the first time the study was carried out, in October 2024. As a result, OpenAI has overtaken Google for second place in SaferAI's ratings.

Siméon Campos, founder of SaferAI, said Google scored comparatively low despite doing some good safety research because the company makes few solid commitments in its policies. The company also released a frontier model earlier this year, Gemini 2.5, without sharing safety information—in what Campos called an 'egregious failure.' A spokesperson for Google DeepMind told TIME: 'We are committed to developing AI safely and securely to benefit society. AI safety measures encompass a wide spectrum of potential mitigations. These recent reports don't take into account all of Google DeepMind's AI safety efforts, nor all of the industry benchmarks. Our comprehensive approach to AI safety and security extends well beyond what's captured.'

Anthropic's score also declined since SaferAI's last survey in October. This was due in part to changes the company made to its responsible scaling policy days before the release of the Claude 4 models, in which Anthropic removed its commitment to tackle insider threats by the time it released models of that caliber. 'That's very bad process,' Campos says. Anthropic did not immediately respond to a request for comment.

The study's authors also said that the methodology had become more detailed since last October, which accounts for some of the differences in scoring. The companies that improved their scores the most were xAI, which scored 18% compared to 0% in October, and Meta, which scored 22% compared to its previous score of 14%.

The FLI's study was broader—looking not only at risk management practices, but also at companies' approaches to current harms, existential safety, governance, and information sharing. A panel of six independent experts scored each company based on a review of publicly available material such as policies, research papers, and news reports, together with additional nonpublic data that companies were given the opportunity to provide. Anthropic received the highest grade (a C plus), followed by OpenAI (a C) and Google (a C minus). xAI and Meta both scored a D.

However, in FLI's scores for each company's approach to 'existential safety,' every company scored D or below. 'They're all saying: we want to build superintelligent machines that can outsmart humans in every which way, and nonetheless, they don't have a plan for how they're going to control this stuff,' Tegmark says.

xAI blames Grok's obsession with white genocide on an 'unauthorized modification'

Yahoo

16-05-2025


xAI blamed an "unauthorized modification" for a bug in its AI-powered Grok chatbot that caused Grok to repeatedly refer to "white genocide in South Africa" when invoked in certain contexts on X.

On Wednesday, Grok began replying to dozens of posts on X with information about white genocide in South Africa, even in response to unrelated subjects. The strange replies stemmed from the X account for Grok, which responds to users with AI-generated posts whenever a person tags "@grok."

According to a post Thursday from xAI's official X account, a change was made Wednesday morning to the Grok bot's system prompt — the high-level instructions that guide the bot's behavior — that directed Grok to provide a "specific response" on a "political topic." xAI says that the tweak "violated [its] internal policies and core values," and that the company has "conducted a thorough investigation."

It's the second time xAI has publicly acknowledged that an unauthorized change to Grok's code caused the AI to respond in controversial ways. In February, Grok briefly censored unflattering mentions of Donald Trump and Elon Musk, the billionaire founder of xAI and owner of X. Igor Babuschkin, an xAI engineering lead, said that Grok had been instructed by a rogue employee to ignore sources that mentioned Musk or Trump spreading misinformation, and that xAI reverted the change as soon as users began pointing it out.

xAI said on Thursday that it's going to make several changes to prevent similar incidents from occurring in the future. Beginning today, xAI will publish Grok's system prompts on GitHub, along with a changelog. The company says it'll also "put in place additional checks and measures" to ensure that xAI employees can't modify the system prompt without review, and establish a "24/7 monitoring team to respond to incidents with Grok's answers that are not caught by automated systems."

Despite Musk's frequent warnings about the dangers of unchecked AI, xAI has a poor AI safety track record. A recent report found that Grok would undress photos of women when asked. The chatbot can also be considerably more crass than chatbots like Google's Gemini and ChatGPT, cursing without much restraint to speak of. A study by SaferAI, a nonprofit aiming to improve the accountability of AI labs, found that xAI ranks poorly on safety among its peers, owing to its "very weak" risk management practices. Earlier this month, xAI missed a self-imposed deadline to publish a finalized AI safety framework.

This article originally appeared on TechCrunch.

xAI's promised safety report is MIA

Yahoo

14-05-2025


Elon Musk's AI company, xAI, has missed a self-imposed deadline to publish a finalized AI safety framework, as noted by watchdog group The Midas Project.

xAI isn't exactly known for its strong commitments to AI safety as it's commonly understood. A recent report found that the company's AI chatbot, Grok, would undress photos of women when asked. Grok can also be considerably more crass than chatbots like Gemini and ChatGPT, cursing without much restraint to speak of.

Nonetheless, in February at the AI Seoul Summit, a global gathering of AI leaders and stakeholders, xAI published a draft framework outlining the company's approach to AI safety. The eight-page document laid out xAI's safety priorities and philosophy, including the company's benchmarking protocols and AI model deployment considerations.

As The Midas Project noted in a blog post on Tuesday, however, the draft only applied to unspecified future AI models "not currently in development." Moreover, it failed to articulate how xAI would identify and implement risk mitigations, a core component of a document the company signed at the AI Seoul Summit.

In the draft, xAI said that it planned to release a revised version of its safety policy "within three months" — by May 10. The deadline came and went without acknowledgement on xAI's official channels.

Despite Musk's frequent warnings about the dangers of unchecked AI, xAI has a poor AI safety track record. A recent study by SaferAI, a nonprofit aiming to improve the accountability of AI labs, found that xAI ranks poorly on safety among its peers, owing to its "very weak" risk management practices.

That's not to suggest other AI labs are faring dramatically better. In recent months, xAI rivals including Google and OpenAI have rushed safety testing and have been slow to publish model safety reports (or have skipped publishing reports altogether). Some experts have expressed concern that the seeming deprioritization of safety efforts is coming at a time when AI is more capable — and thus potentially dangerous — than ever.

This article originally appeared on TechCrunch.
