One of Google's recent Gemini AI models scores worse on safety

09-05-2025

A recently released Google AI model scores worse on certain safety tests than its predecessor, according to the company's internal benchmarking.
In a technical report published this week, Google reveals that its Gemini 2.5 Flash model is more likely to generate text that violates its safety guidelines than Gemini 2.0 Flash. On two metrics, "text-to-text safety" and "image-to-text safety," Gemini 2.5 Flash regresses 4.1% and 9.6%, respectively.
Text-to-text safety measures how frequently a model violates Google's guidelines given a prompt, while image-to-text safety evaluates how closely the model adheres to these boundaries when prompted using an image. Both tests are automated, not human-supervised.
In an emailed statement, a Google spokesperson confirmed that Gemini 2.5 Flash "performs worse on text-to-text and image-to-text safety."
These surprising benchmark results come as AI companies move to make their models more permissive — in other words, less likely to refuse to respond to controversial or sensitive subjects. For its latest crop of Llama models, Meta said it tuned the models not to endorse "some views over others" and to reply to more "debated" political prompts. OpenAI said earlier this year that it would tweak future models to not take an editorial stance and offer multiple perspectives on controversial topics.
Sometimes, those permissiveness efforts have backfired. TechCrunch reported Monday that the default model powering OpenAI's ChatGPT allowed minors to generate erotic conversations. OpenAI blamed the behavior on a "bug."
According to Google's technical report, Gemini 2.5 Flash, which is still in preview, follows instructions more faithfully than Gemini 2.0 Flash, including instructions that cross problematic lines. The company claims that the regressions can be attributed partly to false positives, but it also admits that Gemini 2.5 Flash sometimes generates "violative content" when explicitly asked.
"Naturally, there is tension between [instruction following] on sensitive topics and safety policy violations, which is reflected across our evaluations," reads the report.
Scores from SpeechMap, a benchmark that probes how models respond to sensitive and controversial prompts, also suggest that Gemini 2.5 Flash is far less likely to refuse to answer contentious questions than Gemini 2.0 Flash. TechCrunch's testing of the model via AI platform OpenRouter found that it'll uncomplainingly write essays in support of replacing human judges with AI, weakening due process protections in the U.S., and implementing widespread warrantless government surveillance programs.
Thomas Woodside, co-founder of the Secure AI Project, said the limited details Google gave in its technical report demonstrate the need for more transparency in model testing.
"There's a trade-off between instruction-following and policy following, because some users may ask for content that would violate policies," Woodside told TechCrunch. "In this case, Google's latest Flash model complies with instructions more while also violating policies more. Google doesn't provide much detail on the specific cases where policies were violated, although they say they are not severe. Without knowing more, it's hard for independent analysts to know whether there's a problem."
Google has come under fire for its model safety reporting practices before.
It took the company weeks to publish a technical report for its most capable model, Gemini 2.5 Pro. When the report was eventually published, it initially omitted key safety testing details.
On Monday, Google released a more detailed report with additional safety information.

AI and Virtual Coaching Are Transforming Reemployment - Big Interview Shows What's Possible

Miami Herald

NEW YORK CITY, NY / ACCESS Newswire / August 21, 2025 / As the White House releases its America's Talent Strategy: Building the Workforce for the Golden Age, state agencies are facing an urgent question: how to prepare workers for a job market being reshaped at unprecedented speed by AI and automation.
The challenge is clear. By 2030, 14% of workers will need to change jobs due to AI and automation, and 80% of the workforce will see disruption to their daily tasks from large language models like ChatGPT (McKinsey). Reemployment programs, built for a pre-AI economy, are struggling to keep pace.
The paradox is that the same technologies fueling disruption may also be the key to recovery. Across the country, workforce agencies are piloting AI-powered assessments, virtual reality training, and AI-driven interview coaching to move people back into work faster and with better job matches.
"Over the last decade, I've watched hiring trends shift, but nothing compares to the pace of change we're seeing now," said Pamela Skillings, co-founder and chief coach at Big Interview. "AI isn't just changing what jobs exist, it's changing how quickly people need to adapt. Agencies that integrate these tools now will be the ones that keep their workers competitive."
AI-enabled resume and interview tools, like those built into Big Interview, are helping states like Idaho provide personalized interview feedback to thousands of unemployed residents annually, without increasing staff workload.
These same approaches are in line with the White House's America's Talent Strategy Pillar IV and V priorities:
Accountability: Linking investments to clear employment outcomes, shortening unemployment, and redirecting resources to programs that prove
Innovation: Using AI-era tools to pilot new models, adapt training to labor market needs, and scale access to rural and underserved communities.
For state agencies, the impact is measurable. Big Interview's AI-driven mock interview simulations and resume optimization tools have already supported over two million job seekers worldwide, helping them land jobs up to 5x faster than the national average.
"When you can serve thousands of people with personalized interview feedback, without adding staff, you're not just innovating, you're solving real problems," said Steve Ruder, Vice President at Big Interview. "We've seen measurable results in multiple states, and the technology is proven to scale."
As agencies evaluate their next steps under the America's Talent Strategy report, one priority stands out: evaluate and pilot AI-driven tools that can scale impact, measure results, and adapt quickly to changing labor market needs.
Contact Information
Steve Ruder
Vice President
steve@

Made By Google 2025 Announces Pixel 10, Pixel 10 Pro Fold & More

Black America Web

Google's annual Made By Google event kicked off in Brooklyn, N.Y., on Wednesday (August 20), and there are plenty of exciting developments on deck as 2025 rolls on. This year's Made By Google naturally put a spotlight on new smartphone devices and other technological advancements.
The event opened with a rolling montage of celebrity figures such as Alex Cooper, The Jonas Brothers, Giannis Antetokounmpo, Lando Norris, A'ja Wilson, and Stephen Curry, among others. Jimmy Fallon served as the MC for the event, cheekily remarking that he's going to keep a hold on Google's upcoming flagship device, which we'll discuss in a moment.
At the top of the event was the announcement of the Pixel 10, which Fallon used as part of the montage bit. This year's Pixel improves on its impressive build with updated AI-powered features, improved power under the hood, and, for the first time, a third camera that boasts a telephoto lens for long-distance snaps. The Pixel 10, like its predecessor, the Pixel 9, will come in a standard version, a Pro version, and the larger Pro XL. The Pixel 10 starts at $799, the Pro at $999, and the Pro XL at $1199.
Also in the new device lineup is the Pixel 10 Pro Fold, improving upon the design of the Pixel 9 Pro Fold with an improved hinge system, brighter display, and all the innovations one can expect from the Pixel lineup. The device's starting price is $1799.
Smartwatches are still very much a top item in the Google ecosystem, and the Pixel Watch 4 is a leap for the lineup. With a redesign inside and out, the Watch 4 will also debut its Satellite Communications feature for emergency contact needs. The 41 mm Watch 4 is priced at $349 for Wi-Fi and $449 for LTE. For the 45 mm version, the prices are $399 for Wi-Fi and $499 for LTE.
For shoppers on a budget, Google also announced its Pixel Buds 2a, with seven hours of battery life and 24 hours with the charging case. The buds also come equipped with Gemini Live for AI-assisted needs. The price for the Pixel Buds 2a starts at $129.
Speaking of Curry, he was recently brought on as Google's 'Performance Adviser,' adding his sports and tech know-how to what the brand hopes to achieve with its new products, including the Gemini AI platform.

Google's AI Mode expands globally, adds new agentic features

TechCrunch

Google is launching a global expansion of AI Mode, its experimental feature that allows users to ask complex questions and follow-ups to dig deeper on a topic directly within Search, the company announced on Thursday. The tech giant is also bringing new agentic and personalized capabilities to the feature.
As part of the expansion, Google is bringing AI Mode to 180 new countries in English. Up until now, it's only been available to users in the U.S., U.K., and India. Google plans to bring the feature to more languages and regions soon.
In terms of the new agentic features, users can now use AI Mode to find restaurant reservations, and in the future, they'll be able to find local service appointments and event tickets.
Users can request dinner reservations based on multiple preferences, such as party size, date, time, location, and preferred cuisine. AI Mode will then search across different reservation platforms to find real-time availability for restaurants that match the inquiry. It then surfaces a curated list of options to choose from.
This new capability is rolling out for Google AI Ultra subscribers in the U.S. through the 'Agentic capabilities in AI Mode' experiment in Labs, Google's experimental arm. (Ultra is Google's highest-end plan, at $249.99 per month.)
Google says that U.S. users in the AI Mode experiment will also now see search results tailored to their individual preferences and interests. The tech giant is starting with dining-related topics for this capability.
For example, if someone searches, 'I only have an hour, need a quick lunch spot, any suggestions?' AI Mode will use their past conversations, along with places they've searched for or clicked on in Search and Maps, to offer more relevant suggestions. So, if AI Mode infers that you like Italian food and places with outdoor seating, you'll get results suggesting options with these preferences.
Google notes that users can adjust their personalization settings in their Google Account.
In addition, AI Mode now lets users share and collaborate with others. A new 'Share' button lets users send an AI Mode response to others, allowing them to jump into the conversation. Google says this could be helpful in cases where you need to collaborate with someone else, such as planning a trip or a birthday party.
