One of Google's recent Gemini AI models scores worse on safety

Yahoo, 09-05-2025

A recently released Google AI model scores worse on certain safety tests than its predecessor, according to the company's internal benchmarking.
In a technical report published this week, Google reveals that its Gemini 2.5 Flash model is more likely to generate text that violates its safety guidelines than Gemini 2.0 Flash. On two metrics, "text-to-text safety" and "image-to-text safety," Gemini 2.5 Flash regresses 4.1% and 9.6%, respectively.
Text-to-text safety measures how frequently a model violates Google's guidelines given a prompt, while image-to-text safety evaluates how closely the model adheres to these boundaries when prompted using an image. Both tests are automated, not human-supervised.
In an emailed statement, a Google spokesperson confirmed that Gemini 2.5 Flash "performs worse on text-to-text and image-to-text safety."
These surprising benchmark results come as AI companies move to make their models more permissive — in other words, less likely to refuse to respond to controversial or sensitive subjects. For its latest crop of Llama models, Meta said it tuned the models not to endorse "some views over others" and to reply to more "debated" political prompts. OpenAI said earlier this year that it would tweak future models to not take an editorial stance and offer multiple perspectives on controversial topics.
Sometimes, those permissiveness efforts have backfired. TechCrunch reported Monday that the default model powering OpenAI's ChatGPT allowed minors to generate erotic conversations. OpenAI blamed the behavior on a "bug."
According to Google's technical report, Gemini 2.5 Flash, which is still in preview, follows instructions more faithfully than Gemini 2.0 Flash, including instructions that cross problematic lines. The company claims that the regressions can be attributed partly to false positives, but it also admits that Gemini 2.5 Flash sometimes generates "violative content" when explicitly asked.
"Naturally, there is tension between [instruction following] on sensitive topics and safety policy violations, which is reflected across our evaluations," reads the report.
Scores from SpeechMap, a benchmark that probes how models respond to sensitive and controversial prompts, also suggest that Gemini 2.5 Flash is far less likely to refuse to answer contentious questions than Gemini 2.0 Flash. TechCrunch's testing of the model via AI platform OpenRouter found that it'll uncomplainingly write essays in support of replacing human judges with AI, weakening due process protections in the U.S., and implementing widespread warrantless government surveillance programs.
Thomas Woodside, co-founder of the Secure AI Project, said the limited details Google gave in its technical report demonstrate the need for more transparency in model testing.
"There's a trade-off between instruction-following and policy following, because some users may ask for content that would violate policies," Woodside told TechCrunch. "In this case, Google's latest Flash model complies with instructions more while also violating policies more. Google doesn't provide much detail on the specific cases where policies were violated, although they say they are not severe. Without knowing more, it's hard for independent analysts to know whether there's a problem."
Google has come under fire for its model safety reporting practices before.
It took the company weeks to publish a technical report for its most capable model, Gemini 2.5 Pro. When the report was eventually published, it initially omitted key safety testing details.
On Monday, Google released a more detailed report with additional safety information.


Related Articles

Boosting Shopify Speed & Performance Optimization With the Help of Shop Gait

Time Business News

9 minutes ago

Every millisecond counts in eCommerce, and your Shopify store is no exception. Slow-loading websites lead to frustrated users, low conversions, and poor search engine rankings. Optimization is not a nice-to-have feature; it is fundamental to achieving exceptional outcomes. At Shop Gait, we have advice to ease your journey. Done well, Shopify speed and performance optimization leaves nothing for the user to work around on their end. It improves ease of use, search engine ranking, and ultimately sales. This post walks through optimization from the basics to an advanced level and shows where the Shopify Support & Maintenance Service and the Shopify API Integration Service fit in.

Think of a website you want to visit that wears you out while you wait for it to load. A website that loads quickly is user-friendly and far more likely to improve customer satisfaction. In most cases, customers are more likely to stay and browse multiple pages once a site has loaded quickly. Along with other loading and navigational features, your website's speed affects its SEO ranking, because Google's algorithms give preference to sites that load faster than others. Shopify stores that load slowly are therefore losing potential visits. If you have a slow Shopify store, implement Shopify Speed & Performance Optimization and the Shopify Support & Maintenance Service.

With that in mind, here are the most common causes of a slow Shopify store:

– Non-optimized, oversized images
– Unoptimized Shopify themes
– Excessive use of applications (apps)
– Poor coding
– No optimization of content delivery

Knowing these causes puts you ahead in improving your store's performance.

If you're looking to improve your store, evaluating your current statistics is the first step. Google PageSpeed Insights, GTmetrix, and Shopify's integrated tools can serve as a foundation for Shopify Speed & Performance Optimization. These tools give an insightful explanation of the sources of lagging performance, which also informs support and maintenance work. The metrics most often analyzed when assessing website performance include:

– First Contentful Paint (FCP): the time from the moment a user initiates the page load until the first piece of content appears.
– Time to Interactive (TTI): how long it takes until the page is completely ready for interaction.
– Largest Contentful Paint (LCP): the time until the largest piece of content on the page has loaded.
– Cumulative Layout Shift (CLS): a measure of unexpected layout shifts that disrupt the user experience.

These metrics help you understand the current state of your store and the most pressing areas for improvement.

Image optimization deserves attention because images are often the slowest elements on a page to load:

– Choose the most optimal formats: Formats such as WebP offer high-quality lossless compression.
– Compress image files: Use TinyPNG to decrease file size without losing clarity. Alternatively, there's also the option of using Shopify apps.
– Use lazy loading: Only the images visible on the user's screen are loaded at first, which reduces initial load time.

Your Shopify theme is critical to the performance of your store:

– Select a lightweight theme: Select a responsive and fast theme. Shopify offers the free Dawn theme, and there are paid options such as Turbo.
– Remove unused features: Get rid of all unused animations, third-party widgets, and integrations that clutter your theme.
– Clean up theme code: Reduce the theme's CSS and JavaScript to remove bloat that is slowing your site.

While apps can improve the utility of your store, overusing them can severely cripple performance:

– Audit your apps: Evaluate your app installations regularly and promptly uninstall those that are no longer useful.
– Choose optimized apps: Make sure that the reviews of an app you want to install are favorable and that it is optimized to have minimal effect on your store's speed.

Another aspect of enhancing performance is efficient coding:

– Minify resources: Reduce the size and loading time of files by minifying CSS, JavaScript, and HTML files.
– Defer non-critical resources: Load only the most important elements first, deferring scripts that are not immediately needed.
– Enable browser caching: Files like images, CSS files, and HTML can be kept on the user's computer via caching, making future visits faster.

A Content Delivery Network (CDN) is a network of servers distributed globally that serves content from the geographical region closest to the user. Shopify already has a built-in CDN that can be optimized by:

– Integrating with external CDNs such as Cloudflare for redundancy and speed.
– Exploring the CDN's platform for other options, such as caching for performance boosts.

Database Optimization

Unused data in your Shopify database can contribute to lag:

– Regularly remove old or inactive products, images, and customer records.
– Use Shopify apps like Matrixify to clean up and organize your inventory data.

With most traffic coming from mobile devices, it is essential to optimize mobile access:

– Responsive design: Check that your store is mobile-ready, with its text, buttons, and small graphics arranged to suit smaller screens.
– Accelerated Mobile Pages (AMP): Add AMP to serve light mobile pages that load without delay.

If you use Shopify Plus, you have access to sophisticated methods and tools:

– Script Editor: Use the Shopify Plus Script Editor to personalize checkout workflows and enhance performance.
– Automated testing: Regularly run A/B tests to evaluate loading time and optimize it to the best possible level.
– API Integration Services: Use the Shopify API Integration Services to integrate with other platforms, which enhances operational efficiency by automating manual workflows.

Keeping your Shopify store running at peak speed and performance is critical for competitiveness, enhances user experience, and contributes positively to SEO. Looking for professionals? Shop Gait offers unique, tailored Shopify Support & Maintenance Services to enhance store performance. We also provide an equally capable Shopify API Integration Service, through which our professionals will enable seamless integration to bring your store to pro status.
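
The metrics listed earlier in this article (FCP, LCP, CLS) only become actionable once they are compared against a target. The following is a minimal Python sketch of that comparison, not something taken from Shop Gait or from Shopify's tooling. The thresholds are Google's published "good" and "poor" boundaries for each metric, which the article itself does not state; the function names and the sample numbers are hypothetical. TTI is left out because the article gives no target for it.

```python
# Thresholds are Google's published boundaries, brought in as an assumption;
# the article does not quote them. FCP and LCP are in seconds; CLS is a
# unitless score. Each entry is (upper bound of "good", upper bound of
# "needs improvement").
THRESHOLDS = {
    "FCP": (1.8, 3.0),
    "LCP": (2.5, 4.0),
    "CLS": (0.1, 0.25),
}


def rate_metric(name: str, value: float) -> str:
    """Return 'good', 'needs improvement', or 'poor' for one metric value."""
    good_max, needs_improvement_max = THRESHOLDS[name]
    if value <= good_max:
        return "good"
    if value <= needs_improvement_max:
        return "needs improvement"
    return "poor"


def rate_page(measurements: dict[str, float]) -> dict[str, str]:
    """Rate every known metric in a measurement dict; unknown keys are skipped."""
    return {
        name: rate_metric(name, value)
        for name, value in measurements.items()
        if name in THRESHOLDS
    }


if __name__ == "__main__":
    # Hypothetical numbers for illustration only, not a real store's results.
    sample = {"FCP": 1.2, "LCP": 3.1, "CLS": 0.3}
    for metric, rating in rate_page(sample).items():
        print(f"{metric}: {rating}")
```

Values copied from a PageSpeed Insights or GTmetrix report can be passed to `rate_page` to see at a glance which metric most needs work.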

Will $50,000 Invested in Nvidia Stock Be Worth $1 Million in 10 Years?

Yahoo

12 minutes ago

– Nvidia shares are up 850% since ChatGPT sparked the artificial intelligence (AI) boom, but most Wall Street analysts still recommend buying the stock.
– The company is the market leader in AI accelerator chips, but its true strength lies in vertical integration that spans hardware and software products.
– Seven stocks in the S&P 500 generated such colossal returns in the last decade that they would have turned $50,000 into $1 million.

Nvidia (NASDAQ: NVDA) has been a cornerstone of the artificial intelligence (AI) trade for several years. Its share price has increased 850% since January 2023, a period that roughly coincides with the launch of ChatGPT. But Wall Street is still overwhelmingly bullish on the semiconductor company.

Angelo Zino at CFRA Research thinks Nvidia "will be the most important company to our civilization over the next decade." More broadly, among 73 analysts following Nvidia, the median 12-month target price is $175 per share. That implies 25% upside from its current share price of $140.

Could Nvidia stock turn $50,000 into $1 million over the next decade? Here are my thoughts.

What sets Nvidia apart is vertical integration. The company has over 90% market share in data center graphics processing units (GPUs), chips that accelerate complex workloads such as AI. But the company supplements its GPUs with adjacent hardware like CPUs, interconnects, and networking equipment.

Nvidia also develops software products. AI Enterprise is a suite of tools, code libraries, and pretrained models that streamline the development of AI applications for use cases like autonomous robots, conversational agents, and optimization systems. CrowdStrike uses those tools to power threat detection capabilities on its cybersecurity platform.

Similarly, Omniverse is a software platform that supports 3D application development. It also serves as a simulation engine that lets engineers generate synthetic data for developing machine learning models. Amazon uses the Omniverse platform to optimize warehouse design and train fulfillment center robots.

Nvidia frequently sets performance records at the MLPerf benchmarks, objective tests that evaluate AI systems on training and inference workloads. That is an important competitive advantage: Nvidia builds the best AI accelerators on the market. But vertical integration reinforces that advantage by letting the company design entire data center systems with the "lowest total cost of ownership," according to CEO Jensen Huang.

Grand View Research says spending on AI hardware, software, and services will increase at 35.9% annually through 2030. Nvidia has a good shot at matching that growth rate. Indeed, Wall Street expects earnings to grow at 40% annually through the fiscal year ending January 2027. That makes the current valuation of 44 times earnings seem fair.

Nvidia shares would need to increase 1,900% (20-fold) in the next decade to turn $50,000 into $1 million. Returns of that magnitude are theoretically possible in that time frame. In fact, seven stocks currently in the S&P 500 (SNPINDEX: ^GSPC) hit that mark in the last decade, as listed:

– Nvidia: +25,700%
– Advanced Micro Devices: +4,980%
– Axon Enterprise: +2,380%
– Texas Pacific Land: +2,110%
– Arista Networks: +1,950%
– Tesla: +1,920%
– Fair Isaac: +1,900%

However, while 20-fold returns are theoretically possible, Nvidia has virtually no chance of hitting that mark in the next decade. The company is already worth $3.4 trillion, meaning its market value would hit $68 trillion if the stock increased 20 times. That seems highly unlikely when the entire S&P 500 is only worth $48 trillion today.

Nevertheless, Nvidia is still a worthwhile investment. AI will likely be the most transformative technology in history, and the company is well positioned to benefit as demand for AI infrastructure increases. Potential catalysts include generative AI, autonomous vehicles, and humanoid robots. Also, Nvidia has a burgeoning software business that may evolve into a significant source of revenue as those catalysts take shape.

Before you buy stock in Nvidia, consider this: The Motley Fool Stock Advisor analyst team just identified what they believe are the 10 best stocks for investors to buy now, and Nvidia wasn't one of them. The 10 stocks that made the cut could produce monster returns in the coming years.

Consider when Netflix made this list on December 17, 2004: if you invested $1,000 at the time of our recommendation, you'd have $674,395!* Or when Nvidia made this list on April 15, 2005: if you invested $1,000 at the time of our recommendation, you'd have $858,011!*

Now, it's worth noting Stock Advisor's total average return is 997%, a market-crushing outperformance compared to 172% for the S&P 500. Don't miss out on the latest top 10 list, available when you join Stock Advisor.

*Stock Advisor returns as of June 2, 2025

John Mackey, former CEO of Whole Foods Market, an Amazon subsidiary, is a member of The Motley Fool's board of directors. Trevor Jennewine has positions in Amazon, Arista Networks, Axon Enterprise, CrowdStrike, Nvidia, and Tesla. The Motley Fool has positions in and recommends Advanced Micro Devices, Amazon, Arista Networks, Axon Enterprise, CrowdStrike, Nvidia, and Tesla. The Motley Fool recommends Fair Isaac. The Motley Fool has a disclosure policy.

Will $50,000 Invested in Nvidia Stock Be Worth $1 Million in 10 Years? was originally published by The Motley Fool
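
The arithmetic behind the article's 20-fold figure can be checked in a few lines. What follows is a minimal Python sketch, not the author's own calculation: the function names are illustrative, and the inputs are simply the figures quoted above ($50,000, $1 million, a $3.4 trillion market value, a $140 share price and a $175 price target). The annualized rate is an extra step the article does not state; it assumes steady compounding over exactly 10 years.

```python
def required_multiple(start: float, target: float) -> float:
    """How many times the starting amount must grow to reach the target."""
    return target / start


def percent_gain(multiple: float) -> float:
    """Convert a growth multiple into a percentage increase (20x -> 1900%)."""
    return (multiple - 1) * 100


def annualized_return(multiple: float, years: int) -> float:
    """Constant yearly growth rate that compounds to `multiple` over `years`."""
    return multiple ** (1 / years) - 1


if __name__ == "__main__":
    multiple = required_multiple(50_000, 1_000_000)
    print(f"Required multiple: {multiple:.0f}x")
    print(f"Percent gain: {percent_gain(multiple):,.0f}%")

    # Assumes smooth compounding; real returns are never this even.
    print(f"Annualized: {annualized_return(multiple, 10):.1%}")

    # Market value implied by a 20-fold rise from $3.4 trillion.
    print(f"Implied market value: ${3.4 * multiple:.0f} trillion")

    # Upside implied by the $175 median target against a $140 share price.
    print(f"Analyst upside: {175 / 140 - 1:.0%}")
```

Under that compounding assumption, a 20-fold gain over 10 years works out to roughly 35% per year.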

NotebookLM Was Already My Favorite AI Tool, but New Features Keep Making It Even Better

CNET

20 minutes ago

NotebookLM has always been a fun idea -- it's kind of a mini-LLM for all of your personal documents, or really any documents you want to feed it. After taking another look recently, it's definitely more than a diversion. It's become my favorite AI tool ever and something I use nearly every day.

Powered by Google's Gemini AI, NotebookLM breaks down complex subjects into an easy-to-understand format and helps you brainstorm new ideas. There's now a mobile app for iOS and Android, and new features were just announced during Google I/O earlier this month. It keeps getting better without feeling like it's becoming overstuffed with features just for the sake of it.

NotebookLM isn't just Google Keep stuffed with AI, nor is it just a chatbot that can take notes. It's both and neither. Instead of asking questions to Gemini, only for it to find an answer from the ether of the internet, NotebookLM will only search through the sources that you provide it. It's a dead simple concept that feels like one of the most practical uses of AI, giving way to the perfect study buddy for classes or work. And Google didn't stop there. Now it can do so much more, and will reward your poking around to see what it can do for you. And features like its impressive Audio Overviews have since trickled down into Gemini itself, allowing it to be used in a much wider set of Google's products.

Below, I'll cover some of NotebookLM's most interesting features (including the newly announced ones) and how it became one of my favorite AI tools to use.

For more, check out Google's smart glasses plans with AndroidXR.

What is NotebookLM?

NotebookLM is a Gemini-powered note-taking and research assistant tool that can be used in a multitude of ways. It all starts with the sources you feed it, whether they're webpage URLs, YouTube videos or audio clips, allowing you to pull multiple sources together into a cohesive package and bring some organization to your scattered thoughts or notes.

The most obvious use case for NotebookLM is using it for school or work. Think of it -- you've kept up with countless classes and typed notes down for every one and even perhaps recorded some lectures. Sifting through everything individually can eventually get you to some semblance of understanding, but what if you could get them to work together?

Once you've uploaded your sources, Gemini will get to work to create an overall summary of the material. From there, you can begin asking Gemini questions about specific topics in the sources, and information from the sources will be displayed in an easy-to-understand format. This alone may be enough for some people just looking to get the most out of their notes, but that's really just scratching the surface.

Available for desktop and mobile

Image: NotebookLM's three-panel layout. (NotebookLM/Screenshot by Blake Stimac)

NotebookLM has been available for a while now on the desktop and is broken into a three-pane layout, consisting of Source, Chat and Studio panels. Both the Source and Studio panels are collapsible, so you can have a full-screen chat experience if you prefer.

While the Source and Chat panels are pretty self-explanatory, the Studio panel is where magic can happen (though some of the features can also be created directly from the Chat panel). This is where you can get the most out of your NotebookLM experience.

The NotebookLM app is like having a data alchemist in your pocket

Image: The mobile app for Android and iOS launched the day before Google I/O 2025. (Blake Stimac/CNET)

Those familiar with the desktop experience will feel right at home with the new mobile apps for iOS and Android. The streamlined app allows you to switch between the Source, Chat and Studio panels via a menu at the bottom. When you go to the view that shows all of your notebooks, you'll see tabs for Recent, Shared, Title and Downloaded. While not everything is on the app yet, it's likely just a matter of time before it matches the web version's full functionality.

Audio Overviews

If you didn't hear about NotebookLM when it was first announced, you likely did when Audio Overviews were released for it. Once you have at least one source uploaded, you can then opt to generate an Audio Overview, which will provide a "deep dive" on the source material. These overviews are created by none other than Gemini, and it's not just a quick summary of your material in audio format -- it's a full-blown podcast with two "hosts" that break down complex topics into easy-to-understand pieces of information. They're incredibly effective, too, often asking each other questions to dismantle certain topics.

The default length of an Audio Overview will vary depending on how much material there is to go over and the complexity of the topic -- though I'm sure there are other factors at play. In my testing, a very short piece of text created a five-minute audio clip, whereas two lengthier and more dense Google Docs documents I uploaded created an 18-minute Overview.

If you want a little more control over the length of your Audio Overview, you're in luck. Announced in a blog post during Google I/O earlier this month, users now have three options to choose from: shorter, default and longer. This is perfect if you either want to have a short and dense podcast-like experience or if you want to get into the nitty gritty on a subject with a longer Audio Overview.

You can interact with your AI podcasters

It gets even better. Last December, NotebookLM got a new design and new ways to interact with Audio Overviews. The customize button allows you to guide the conversation so that key points are covered. Type in your directive and then generate your Audio Overview.

Now, if you want to make this feature even more interactive, you can choose the Interactive mode, which is still in beta, to join the conversation. The clip will play, and if you have a particular question in response to something that's said, you can click the join button. Once you do, the speakers will pause and acknowledge your presence and ask you to chime in with thoughts or questions, and you'll get a reply.

I wanted to try something a little different, so I threw in the lyrics of a song as the only source, and the AI podcast duo began to dismantle the motivations and emotions behind the words. I used the join feature to point out a detail in the lyrics they didn't touch on, and the two began to dissect what my suggestion meant in the context of the writing. They then began linking the theme to other portions of the text. It was impressive to watch: They handled the emotional weight of the song so well, and tactfully at that.

Mind Maps

Image: Generating a Mind Map is just one of several powerful features from NotebookLM. (Google/Screenshot by Blake Stimac)

I'd heard interesting things about NotebookLM's Mind Map feature, but I wanted to go in blind when I tried it out, so I did a separate test. I took roughly 1,500 words of Homer's Odyssey and made that my only source. I then clicked the Mind Map button, and within seconds, an interactive and categorical breakdown of the text was displayed for me to poke around in.

Many of the broken-down sections had subsections for deeper dives, some of which were dedicated to single lines for dissection. Clicking on a category or end-point of the map will open the chat with a prefilled prompt.

I chose to dive into the line, "now without remedy," and once clicked, the chat portion of NotebookLM reopened with the prefilled prompt, "Discuss what these sources say about Now without remedy, in the larger context of [the subsection] Alternative (worse)." The full line was displayed, including who said it, what it was in response to and any motivations (or other references) for why the line was said in the text.

Study guides and more

If the combination of all that Audio Overviews and Mind Maps can do sounds like everything a student might need for the perfect study buddy, NotebookLM has a few other features that will solidify it in that place.

Study guides

After you've uploaded a source, you can create a quick study guide based on the material that will automatically provide a document with a quiz, potential essay questions, a glossary of key terms and answers for the quiz at the bottom. And if you want, you can even convert the study guide into a source for your notebook.

FAQs

Whether you're using it for school or want to create a FAQ page for your website, the NotebookLM button generates a series of potentially common questions based on your sources.

Timeline

If you're looking for a play-by-play sort of timeline, it's built right in. Creating a timeline for the Odyssey excerpt broke down main events in a bulleted list and placed them based on the times mentioned in the material. If an event takes place at an unspecified time, it will appear at the top of the timeline, stating this. A cast of characters for reference is also generated below the timeline of events.

Briefing document

The briefing document is just what it sounds like, giving you a quick snapshot of the key themes and important events to get someone up to speed. This will include specific quotes from the source and their location. A summary of the material is also created at the bottom of the document.

How NotebookLM really 'sold' me

I already really liked NotebookLM's concept and execution during its 1.0 days, and revisiting the new features only strengthened my appreciation for it. My testing was mostly for fun and to see how this tool can flex, but using it when I "needed" it helped me really get an idea of how powerful it can be, even for simple things.

During a product briefing, I did my typical note-taking: Open a Google Doc, start typing in fragmented thoughts on key points, and hope I could translate what I meant when I needed to refer back to them. I knew I would also receive an official press release, so I wasn't (too) worried about it, but I wanted to put NotebookLM to the test in a real-world situation when I was using it for real -- and not just tinkering, when nearly anything seems impressive when it does what you tell it to.

I decided to create a new notebook and make my crude notes (which looked like a series of bad haikus at first glance) the only source, just to see what came out on the other end. Not only did NotebookLM fill in the blanks, but the overall summary read almost as well as the press release I received the following day. I was impressed. It felt like alchemy -- NotebookLM took some fairly unintelligible language and didn't just turn it into something passable, but rather, a pretty impressive description. Funny enough, I've since become a more thorough note-taker, but I'm relieved to know I have something that can save the day if I need it to.

Video Overviews are on the way

Another feature that was announced during Google I/O was Video Overviews, and it's exactly what it sounds like. There's currently no time frame outside of "coming soon" from the blog post, but it should be a good way to get a more visual experience from your notebooks.

We'd previously heard that Video Overviews might be on the way, thanks to some sleuthing from Testing Catalog. The article also mentioned that the ability to make your notebooks publicly available and view an Editor's Picks list of notebooks will eventually make their way to NotebookLM. The Editor's Picks feature has yet to rear its head, but you can indeed now share notebooks directly or make them publicly available for anyone to access.

If you need more from NotebookLM, consider upgrading

Most individuals may never have the need to pay for NotebookLM, as the free version is robust enough. But if you're using it for work and need to be able to add more sources or the option to share your notebook with multiple people, NotebookLM Plus is worth considering. It gives you more of everything while introducing more customization, additional privacy and security features as well as analytics. It's worth noting that NotebookLM Plus will also be packaged in with Google's new AI subscriptions.

For more, don't miss Google's going all-in on AI video with Flow and Veo 3.
