
Latest news with #GeminiDiffusion

Google announces major Gemini AI upgrades & new dev tools

Techday NZ

22 May 2025


Google has unveiled a range of updates to its developer products, aimed at improving the process of building artificial intelligence applications. Mat Velloso, Vice President, AI/ML Developer at Google, stated, "We believe developers are the architects of the future. That's why Google I/O is our most anticipated event of the year, and a perfect moment to bring developers together and share our efforts for all the amazing builders out there. In that spirit, we updated Gemini 2.5 Pro Preview with even better coding capabilities a few weeks ago. Today, we're unveiling a new wave of announcements across our developer products, designed to make building transformative AI applications even better."

The company introduced an enhanced version of its Gemini 2.5 Flash Preview, described as delivering improved performance on coding and complex reasoning tasks while optimising for speed and efficiency. This model now includes "thought summaries" to increase transparency in its decision-making process, and its forthcoming "thinking budgets" feature is intended to help developers manage costs and exercise more control over model outputs (see the API sketch below). Both Gemini 2.5 Flash versions and 2.5 Pro are available in preview within Google AI Studio and Vertex AI, with general availability for Flash expected in early June, followed by Pro.

Among the new models announced is Gemma 3n, designed to function efficiently on personal devices such as phones, laptops, and tablets. Gemma 3n can process audio, text, image, and video inputs and is available for preview on Google AI Studio and Google AI Edge. Also introduced is Gemini Diffusion, a text model that reportedly generates outputs at five times the speed of Google's previous fastest model while maintaining coding performance. Access to Gemini Diffusion is currently by waitlist. The Lyria RealTime model was also detailed: this experimental interactive music generation tool allows users to create, control, and perform music in real time, and can be accessed via the Gemini API and trialled through a starter application in Google AI Studio.

Several additional variants of the Gemma model family were announced, targeting specific use cases. MedGemma is described as the company's most capable multimodal medical model to date, intended to support developers creating healthcare applications such as medical image analysis. MedGemma is available now via the Health AI Developer Foundations programme. Another upcoming model, SignGemma, is designed to translate sign languages into spoken language text, currently optimised for American Sign Language to English. Google is soliciting feedback from the community to guide further development of SignGemma.

Google also outlined new features intended to facilitate the development of AI applications. A new, more agentic version of Colab will enable users to instruct the tool in plain language, with Colab subsequently taking actions such as fixing errors and transforming code automatically. Meanwhile, Gemini Code Assist, Google's free AI coding assistant, and its associated code review agent for GitHub, are now generally available to all developers. These tools are now powered by Gemini 2.5 and will soon offer a two million token context window for standard and enterprise users on Vertex AI. Firebase Studio was presented as a new cloud-based workspace supporting rapid development of AI applications. Notably, Firebase Studio now integrates with Figma via a plugin, supporting the transition from design to app, and it can also automatically detect and provision necessary back-end resources.
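The "thought summaries" and "thinking budgets" mentioned above surface to developers through the Gemini API. Below is a minimal sketch using the google-genai Python SDK; the model identifier and exact parameter names are assumptions drawn from the public preview documentation and may differ for the preview releases discussed here.

```python
# pip install google-genai
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-flash",  # assumed identifier; preview models carry dated suffixes
    contents="Outline a migration plan from REST polling to server-sent events.",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(
            thinking_budget=1024,   # cap the tokens spent on internal reasoning
            include_thoughts=True,  # request thought summaries alongside the answer
        ),
    ),
)

# Thought summaries come back as content parts flagged as "thought";
# everything else is the normal answer text.
for part in response.candidates[0].content.parts:
    prefix = "[thought summary] " if getattr(part, "thought", False) else ""
    print(prefix + (part.text or ""))
```

In the preview documentation, a budget of 0 reportedly disables the extra reasoning pass on Flash entirely, which is the cost-control lever the announcement refers to.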
Jules, another tool now generally available, is an asynchronous coding agent that can manage bug backlogs, handle multiple tasks, and develop new features, working directly with GitHub repositories and creating pull requests for project integration. A new offering called Stitch was also announced, designed to generate frontend code and user interface designs from natural language descriptions or image prompts, supporting iterative and conversational design adjustments with easy export to web or design platforms.

For those developing with the Gemini API, updates to Google AI Studio were showcased, including native integration with Gemini 2.5 Pro and optimised use with the GenAI SDK for instant generation of web applications from input prompts spanning text, images, or videos. Developers will find new models for generative media alongside enhanced code editor support for prototyping. Additional technical features include proactive video and audio capabilities, affective dialogue responses, and advanced text-to-speech functions that enable control over voice style, accent, and pacing. The model updates also introduce asynchronous function calling to enable non-blocking operations and a Computer Use API that will allow applications to browse the web or utilise other software tools under user direction, initially available to trusted testers. The company is also rolling out URL context, an experimental tool for retrieving and analysing contextual information from web pages, and announcing support for the Model Context Protocol in the Gemini API and SDK, aiming to facilitate the use of a broader range of open-source developer tools.
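Of the Gemini API additions above, the experimental URL context tool is the most self-contained to demonstrate. The sketch below, again using the google-genai Python SDK, assumes the feature is enabled by attaching a url_context tool to the request config; that field name is an assumption based on the SDK's preview surface and may change as the feature matures.

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

# With the URL context tool attached, the model can fetch and read the pages
# referenced in the prompt and ground its answer in their contents.
response = client.models.generate_content(
    model="gemini-2.5-flash",  # assumed identifier
    contents=(
        "Compare the installation steps described at "
        "https://example.com/docs/install and https://example.org/setup."
    ),
    config=types.GenerateContentConfig(
        tools=[types.Tool(url_context=types.UrlContext())],
    ),
)

print(response.text)
```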

Google leaders see AGI arriving around 2030

Axios

21 May 2025


So-called artificial general intelligence (AGI) — widely understood to mean AI that matches or surpasses most human capabilities — is likely to arrive sometime around 2030, Google's co-founder Sergey Brin and Google DeepMind CEO Demis Hassabis said Tuesday.

Why it matters: Much of the AI industry now sees AGI as an inevitability, with predictions of its advent ranging from two years on the inside to 10 years on the outside, but there's little consensus on exactly what it will look like or how it will change our lives. Brin made a surprise appearance at Google's I/O developer conference Tuesday, crashing an on-stage interview with Hassabis.

The big picture: While much of Google's developer conference focused on the here and now of AI, Brin and Hassabis focused on what it will take to make AGI a reality. Asked whether it will be enough to keep scaling up today's AI models or new techniques will be needed, Hassabis insisted both are key ingredients. "You need to scale to the maximum the techniques that you know about and exploit them to the limit," Hassabis said during the on-stage interview with tech journalist Alex Kantrowitz. "And at the same time, you want to spend a bunch of effort on what's coming next." Brin said he'd guess that algorithmic advances are even more significant than increases in computational power. But, he added, "both of them are coming up now, so we're kind of getting the benefits of both."

The big picture: Hassabis predicted the industry will probably need a couple more big breakthroughs to get to AGI — reiterating what he told Axios in December. However, he said that we may already have achieved part of one breakthrough in the form of the reasoning approaches that Google, OpenAI and others have unveiled in recent months. Reasoning models don't respond to prompts immediately but instead do more computing before they spit out an answer. "Like most of us, we get some benefit by thinking before we speak," Brin said — joking that it's something he often has to be reminded of.

Between the lines: Google detailed a couple of new approaches Tuesday that, while less flashy than some of the other AI features the company unveiled, hinted at other novel directions. Gemini Diffusion is a new text model that employs the diffusion approach typically used by image generators, "converting random noise into coherent text or code," per a Google blog post. The result, Google says, is a model that can generate text far faster than other approaches. The company also debuted a mode for its models called Deep Think, which works by pursuing multiple approaches to a problem and evaluating which is most promising.

What's next: On the timing of AGI, Hassabis and Brin were asked whether they thought it would arrive before or after 2030.
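Google has not published implementation details for Gemini Diffusion, but the general idea behind diffusion-style text generation can be illustrated conceptually: start from a fully masked ("noise") sequence and repeatedly let a model propose tokens for the masked positions, committing a few per step, so the whole sequence is refined in parallel rather than emitted one token at a time. The toy sketch below exists only to make that loop concrete; the random "denoiser" stands in for a learned model, and nothing here reflects Google's actual architecture.

```python
import random

VOCAB = ["the", "cat", "sat", "on", "a", "mat"]
MASK = "<mask>"

def toy_denoiser(tokens):
    """Stand-in for a learned model: propose a token for every masked slot.
    A real diffusion LM predicts all positions jointly from the noisy sequence."""
    return [random.choice(VOCAB) if t == MASK else t for t in tokens]

def generate(seq_len=6, steps=4, commit_per_step=2):
    tokens = [MASK] * seq_len          # step 0: pure "noise", every slot masked
    for _ in range(steps):
        proposal = toy_denoiser(tokens)
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        # Commit only a few positions per step; the rest stay masked and are
        # re-proposed next round, which is what lets the model revise in parallel.
        for i in random.sample(masked, min(commit_per_step, len(masked))):
            tokens[i] = proposal[i]
    # Final pass: fill anything still masked.
    return [t if t != MASK else p for t, p in zip(tokens, toy_denoiser(tokens))]

print(" ".join(generate()))
```

The claimed speed advantage comes from exactly this parallelism: each denoising step touches every position at once, so far fewer sequential model calls are needed than in token-by-token autoregressive decoding.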
