Latest news with #SignGemma


India Today
3 days ago
- Business
New Google AI tool translates sign language into text, currently in testing phase with launch by year-end
Sign language is essential for many people with speech impairments. They use it to communicate with those around them, but few people outside the community understand it. Now AI is set to help here as well: Google is working on an AI model called SignGemma that will translate sign language into text. The company says it is its most capable sign language understanding model to date, designed to translate sign language into spoken-language text. The model is currently in its testing phase and is slated for public launch by the end of the year.

Google first unveiled SignGemma during the keynote at Google I/O, where Gemma Product Manager Gus Martins described it as the company's 'most capable sign language understanding model ever.' Martins noted that, unlike previous attempts at sign language translation, SignGemma stands out for its open model approach and its focus on delivering accurate, real-time translations to users. While the tool is trained to handle various sign languages, Google says the model currently performs best with American Sign Language (ASL) and English.

'We're thrilled to announce SignGemma, our groundbreaking open model for sign language understanding, set for release later this year,' Martins said. 'It's the most capable sign language understanding model ever, and we can't wait for developers and Deaf and hard-of-hearing communities to take this foundation and build with it.'

Google highlighted that with this tool, the company aims to bridge communication gaps for millions of Deaf and hard-of-hearing individuals. To ensure the tool is both effective and respectful of its user base, Google is taking a collaborative approach to its development. The company has extended an open invitation to developers, researchers, and members of the global Deaf and Hard of Hearing communities to participate in early testing and provide feedback. "We're thrilled to announce SignGemma, our groundbreaking open model for sign language understanding," reads the official post from DeepMind on X. "Your unique experiences, insights, and needs are crucial as we prepare for launch and beyond, to make SignGemma as useful and impactful as possible."

The introduction of SignGemma comes at a time when Google is heavily focused on expanding its AI portfolio. At Google I/O 2025, accessibility took centre stage with the announcement of several new AI-powered features designed to make technology more inclusive for everyone. One of the highlights was the expansion of Gemini AI's integration with Android's TalkBack, which will now provide users with AI-generated descriptions of images and allow them to ask follow-up questions about what's on their screen. Google has also introduced updates to Chrome, including automatic Optical Character Recognition (OCR) for scanned PDFs, enabling screen reader users to access, search, and interact with text in documents that were previously inaccessible. On Chromebooks, a new accessibility tool called Face Control allows students and other users to control their device with facial gestures and head movements.
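
SignGemma itself has not shipped, so there is no public interface to call yet. Purely as an illustration of the real-time pipeline described above, the sketch below pairs a standard OpenCV webcam loop with an invented placeholder function, translate_signs, standing in for whatever inference entry point Google eventually releases:

    # Purely hypothetical sketch: SignGemma is not publicly released, so
    # translate_signs below is an invented placeholder, not a Google API.
    # Only OpenCV's standard capture calls here are real.
    import cv2

    def translate_signs(frames):
        """Stand-in for a future SignGemma inference entry point."""
        return "[SignGemma output would appear here]"  # placeholder

    cap = cv2.VideoCapture(0)       # default webcam
    frames, window = [], 32         # translate one short clip at a time
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)
        if len(frames) == window:   # enough frames for one signing segment
            print(translate_signs(frames))
            frames.clear()
    cap.release()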


Hans India
3 days ago
- Business
Google Unveils SignGemma: AI Tool to Translate Sign Language into Text by Year-End
At Google I/O 2025, the tech giant introduced SignGemma, a powerful AI model designed to translate sign language into spoken text. Currently in its testing phase, this tool is available to developers and selected users, with a broader rollout expected by the end of the year.

For millions of Deaf and hard-of-hearing individuals around the world, sign language is a vital means of communication. However, it often presents barriers in daily interactions with those unfamiliar with it. Google's new AI initiative, SignGemma, aims to change that by offering real-time sign language-to-text translations, improving accessibility and inclusion on a global scale.

Described as Google's 'most capable sign language understanding model ever,' SignGemma was unveiled by Gemma Product Manager Gus Martins during the keynote. According to Martins, the project stands apart from previous attempts thanks to its open model framework and ability to deliver real-time, accurate translations. 'We're thrilled to announce SignGemma, our groundbreaking open model for sign language understanding, set for release later this year,' Martins said. 'It's the most capable sign language understanding model ever, and we can't wait for developers and Deaf and hard-of-hearing communities to take this foundation and build with it.'

At present, SignGemma is most accurate when translating American Sign Language (ASL) into English. However, Google has stated that the model is trained to support a range of sign languages and plans to expand its capabilities over time.

The launch of SignGemma is part of a broader push by Google to prioritise accessibility in AI technology. At this year's I/O conference, the company announced several updates focused on inclusivity, including enhanced AI integration in Android's TalkBack feature. Users will now receive AI-generated descriptions of images and be able to ask follow-up questions about what's on their screen, making the Android experience more intuitive for visually impaired users. Additionally, Google has rolled out updates to Chrome, such as automatic Optical Character Recognition (OCR) for scanned PDFs. This makes previously inaccessible documents readable and searchable for screen reader users. On Chromebooks, a new feature called Face Control enables users to navigate their device using facial expressions and head gestures, another step forward in Google's mission to empower every user.

To ensure SignGemma is both useful and respectful, Google is adopting a collaborative development approach. The company is actively inviting developers, researchers, and members of the global Deaf and hard-of-hearing communities to test the tool and share feedback. 'We're thrilled to announce SignGemma, our groundbreaking open model for sign language understanding,' read an official post from DeepMind on X. 'Your unique experiences, insights, and needs are crucial as we prepare for launch and beyond, to make SignGemma as useful and impactful as possible.'

With SignGemma, Google is not just expanding its AI capabilities; it's building a bridge between the hearing and Deaf communities. As it nears public release, the tool stands to transform communication and redefine accessibility in the digital age.


Techday NZ
22-05-2025
- Business
Google announces major Gemini AI upgrades & new dev tools
Google has unveiled a range of updates to its developer products, aimed at improving the process of building artificial intelligence applications. Mat Velloso, Vice President, AI / ML Developer at Google, stated, "We believe developers are the architects of the future. That's why Google I/O is our most anticipated event of the year, and a perfect moment to bring developers together and share our efforts for all the amazing builders out there. In that spirit, we updated Gemini 2.5 Pro Preview with even better coding capabilities a few weeks ago. Today, we're unveiling a new wave of announcements across our developer products, designed to make building transformative AI applications even better."

The company introduced an enhanced version of its Gemini 2.5 Flash Preview, described as delivering improved performance on coding and complex reasoning tasks while optimising for speed and efficiency. This model now includes "thought summaries" to increase transparency in its decision-making process, and its forthcoming "thinking budgets" feature is intended to help developers manage costs and exercise more control over model outputs. Both Gemini 2.5 Flash versions and 2.5 Pro are available in preview within Google AI Studio and Vertex AI, with general availability for Flash expected in early June, followed by Pro.
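
For a concrete feel of how thinking budgets and thought summaries are exposed, here is a minimal sketch using the google-genai Python SDK; the model id and field names reflect the publicly documented preview API and may shift before general availability:

    # Minimal sketch with the google-genai Python SDK (pip install google-genai).
    # The model id is indicative; preview builds may carry a date suffix.
    from google import genai
    from google.genai import types

    client = genai.Client(api_key="YOUR_API_KEY")  # placeholder key

    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents="Outline a test plan for an offline-first mobile app.",
        config=types.GenerateContentConfig(
            thinking_config=types.ThinkingConfig(
                thinking_budget=1024,   # cap tokens spent on internal reasoning
                include_thoughts=True,  # return thought summaries with the answer
            ),
        ),
    )

    # Thought summaries come back as content parts flagged as `thought`
    for part in response.candidates[0].content.parts:
        print("[summary]" if part.thought else "[answer]", part.text)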
Among the new models announced is Gemma 3n, designed to function efficiently on personal devices such as phones, laptops, and tablets. Gemma 3n can process audio, text, image, and video inputs and is available for preview on Google AI Studio and Google AI Edge. Also introduced is Gemini Diffusion, a text model that reportedly generates outputs at five times the speed of Google's previous fastest model while maintaining coding performance. Access to Gemini Diffusion is currently by waitlist.

The Lyria RealTime model was also detailed. This experimental interactive music generation tool allows users to create, control, and perform music in real time. Lyria RealTime can be accessed via the Gemini API and trialled through a starter application in Google AI Studio.

Several additional variants of the Gemma model family were announced, targeting specific use cases. MedGemma is described as the company's most capable multimodal medical model to date, intended to support developers creating healthcare applications such as medical image analysis. MedGemma is available now via the Health AI Developer Foundations programme. Another upcoming model, SignGemma, is designed to translate sign languages into spoken-language text, currently optimised for American Sign Language to English. Google is soliciting feedback from the community to guide further development of SignGemma.

Google outlined new features intended to facilitate the development of AI applications. A new, more agentic version of Colab will enable users to instruct the tool in plain language, with Colab subsequently taking actions such as fixing errors and transforming code automatically. Meanwhile, Gemini Code Assist, Google's free AI-coding assistant, and its associated code review agent for GitHub, are now generally available to all developers. These tools are now powered by Gemini 2.5 and will soon offer a two-million-token context window for standard and enterprise users on Vertex AI.

Firebase Studio was presented as a new cloud-based workspace supporting rapid development of AI applications. Notably, Firebase Studio now integrates with Figma via a plugin, supporting the transition from design to app. It can also automatically detect and provision necessary back-end resources. Jules, another tool now generally available, is an asynchronous coding agent that can manage bug backlogs, handle multiple tasks, and develop new features, working directly with GitHub repositories and creating pull requests for project integration. A new offering called Stitch was also announced, designed to generate frontend code and user interface designs from natural language descriptions or image prompts, supporting iterative and conversational design adjustments with easy export to web or design platforms.

For those developing with the Gemini API, updates to Google AI Studio were showcased, including native integration with Gemini 2.5 Pro and optimised use with the GenAI SDK for instant generation of web applications from input prompts spanning text, images, or videos. Developers will find new models for generative media alongside enhanced code editor support for prototyping. Additional technical features include proactive video and audio capabilities, affective dialogue responses, and advanced text-to-speech functions that enable control over voice style, accent, and pacing. The model updates also introduce asynchronous function calling to enable non-blocking operations and a Computer Use API that will allow applications to browse the web or utilise other software tools under user direction, initially available to trusted testers. The company is also rolling out URL context, an experimental tool for retrieving and analysing contextual information from web pages, and announcing support for the Model Context Protocol in the Gemini API and SDK, aiming to facilitate the use of a broader range of open-source developer tools.
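
As an aside on the function-calling features mentioned above, the existing synchronous flavour already works roughly as in this minimal sketch with the google-genai Python SDK, where get_release_status is an invented toy function used only for illustration:

    # Minimal sketch of (synchronous) function calling with the google-genai SDK.
    # get_release_status is an invented toy function, not a Google API.
    from google import genai
    from google.genai import types

    def get_release_status(product: str) -> str:
        """Toy lookup the model can call; swap in a real data source."""
        status = {"Jules": "generally available", "Stitch": "announced"}
        return f"{product}: {status.get(product, 'unknown')}"

    client = genai.Client(api_key="YOUR_API_KEY")  # placeholder key

    # Passing a plain Python callable as a tool lets the SDK execute the
    # call automatically when the model requests it.
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents="What is the release status of Jules?",
        config=types.GenerateContentConfig(tools=[get_release_status]),
    )
    print(response.text)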

Yahoo
20-05-2025
- Business
The latest Google Gemma AI model can run on phones
Google's family of "open" AI models, Gemma, is growing. During Google I/O 2025 on Tuesday, Google took the wraps off Gemma 3n, a model designed to run "smoothly" on phones, laptops, and tablets. Available in preview starting Tuesday, Gemma 3n can handle audio, text, images, and videos, according to Google.

Models efficient enough to run offline and without the need for computing in the cloud have gained steam in the AI community in recent years. Not only are they cheaper to use than large models, but they preserve privacy by eliminating the need to transfer data to a remote data center. During a keynote at I/O, Gemma Product Manager Gus Martins said that Gemma 3n can run on devices with less than 2GB of RAM. "Gemma 3n shares the same architecture as Gemini Nano, and is engineered for incredible performance," he added.

In addition to Gemma 3n, Google is releasing MedGemma through its Health AI Developer Foundations program. According to the company, MedGemma is its most capable open model for analyzing health-related text and images. "MedGemma [is] our [...] collection of open models for multimodal [health] text and image understanding," Martins said. "MedGemma works great across a range of image and text applications, so that developers [...] can adapt the models for their own health apps."

Also on the horizon is SignGemma, an open model to translate sign language into spoken-language text. Google says that SignGemma will enable developers to create new apps and integrations for deaf and hard-of-hearing users. "SignGemma is a new family of models trained to translate sign language to spoken-language text, but it's best at American Sign Language and English," Martins said. "It's the most capable sign language understanding model ever, and we can't wait for you — developers and deaf and hard-of-hearing communities — to take this foundation and build with it."

Worth noting is that Gemma has been criticized for its custom, non-standard licensing terms, which some developers say have made using the models commercially a risky proposition. That hasn't dissuaded developers from downloading Gemma models tens of millions of times collectively, however.

Updated 2:40 p.m. Pacific: Added several quotes from Gemma Product Manager Gus Martins. This article originally appeared on TechCrunch.
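
For developers who want to try an open Gemma checkpoint locally, a Hugging Face transformers setup is the usual route; this is a minimal sketch, and the Gemma 3n model id shown is an assumption based on Google's preview naming rather than a confirmed identifier:

    # Minimal sketch via Hugging Face transformers (pip install -U transformers
    # accelerate). Assumes a transformers release with Gemma 3n support; the
    # model id is an assumed preview checkpoint name, not a confirmed one.
    from transformers import pipeline

    pipe = pipeline(
        "text-generation",
        model="google/gemma-3n-E2B-it",  # assumption: check the google org on HF
        device_map="auto",               # uses CPU if no accelerator is present
    )

    messages = [{"role": "user", "content": "In one sentence, why run models on-device?"}]
    out = pipe(messages, max_new_tokens=60)
    print(out[0]["generated_text"][-1]["content"])  # last turn is the model reply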


TechCrunch
20-05-2025
- Business
The latest Google Gemma AI model can run on phones
Google's family of 'open' AI models, Gemma, is growing. During Google I/O 2025 on Tuesday, Google took the wraps off Gemma 3n, a model designed to run 'smoothly' on phones, laptops, and tablets. Available in preview starting Tuesday, Gemma 3n can handle audio, text, images, and videos, according to Google.

Models efficient enough to run offline and without the need for computing in the cloud have gained steam in the AI community in recent years. Not only are they cheaper to use than large models, but they preserve privacy by eliminating the need to transfer data to a remote data center.

In addition to Gemma 3n, Google is releasing MedGemma through its Health AI Developer Foundations program. According to the company, MedGemma is its most capable open model for analyzing health-related text and images.

Also on the horizon is SignGemma, an open model to translate sign language into spoken-language text. Google says that SignGemma will enable developers to create new apps and integrations for deaf and hard-of-hearing users.

Worth noting is that Gemma has been criticized for its custom, non-standard licensing terms, which some developers say have made using the models commercially a risky proposition. That hasn't dissuaded developers from downloading Gemma models tens of millions of times collectively, however.