Latest news with #ExpressiveCaptions

CNET
15-05-2025
- CNET
Gemini Can Now Answer Questions About Images in Android's TalkBack Screen Reader
Google is weaving AI advancements into its accessibility offerings. It's rolling out updates to features across Android and Chrome, including its TalkBack screen reader and Expressive Captions, the company said Thursday.

TalkBack, which was first launched in 2009, reads aloud what's on your screen and lets you navigate your device using custom gestures, voice commands or a virtual braille keyboard. Last year, Google integrated Gemini into TalkBack to offer richer and clearer image descriptions.

Image: Gemini in TalkBack can answer questions about what's on your screen. (Google)

Now, you can ask Gemini questions via TalkBack to get more information about what's in a photo. So if someone sends you an image and you want more details about what's being shown, you can ask, and Gemini will answer. If you're online shopping and want to know more about the material of a dress, Gemini can respond to your inquiries. It can also answer questions about anything on your screen, such as whether an item is on sale.

Additionally, Google is rolling out the next version of Expressive Captions, which uses AI to convey details like intensity of speech and background sounds in videos and livestreams. When the feature launched in December, it included characterizations like capitalized text for phrases spoken with excitement (such as "HAPPY BIRTHDAY!"), as well as descriptions of ambient sounds like applause or music.

Image: Expressive Captions can now convey elongated speech. (Google)

Now, Expressive Captions will also convey the duration of a statement, adding letters if a sports announcer says "amaaazing shot," for instance, or if someone in a video says "nooooo." It can also label more sounds, like someone whistling or clearing their throat. The update is rolling out in English in the US, UK, Canada and Australia on devices running Android 15 and up.

It also just got easier to access PDFs on Chrome. Previously, screen readers couldn't interact with scanned PDFs in a desktop Chrome browser. Now, Optical Character Recognition makes it possible for Chrome to automatically recognize these PDFs, so you can use your screen reader and also highlight, copy and search for text like you would with any other page.

Image: Page Zoom lets you increase text size without throwing off a webpage's layout. (Google)

And Page Zoom now lets you enlarge text in Chrome on Android without distorting the webpage's layout, similar to how it works on desktop Chrome. To use the feature, tap the three-dot menu in the upper right corner in Chrome and choose your zoom level.

Google's announcement comes on Global Accessibility Awareness Day, for which other tech companies like Apple and TikTok have also shared new features. It arrives hot on the heels of The Android Show: I/O Edition, during which Google unveiled Android 16 and Gemini updates. Next week, the search giant will be hosting its I/O developers conference, which is likely to focus heavily on AI capabilities.
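TalkBack's new Q&A runs inside Android itself, but the same multimodal ask-about-an-image pattern is available through Google's public Gemini API. Purely as an illustration of that pattern (not TalkBack's actual implementation, and with a made-up file name and prompts), a minimal sketch using the google-generativeai Python package might look like this:

```python
# Illustrative sketch of multimodal image Q&A with Gemini, similar in spirit to
# TalkBack's new "ask about this image" flow. NOT TalkBack's implementation;
# the file name and prompts are invented for the example.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")  # assumes you have a Gemini API key
model = genai.GenerativeModel("gemini-1.5-flash")

photo = Image.open("guitar.jpg")  # hypothetical photo a friend texted you

# First request: a general description, useful when no alt text exists.
description = model.generate_content([photo, "Describe this image briefly."])
print(description.text)

# Follow-up question about a specific detail, as TalkBack now allows.
follow_up = model.generate_content([photo, "What brand and color is the guitar?"])
print(follow_up.text)
```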

Engadget
15-05-2025
- Entertainment
- Engadget
Android's screen reader can now answer questions about images
Today is Global Accessibility Awareness Day (GAAD), and, as in years past, many tech companies are marking the occasion by announcing new assistive features for their ecosystems. Apple got things rolling on Tuesday, and now Google is joining the parade.

To start, the company has made TalkBack, Android's built-in screen reader, more useful. With the help of one of Google's Gemini models, TalkBack can now answer questions about images displayed on your phone, even if they don't have any alt text describing them. "That means the next time a friend texts you a photo of their new guitar, you can get a description and ask follow-up questions about the make and color, or even what else is in the image," explains Google. Gemini can see and understand the image thanks to the multimodal capabilities Google built into the model. Additionally, the Q&A functionality works across the entire screen. So, for example, if you're doing some online shopping, you can first ask your phone to describe the color of the piece of clothing you're interested in and then ask if it's on sale.

Separately, Google is rolling out a new version of its Expressive Captions. First announced at the end of last year, the feature generates subtitles that attempt to capture the emotion of what's being said. For instance, if you're video chatting with some friends and one of them groans after you make a lame joke, your phone will not only subtitle what they said but also include "[groaning]" in the transcription. With the new version of Expressive Captions, the resulting subtitles will reflect when someone drags out the sound of their words. That means the next time you're watching a live soccer match and the announcer yells "goallllllll," their excitement will be properly transcribed. Plus, there will be more labels for sounds, like when someone is clearing their throat.

The new version of Expressive Captions is rolling out to English-speaking users in the US, UK, Canada and Australia running Android 15 and above on their phones.
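Google hasn't said how Expressive Captions decides when a word has been dragged out, but the general idea of comparing how long a word was actually held against its typical duration, then repeating a letter to match, can be shown with a small toy function. Everything below (thresholds, durations, which letter gets repeated) is invented for illustration and is not Google's method:

```python
# Toy sketch of "duration-aware" captioning: stretch a vowel in a word when the
# speaker held it much longer than usual. Thresholds and the choice of letter
# to repeat are made up for illustration only.
def stretch_word(word: str, spoken_secs: float, typical_secs: float) -> str:
    """Return the word with a repeated letter if it was noticeably drawn out."""
    if typical_secs <= 0 or spoken_secs <= typical_secs * 1.5:
        return word  # not elongated enough to bother

    # Repeat the last vowel in proportion to how stretched the word was.
    extra = min(int(spoken_secs / typical_secs), 6)  # cap the exaggeration
    vowels = "aeiou"
    for i in range(len(word) - 1, -1, -1):
        if word[i].lower() in vowels:
            return word[: i + 1] + word[i].lower() * extra + word[i + 1 :]
    return word  # no vowel found; leave it alone


if __name__ == "__main__":
    print(stretch_word("amazing", spoken_secs=1.8, typical_secs=0.5))  # -> "amaziiing"
    print(stretch_word("no", spoken_secs=1.2, typical_secs=0.2))       # -> "nooooooo"
    print(stretch_word("shot", spoken_secs=0.3, typical_secs=0.3))     # -> "shot" (unchanged)
```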


TechCrunch
15-05-2025
- TechCrunch
Google rolls out new AI and accessibility features to Android and Chrome
Google announced on Thursday that it's rolling out new AI and accessibility features to Android and Chrome. Most notably, TalkBack, Android's screen reader, now lets you ask Gemini about what's in images and what's on your screen.

Last year, Google brought Gemini's capabilities to TalkBack to give people who are blind or have low vision access to AI-generated descriptions for images, even when alt text isn't available. Now, people can ask questions and get responses about their images. For example, if a friend texts you a photo of their new guitar, you can get a description of it and ask questions about the brand and color. In addition, you can now get descriptions and ask questions about your whole phone screen. So, if you're shopping in an app, you can ask Gemini about the material of an item you're interested in or whether there is a discount available.

Google also announced today that it's updating Expressive Captions, Android's real-time captions feature that uses AI to capture what someone says and how they say it. Google says it's aware that one of the ways people express themselves is by dragging out the sound of their words, which is why it has developed a new duration feature for Expressive Captions. Now, you'll know if a sports announcer is calling out an 'amaaazing shot' or when someone isn't simply saying 'no' but 'nooooo.' You'll also start to see new labels for sounds, such as when a person is whistling or clearing their throat.

The update is rolling out in English in the U.S., U.K., Canada, and Australia for devices running Android 15 and above.

Google is also making it easier to access PDFs on Chrome. Until now, you couldn't use your screen reader to interact with a scanned PDF in your desktop Chrome browser. Now, Chrome automatically recognizes these types of PDFs, allowing you to highlight, copy, and search for text like any other page and use your screen reader to read them. This is thanks to the introduction of Optical Character Recognition (OCR), Google says.

Plus, Page Zoom in Chrome on Android now lets you increase the size of the text you see without affecting the webpage layout. You can customize how much you want to zoom in and then choose to apply the preference to all of the pages you visit, or just certain ones. You can access this feature by tapping the three-dot menu in the top right corner of Chrome.
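Chrome's OCR for scanned PDFs is built into the browser, but the underlying step it describes, recognizing text in page images so it becomes selectable and searchable, can be approximated with common open-source tools. The sketch below is only a rough stand-in, assuming the pdf2image and pytesseract packages and a hypothetical scanned.pdf; it is not Chrome's code:

```python
# Rough sketch of the OCR idea Chrome now applies to scanned PDFs: rasterize
# each page, run optical character recognition, and expose the result as
# searchable text. Uses the open-source pdf2image and pytesseract packages;
# "scanned.pdf" is a made-up file name.
from pdf2image import convert_from_path  # requires the poppler utilities
import pytesseract                        # requires the Tesseract OCR engine

pages = convert_from_path("scanned.pdf", dpi=300)  # one PIL image per page

for number, page in enumerate(pages, start=1):
    text = pytesseract.image_to_string(page)  # recognized text for this page
    print(f"--- Page {number} ---")
    print(text)  # now available to copy, search, or hand to a screen reader
```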


Android Authority
15-05-2025
- Entertainment
- Android Authority
'Yaaaaaay!' Google's latest accessibility tweaks include stretching out captions for emphasis
TL;DR
- Google is celebrating Global Accessibility Awareness Day with some new functionality for Expressive Captions, TalkBack, and more.
- TalkBack is gaining support for follow-up questions, letting users ask things like what color objects are.
- Chrome also gets some handy upgrades, like OCR for image-based PDF files.

Google's big developer conference is less than a week away at this point, with Google I/O 2025 kicking off next Tuesday, May 20. Some companies would sit on all their big announcements ahead of an event like that, hoping for maximum fanfare by sharing them all at once, but that is not Google's MO this time around. Earlier this week we got the scoop on Material 3 Expressive thanks to Google's Android Show stream, and today we're learning about a bunch of new features aimed at making the company's solutions equally usable by everybody, just in time for Global Accessibility Awareness Day.

Live Caption was already a fantastic tool for adding easy-to-read text to media that didn't natively offer the option, and last winter we saw Google supercharge it with Expressive Captions. In addition to better support for describing sound effects, Expressive Captions could do things like formatting text IN ALL CAPS to convey shouting. Now Google is upgrading Expressive Captions further, giving it support not just for recognizing new sounds, but also for expressing drawn-out utterances by repeating letters. Yesssssss!

Last year at I/O, Google shared that Android's TalkBack screen reader was getting a Gemini-powered upgrade, learning to describe photos using on-device processing. Now Google is making that a whole lot more conversational by letting users ask specific questions about photos, prompting TalkBack to follow up with more detail. Anything appearing on screen is fair game, so your questions don't have to be specifically about pictures.

PDFs can be really hit-or-miss when it comes to accessibility: while some are nicely formatted text documents that are perfect for something like TalkBack to read to you, other times we're stuck with a scanned image. Thankfully, Google is doing something about just that, and shares that Chrome for desktop is picking up the ability to perform optical character recognition (OCR) on such image-based PDFs. More than just helping support screen readers, that also means it's going to be easy to copy text right out of them.

Google Chrome is also getting some special attention on Android with improvements to how Page Zoom works. Google is fixing how zoomed-in webpages maintain proper formatting, and letting you save custom Page Zoom settings for different sites.

Those are the biggies among Google's GAAD announcements, and they amount to some very useful-sounding upgrades to the company's suite of accessibility solutions. Beyond these, Google also shares news about improved tools for students using Chromebooks and some of its speech-recognition efforts for non-standard speakers and across additional languages. You can check out the full details of those on Google's blog.