Latest news with #videoanalysis


Times of Oman
9 hours ago
- Times of Oman
Google Gemini now supports video uploads for analysis
Washington: Google has rolled out an update to its Gemini app that lets users upload videos for analysis. Users can ask questions about a video's content or have Gemini describe the clip, as per The Verge. The update hasn't rolled out universally yet, but users on iOS and Android devices may already have access to it.

Key features of video upload and analysis:
- Video analysis: Gemini can analyse uploaded video files and provide insights or answers to user queries.
- Question answering: Users can ask about specific video content, such as identifying objects, actions, or text within the video.
- Video player interface: The uploaded video appears above the chat interface, so users can watch the clip again if needed.

Availability and limitations:
- Platform support: Video upload is currently available on iOS and Android, with availability varying across accounts and devices.
- Web support: The feature is not yet live on the web version of Gemini, where uploads fail with a "File type unsupported" message.
- Camera limitation: The built-in Gemini camera still doesn't support capturing video.
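The article describes the consumer Gemini app, but developers can do the same kind of video analysis programmatically through the Gemini API. Below is a minimal sketch, assuming the google-generativeai Python package and an API key in the GOOGLE_API_KEY environment variable; the file name, model choice, and prompt are placeholders.

```python
import os
import time

import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# Upload the clip; the File API accepts common video formats such as MP4.
video = genai.upload_file("clip.mp4")  # placeholder path

# Video files are processed asynchronously; poll until the file is ready.
while video.state.name == "PROCESSING":
    time.sleep(5)
    video = genai.get_file(video.name)
if video.state.name == "FAILED":
    raise RuntimeError("Video processing failed")

# Ask a question about the clip, much as the app feature does interactively.
model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content([video, "Describe what happens in this clip."])
print(response.text)
```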

Associated Press
9 hours ago
- Business
- Associated Press
AVer Europe and DeepNeuronic Announce Strategic Partnership to Deliver AI-Driven Video Solutions
ROTTERDAM, NETHERLANDS, June 18, 2025 -- AVer Information Europe B.V., a leading provider of advanced video conferencing, education and ProAV solutions, has announced a strategic partnership with DeepNeuronic, an AI startup that develops solutions for detecting dangerous activities. The collaboration pairs AVer's camera technology with DeepNeuronic's AI-driven video analysis: by combining AVer's imaging capabilities with DeepNeuronic's detection software, the companies aim to co-develop solutions that support safer environments and more responsive security systems across a wide range of sectors.

Founded in 2021, DeepNeuronic specializes in applying deep neural networks to automatic vision, enabling law enforcement agencies, private security firms, and other organizations to quickly identify and respond to public crimes and threatening behaviour. The partnership with AVer Europe will give DeepNeuronic the high-quality video inputs necessary for optimal AI performance, especially in mission-critical environments.

'This partnership is a perfect match,' said Jose Rincon, Head of Product Management at AVer Europe. 'DeepNeuronic brings an exceptional AI platform to the table, and with AVer's premium camera technology, we can offer a smarter, more effective solution to our shared customers. We are excited to begin this journey together.'

While this announcement marks the beginning of their collaboration, both companies plan to showcase joint solutions in the near future. 'We see immense potential in this partnership,' said Rene Buhay, SVP Sales & Marketing at AVer Europe. 'As AI becomes more central to public safety and surveillance, combining our technologies will drastically enhance the quality and effectiveness of the insights we provide.'

The partnership signals the start of a broader strategic alignment between the two companies. 'Our partnership with AVer enables us to bring advanced AI-driven video analytics into healthcare environments where safety, responsiveness, and operational efficiency are critical,' said Vasco Lopes, CEO of DeepNeuronic. 'By combining our intelligent surveillance technology with AVer's reliable hardware, we can help hospitals and care facilities detect incidents and abnormal activities in real time, from patient falls to unauthorized access, while respecting privacy and complying with strict data protection standards. This collaboration is a significant step toward smarter, safer healthcare.'

While the initial phase focuses on collaboration and exploration, both AVer Europe and DeepNeuronic are committed to driving innovation that will help shape the future of intelligent video solutions.

About AVer Europe: AVer Europe is a leading provider of video conferencing, education technology, and ProAV solutions. With a strong focus on innovation and quality, AVer delivers cutting-edge products designed to enhance communication and collaboration across various industries.

About DeepNeuronic: DeepNeuronic is a Portuguese AI startup specializing in real-time video analytics. Founded in 2021 and based in Covilhã, the company transforms existing IP cameras into intelligent monitoring systems capable of detecting threats, abnormal behavior, and security incidents with high accuracy. DeepNeuronic serves sectors such as smart cities, transportation, healthcare, and critical infrastructure, all while ensuring GDPR compliance. Its AI technology reduces false alarms by up to 98% and enables proactive, cost-effective security without the need for new hardware. Core leadership includes Vasco Lopes (CEO), Bruno Degardin (CTO), and Vítor Crespo (CSO).
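DeepNeuronic's software is proprietary, but the general pattern the release describes, running a detection model over the feed of an existing IP camera, is easy to illustrate. Below is a minimal sketch assuming OpenCV and the Ultralytics YOLO package; the stream URL, model, confidence threshold, and alert logic are illustrative placeholders, not DeepNeuronic's actual stack.

```python
import cv2
from ultralytics import YOLO  # generic pretrained detector as a stand-in

# Placeholder URL; any RTSP-capable IP camera works the same way.
STREAM_URL = "rtsp://camera.local/stream1"

model = YOLO("yolov8n.pt")  # small off-the-shelf model for illustration
cap = cv2.VideoCapture(STREAM_URL)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # Detect objects in the frame. Production systems sample frames and
    # add temporal logic across frames to keep false alarms down.
    results = model(frame, verbose=False)
    for box in results[0].boxes:
        label = model.names[int(box.cls)]
        if label == "person" and float(box.conf) > 0.8:
            # An alerting hook (webhook, VMS event, etc.) would go here.
            print("High-confidence detection:", label)

cap.release()
```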


The Verge
9 hours ago
- The Verge
Posted Jun 18, 2025 at 5:50 AM EDT
Gemini is getting video uploads. Yesterday Google announced updates to its Gemini models, including a new 2.5 Flash-Lite, but didn't mention a bigger change: the Gemini app apparently now lets you upload videos for analysis, asking Gemini to describe clips or answer questions about video content. I say 'apparently' because the option hasn't appeared on our devices yet, though 9to5Google says availability 'varies,' so you might be lucky. It's only for iOS and Android, with no web support yet.


Android Authority
11 hours ago
- Android Authority
Gemini's new video analysis feature is here, but there's more coming (APK teardown)
TL;DR
- The latest version of the Google app lets users upload up to five minutes of video into Gemini for analysis.
- Gemini has supported video analysis through a YouTube integration, but rivals like ChatGPT have long offered direct video uploads, so Gemini is finally catching up.
- A future update will let users record videos directly within the app, expanding the current photo-only capture, as spotted in our APK teardown.

Google has been steadily adding features to Gemini to help it compete against other AI-based digital assistants. ChatGPT already allows users to upload videos for analysis, a handy feature Gemini has lacked. We've long known that video analysis was coming to Gemini, and we even presented an early demo. With Google app v16.23.69, Google is finally rolling out the ability to upload videos to Gemini for analysis.

An APK teardown helps predict features that may arrive on a service in the future based on work-in-progress code. However, it is possible that such predicted features may not make it to a public release.

Videos up to five minutes long (combined) can be attached, which should suffice for most casual use cases. If you need to analyze a longer video, you can upload it to YouTube as an unlisted video and paste the link into Gemini, which will analyze it as a YouTube video. The feature is rolling out gradually, so you may need to wait to see it on your end. To check whether you have it, tap the plus button in the Gemini text box, select Gallery or Files, and see whether you can pick any video files. If video files are grayed out, the feature isn't live for you just yet.

While the video analysis feature is great, it doesn't yet integrate with the Camera option in the attachment sheet: you can take a photo from within Gemini to attach to a prompt, but you cannot record a video. Google is aware of this limitation, as code within this app version fixes the oversight, and we managed to activate it for an early look. Gemini's camera viewfinder will soon let users switch between taking photos and recording videos to attach to their prompt. This video upgrade to the viewfinder is not currently available to users, but expanding capture to cover videos makes sense for Gemini, so we hope to see it roll out soon. We'll keep you updated when we learn more.
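The YouTube workaround has a developer-side counterpart: the Gemini API accepts a YouTube URL directly as part of a prompt. Here is a minimal sketch assuming the newer google-genai Python SDK and an API key in the GOOGLE_API_KEY environment variable; the model name and video URL are placeholders, and this is the API path rather than the in-app behavior described above.

```python
from google import genai
from google.genai import types

client = genai.Client()  # picks up GOOGLE_API_KEY from the environment

response = client.models.generate_content(
    model="gemini-2.0-flash",  # placeholder model name
    contents=types.Content(parts=[
        # The API fetches and analyzes the referenced YouTube video itself.
        types.Part(file_data=types.FileData(
            file_uri="https://www.youtube.com/watch?v=VIDEO_ID"  # placeholder
        )),
        types.Part(text="Summarize this video in three bullet points."),
    ]),
)
print(response.text)
```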

Associated Press
10 June 2025
- Science
- Associated Press
PolyU develops novel multi-modal agent to facilitate long video understanding by AI, accelerating development of generative AI-assisted video analysis
HONG KONG SAR - Media OutReach Newswire - 10 June 2025 - While Artificial Intelligence (AI) technology is evolving rapidly, AI models still struggle to understand long videos. A research team from The Hong Kong Polytechnic University (PolyU) has developed a novel video-language agent, VideoMind, that enables AI models to perform long-video reasoning and question-answering tasks by emulating the human way of thinking. The VideoMind framework incorporates an innovative Chain-of-Low-Rank Adaptation (LoRA) strategy that reduces the demand for computational resources and power, advancing the application of generative AI in video analysis. The findings have been submitted to world-leading AI conferences.

Videos, especially those longer than 15 minutes, carry information that unfolds over time, such as the sequence of events, causality, coherence and scene transitions. To understand the content, AI models therefore need not only to identify the objects present but also to account for how they change throughout the video. Because visuals in videos occupy a large number of tokens, video understanding requires vast amounts of computing capacity and memory, making it difficult for AI models to process long videos.

Prof. Changwen Chen, Interim Dean of the PolyU Faculty of Computer and Mathematical Sciences and Chair Professor of Visual Computing, and his team achieved this breakthrough by referencing a human-like process of video understanding and introducing a role-based workflow. The framework comprises four roles: the Planner, which coordinates all other roles for each query; the Grounder, which localises and retrieves relevant moments; the Verifier, which validates the accuracy of the retrieved moments and selects the most reliable one; and the Answerer, which generates the query-aware answer. This progressive approach to video understanding addresses the challenge of temporally grounded reasoning that most AI models face.

Another core innovation of VideoMind lies in its Chain-of-LoRA strategy. LoRA is a fine-tuning technique, emerged in recent years, that adapts AI models for specific uses without full-parameter retraining. The Chain-of-LoRA strategy pioneered by the team applies four lightweight LoRA adapters within a unified model, each dedicated to one role. During inference, the model dynamically activates the role-specific adapters via self-calling, seamlessly switching among the roles. This eliminates the need and cost of deploying multiple models while enhancing the efficiency and flexibility of a single model.

VideoMind is open source on GitHub and Hugging Face, along with details of the experiments evaluating its effectiveness in temporally grounded video understanding across 14 diverse benchmarks.
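The Chain-of-LoRA idea, one base model carrying several role-specific LoRA adapters that are swapped at inference time, can be sketched with the HuggingFace PEFT library. The adapter names, paths, prompts, and routing below are illustrative only, and a small text-only model stands in for VideoMind's Qwen2-VL backbone; the actual implementation is available in the project's GitHub repository.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Text-only stand-in for the multimodal backbone (illustrative choice).
BASE = "Qwen/Qwen2-1.5B-Instruct"
base_model = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(BASE)

# Load one LoRA adapter per role onto the same shared base weights.
# The "adapters/..." paths are hypothetical placeholders.
model = PeftModel.from_pretrained(base_model, "adapters/planner", adapter_name="planner")
model.load_adapter("adapters/grounder", adapter_name="grounder")
model.load_adapter("adapters/verifier", adapter_name="verifier")
model.load_adapter("adapters/answerer", adapter_name="answerer")

def run_role(role: str, prompt: str) -> str:
    """Activate one role's LoRA adapter and generate; base weights are shared."""
    model.set_adapter(role)
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Schematic pipeline: plan, ground, verify, answer.
plan = run_role("planner", "Question: why does the crowd cheer? Break this into steps.")
moment = run_role("grounder", "Locate the relevant moment for: " + plan)
checked = run_role("verifier", "Verify this candidate moment: " + moment)
print(run_role("answerer", "Answer the question using: " + checked))
```

Because only the small adapter weights differ between roles, switching roles costs almost nothing compared with hosting four separate models, which is the efficiency gain the release describes.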
In comparisons with state-of-the-art AI models, including GPT-4o and Gemini 1.5 Pro, the researchers found that VideoMind's grounding accuracy outperformed all competitors on challenging tasks involving videos with an average duration of 27 minutes. Notably, the team tested two versions of VideoMind: one built on a smaller 2-billion-parameter (2B) model and another on a larger 7-billion-parameter (7B) model. The results showed that, even at the 2B size, VideoMind performed comparably to many of the other 7B models.

Prof. Chen said, 'Humans switch among different thinking modes when understanding videos: breaking down tasks, identifying relevant moments, revisiting these to confirm details and synthesising their observations into coherent answers. The process is very efficient, with the human brain using only about 25 watts of power, about a million times less than a supercomputer with equivalent computing power. Inspired by this, we designed the role-based workflow that allows AI to understand videos like humans, while leveraging the chain-of-LoRA strategy to minimise the need for computing power and memory in the process.'

AI is at the core of global technological development, but the advancement of AI models is constrained by insufficient computing power and excessive power consumption. Built upon the unified, open-source model Qwen2-VL and augmented with additional optimisation tools, the VideoMind framework lowers the technological cost and the threshold for deployment, offering a feasible solution to the bottleneck of reducing power consumption in AI models.

Prof. Chen added, 'VideoMind not only overcomes the performance limitations of AI models in video processing, but also serves as a modular, scalable and interpretable multimodal reasoning framework. We envision that it will expand the application of generative AI to various areas, such as intelligent surveillance, sports and entertainment video analysis, video search engines and more.'