Latest news with #Qwen2.5-Omni-7B

Jordan News

3 days ago

Business

Xiaomi's Audio Model Reshapes the Landscape of Auditory Intelligence

Xiaomi has unveiled its new open-source audio model, MiDashengLM-7B, a major step in its effort to strengthen the technical backbone of its platforms, including smart home devices and electric vehicles. The model builds on Xiaomi's foundational audio system, Xiaomi Dasheng.

Advanced Architecture and Unified Sound Understanding

According to Xiaomi's post on the Chinese social media platform Weibo, MiDashengLM-7B represents a significant advance in audio comprehension. Its architecture uses Xiaomi Dasheng as the audio encoder and the Qwen2.5-Omni-7B model as the decoder, producing a single system that understands speech, environmental sounds, and music in a unified manner.

Innovative Training for Deeper Acoustic Insight

The model's training strategies are designed to capture deeper auditory meaning, including speaker emotions, spatial echo, and other nuanced features often missed by traditional audio models.

High Benchmark Performance

MiDashengLM-7B has demonstrated superior performance across 22 public evaluation datasets covering a wide range of tasks, including audio captioning, audio comprehension, audio-based Q&A, and speech recognition. Its first-token response time in single-pass inference is just a quarter of what leading models require. It also processes 20 times more audio samples simultaneously under the same GPU memory constraints, giving Xiaomi a clear edge in performance.

Precision Audio Processing

The model has outperformed notable systems such as Whisper and Kimi-Audio on the X-ARES benchmarks, especially in non-speech tasks. Dasheng is also used for audio generation tasks such as noise reduction and auditory enhancement. Notably, Xiaomi's Dasheng-Denoiser has already been featured at major international conferences such as Interspeech 2025, showcasing its ability to turn noisy speech into clean audio using targeted encoding and advanced audio restoration networks.

Efficient Resource Utilization

MiDashengLM also shows impressive inference speed. For instance, it can process 512 audio samples (30 seconds each) with 80 GB of memory, while competing models struggle beyond 16 samples. This efficiency is tied to a reduction in the audio encoder's output frame rate from 25 Hz to 5 Hz, which cuts the required computation by up to 80%.

Fully Open Dataset

The model was trained entirely on publicly available data, amounting to 1.1 million hours spanning speech recognition, environmental sound understanding, music analysis, non-verbal behavior, and audio-based interactive tasks.

Redefining Audio Data Processing

One of MiDashengLM's key breakthroughs is its departure from traditional ASR (Automatic Speech Recognition) training. Instead, it uses comprehensive descriptive alignment that integrates all types of sound content, including speech, ambient sounds, and music. This shift reduces the loss of valuable data: conventional ASR methods often discard much of the audio content, sometimes up to 90%.

Real-World Applications and Offline Capabilities

MiDashengLM has broad applications, such as providing custom feedback during voice training or language learning, offering real-time insights while driving, or serving as an intelligent assistant that can answer questions about environmental sounds. Xiaomi also plans to extend the model to support offline operation on edge devices, along with enhanced voice editing based on natural language commands.

Transparency and Open Collaboration

In a move toward full transparency, Xiaomi disclosed all dataset details, including the distribution ratios across 77 sources, and the entire training process, from the encoder's initial pretraining to the final fine-tuning. The model is released under the Apache 2.0 license, allowing commercial and academic use. Xiaomi has invited the developer community to contribute via GitHub, reinforcing its philosophy of openness, transparency, and collaborative innovation.
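
The efficiency figures quoted in this article can be checked with simple arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and it assumes that the work done downstream of the encoder scales roughly with the number of encoder output frames, which is a simplification the article itself does not state.

```python
def encoder_frames(duration_s: float, frame_rate_hz: float) -> int:
    """Number of encoder output frames for a clip of the given length."""
    return round(duration_s * frame_rate_hz)


def relative_reduction(before: float, after: float) -> float:
    """Fractional decrease when going from `before` to `after`."""
    return (before - after) / before


# Figures taken from the article: 30-second clips, 25 Hz vs 5 Hz output rate.
clip_seconds = 30
frames_25hz = encoder_frames(clip_seconds, 25)
frames_5hz = encoder_frames(clip_seconds, 5)

# Assumption: cost is proportional to frame count, so this is the compute saving.
reduction = relative_reduction(frames_25hz, frames_5hz)

# Batch sizes quoted for an 80 GB memory budget: 512 samples vs 16 samples.
batch_ratio = 512 / 16

print(f"frames per clip: {frames_25hz} at 25 Hz, {frames_5hz} at 5 Hz")
print(f"reduction in frames: {reduction:.0%}")
print(f"batch size ratio: {batch_ratio:.0f}x")
```

Under that proportionality assumption, dropping from 25 Hz to 5 Hz removes four out of every five frames, which matches the "up to 80%" figure reported.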

Xiaomi launches open-source voice model for cars, home devices

Economic Times

5 days ago

Automotive

Security purposes: A 24/7 sound monitoring and alert system for unusual ambient sound.
Language and pronunciation: The AI model offers real-time pronunciation feedback inside cars for learning any language during a commute.

Chinese technology major Xiaomi released an open-source voice model called MiDashengLM-7B on Monday, which is already functional in smart home systems and cars in China.

According to a report by Bloomberg News, the new MiDashengLM-7B foundational model is integrated with Alibaba Group's open-source Qwen2.5-Omni-7B. The report further added that the AI model has been trained on publicly available data and is released under the permissive Apache 2.0 license. The Apache 2.0 license is a free software license that allows users to use, modify, and distribute the software for any purpose, including commercial use, with relatively few restrictions.

MiDashengLM-7B caters to not just speech inputs but also understands ambient sounds, background music, and environmental noises. A report by India Today says that the model is set to enable 30 smart features across its product lineup. Xiaomi claims that the model can take up to 20 times more requests compared to similar models.

According to a report by Reuters, the company plans to invest at least $6.93 billion in chip design over at least 10 years. The development reaffirms Xiaomi's focus on technological innovation, especially on artificial intelligence (AI).

Time of India

5 days ago

Automotive

Xiaomi launches open-source voice model for cars, home devices

Security purposes: A 24/7 sound monitoring and alert system for unusual ambient sound.
Language and pronunciation: The AI model offers real-time pronunciation feedback inside cars for learning any language during a commute.

Chinese technology major Xiaomi released an open-source voice model called MiDashengLM-7B on Monday, which is already functional in smart home systems and cars in China.

According to a report by Bloomberg News, the new MiDashengLM-7B foundational model is integrated with Alibaba Group's open-source Qwen2.5-Omni-7B. The report further added that the AI model has been trained on publicly available data and is released under the permissive Apache 2.0 license. The Apache 2.0 license is a free software license that allows users to use, modify, and distribute the software for any purpose, including commercial use, with relatively few restrictions.

MiDashengLM-7B caters to not just speech inputs but also understands ambient sounds, background music, and environmental noises. A report by India Today says that the model is set to enable 30 smart features across its product lineup. Xiaomi claims that the model can take up to 20 times more requests compared to similar models.

According to a report by Reuters, the company plans to invest at least $6.93 billion in chip design over at least 10 years. The development reaffirms Xiaomi's focus on technological innovation, especially on artificial intelligence (AI).

India Today

5 days ago

Automotive

Xiaomi's new AI model lets you control cars and home appliances with your voice: Details inside

Xiaomi has just introduced a new AI voice model called MiDashengLM-7B, and it's not just another lab experiment by the Chinese smartphone giant. This one's already working inside real-world devices (smart home systems and cars in China), and it's available as open source. The company says the model was trained using only publicly available data and is released under the permissive Apache 2.0 licence. That means developers and companies can use it freely, for both research and commercial projects.

Of course, Xiaomi is playing a longer game here. By open-sourcing the model, it's not just showing off technical skills; the company is aiming to build a wider developer base. And in the current AI race, having a strong developer ecosystem could turn out to be the real advantage.

As per Xiaomi, the MiDashengLM-7B performs better than most other voice systems when it comes to speed and multitasking, which could make it appealing for anyone building AI features into everyday devices.

What makes the AI model stand out is its wide range of capabilities. It isn't built just for speech. The model can also understand ambient sounds, background music, and environmental noises, all within the same system. This is thanks to a unified training strategy that uses Xiaomi's Dasheng audio encoder along with the Qwen2.5-Omni-7B decoder, originally developed by Alibaba. Together, this tech combo enables Xiaomi's AI to pick up on a lot more than just spoken words. It can, for example, detect abnormal sounds in your living room or respond to claps and snaps as control commands.

MiDashengLM-7B isn't a future-facing demo or some early prototype. Xiaomi says the model is already powering more than 30 smart features across its product lineup. In smart homes, for example, it's used for security, with 24/7 sound monitoring and alerts for unexpected noises. In cars, it supports voice commands and even offers features like real-time pronunciation feedback if you're practising a new language during your commute. There's even an underwater wake-up mode for some devices, which uses sound cues instead of traditional touch input.

One big point Xiaomi is emphasising is that the model runs efficiently. The company claims it has significantly lower response delays and can handle 20 times more requests at the same time compared to similar models, without using extra memory. This matters because many devices using AI are still limited by their hardware. More efficient models mean better performance without needing constant internet access or powerful servers.

Xiaomi Unveils New AI Voice Model to Boost Auto, Home Tech

Hindustan Times

5 days ago

Automotive

Xiaomi Corp. on Monday released an open-source voice model to complement its automotive and home appliance technologies, further heating up the race to build AI tools for more than just text.

The new MiDashengLM-7B is based on Xiaomi's foundational voice model, which has been deployed in cars and smart home gadgets, with integration of Alibaba Group Holding Ltd.'s open-source Qwen2.5-Omni-7B. The Beijing-based phone and auto maker detailed the advancements and provided benchmarks in a post on its WeChat account.

Xiaomi has been aggressively pursuing new growth drivers outside of its core smartphone business, with electric vehicles now fast becoming one of its priority business areas. At the same time, investing in the development of artificial intelligence has grown into an overriding priority across China's tech sector, and many of the leading companies have opted to make their work open source to secure customers.

Major Chinese internet companies from Alibaba to Tencent Holdings Ltd. have released various models that can handle images, video and sound in recent months to better compete with the likes of OpenAI's Sora. Both US President Donald Trump and Chinese leader Xi Jinping have emphasized the need for their countries to secure a leading position in the AI race.

©2025 Bloomberg L.P.
