Xiaomi's Audio Model Reshapes the Landscape of Auditory Intelligence - Jordan News

Jordan News, 4 days ago
Xiaomi has unveiled MiDashengLM-7B, a new open-source audio model, as part of its effort to strengthen the technical backbone of its platforms, including smart home devices and electric vehicles. The model builds on Xiaomi's foundational audio system, Xiaomi Dasheng.

Advanced Architecture and Unified Sound Understanding

According to Xiaomi's post on the Chinese social media platform Weibo, MiDashengLM-7B represents a significant advance in audio comprehension. Its architecture uses Xiaomi Dasheng as the audio encoder and the Qwen2.5-Omni-7B model as the decoder, producing a single system that understands speech, environmental sounds, and music in a unified manner.

Innovative Training for Deeper Acoustic Insight

The model's training strategy targets audio scene interpretation as a whole, enabling it to capture deeper auditory meaning, including speaker emotion, spatial echo, and other nuanced features often missed by traditional audio transformation models.

High Benchmark Performance

MiDashengLM-7B has demonstrated superior performance across 22 public evaluation datasets covering a wide range of tasks, including audio captioning, audio comprehension, audio-based Q&A, and speech recognition. Its first-token response time in single-pass inference is just a quarter of what leading models require. It also processes 20 times more audio samples simultaneously under the same GPU memory constraints, giving Xiaomi a clear edge in performance.

Precision Audio Processing

The model has outperformed notable systems such as Whisper and Kimi-Audio on the X-ARES benchmarks, especially in non-speech tasks. Dasheng is also used for audio generation tasks such as noise reduction and auditory enhancement. Notably, Xiaomi's Dasheng-Denoiser has already been featured at major international conferences such as Interspeech 2025, showcasing its ability to turn noisy speech into clean audio using targeted encoding and advanced audio restoration networks.

Efficient Resource Utilization

MiDashengLM also shows impressive inference efficiency. For instance, it can process 512 audio samples (30 seconds each) within 80GB of memory, while competing models struggle beyond 16 samples. Contributing to this efficiency, the audio encoder's output frame rate was reduced from 25 Hz to 5 Hz, cutting the required computation by up to 80%.

Fully Open Dataset

The model was trained entirely on publicly available data, amounting to 1.1 million hours across a wide range of fields: speech recognition, environmental sound understanding, music analysis, non-verbal behavior, and audio-based interactive tasks.

Redefining Audio Data Processing

One of MiDashengLM's key breakthroughs is its departure from traditional ASR (Automatic Speech Recognition) systems. Instead of aligning audio with transcripts alone, it uses comprehensive descriptive alignment that covers all types of sound content, including speech, ambient sounds, and music. This shift reduces the loss of valuable data: conventional ASR methods often discard it, sometimes up to 90% of the audio content.

Real-World Applications and Offline Capabilities

MiDashengLM has broad applications, such as providing custom feedback during voice training or language learning, offering real-time insights while driving, or serving as an intelligent assistant that can answer questions about environmental sounds. Xiaomi also plans to extend the model to support offline operation on edge devices, along with enhanced voice editing features driven by natural language commands.
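The efficiency figures quoted above can be sanity-checked with simple arithmetic. The short Python sketch below is illustrative only and is not Xiaomi's code: the function name is hypothetical, and it assumes the encoder emits a fixed number of output frames per second of audio. It shows that a 30-second clip yields 750 frames at 25 Hz and 150 frames at 5 Hz, an 80% reduction in frames. Whether total computation falls by the same proportion depends on the rest of the model, which the article does not detail.

```python
def encoder_frames(duration_s: float, frame_rate_hz: float) -> int:
    """Number of encoder output frames for a clip (assumes a fixed frame rate)."""
    return round(duration_s * frame_rate_hz)


CLIP_SECONDS = 30  # clip length quoted in the article

frames_25hz = encoder_frames(CLIP_SECONDS, 25)  # frame rate before the change
frames_5hz = encoder_frames(CLIP_SECONDS, 5)    # frame rate after the change

# Fraction of encoder output frames removed by the lower frame rate.
reduction = 1 - frames_5hz / frames_25hz

# Ratio of the two batch sizes quoted in the article (512 vs. 16 samples).
batch_ratio = 512 / 16

print(f"Frames per clip: {frames_25hz} -> {frames_5hz}")
print(f"Frame reduction: {reduction:.0%}")
print(f"Batch size ratio: {batch_ratio:.0f}x")
```

Note that the batch size ratio computed here is simply the quotient of the two quoted batch sizes; it is a different measure from the separate "20 times" figure the article gives.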
Transparency and Open Collaboration

In a move toward full transparency, Xiaomi has disclosed all dataset details, including the distribution ratios across 77 sources, as well as the entire training process, from the encoder's initial pretraining to the final fine-tuning. The model is released under the Apache 2.0 license, which permits both commercial and academic use. Xiaomi has invited the developer community to contribute via GitHub, reinforcing its philosophy of openness, transparency, and collaborative innovation.

Related Articles

U.S. Nuclear Weapons Agency Hit by Widespread Cyberattack - Jordan News
Jordan News, 7 days ago

Bloomberg has reported that the National Nuclear Security Administration (NNSA), a division of the U.S. Department of Energy responsible for the design and maintenance of the country's nuclear arsenal, has fallen victim to a significant cyberattack. The breach exploited a critical zero-day vulnerability in Microsoft's SharePoint platform.

Details of the Attack and Its Impact

According to a Department of Energy spokesperson, the attack began on Friday, July 18. Despite the seriousness of the vulnerability, a source familiar with the investigation confirmed that the attackers did not gain access to any classified information. The department stated that the damage was very limited, affecting only a small number of on-premises servers running SharePoint. It attributed the limited impact to its reliance on Microsoft's M365 cloud services and its advanced cybersecurity infrastructure.

Perpetrators and Scope of the Breach

Microsoft has attributed the attack to a state-sponsored hacking group linked to the Chinese government. The group reportedly exploited vulnerabilities in SharePoint to infiltrate systems, gain control, and steal security credentials and access tokens. According to Google's Threat Analysis Group, the exploited vulnerability is considered 'a dream for ransomware operators' because it can provide persistent unauthorized access and evade future security patches.

The attack was not limited to the NNSA. Other victims included the U.S. Department of Education, the Florida Department of Revenue, and several government systems in countries across the Middle East and Europe.

Response Measures

On Monday, Microsoft released a new security update to address the active attacks targeting on-premises SharePoint servers. The company emphasized that cloud-based servers were not affected.

The global economic battle between the U.S. and China
Ammon, 03-08-2025

Raad Mahmoud Al-Tal

More than six months after the latest round of tensions between the United States and China, it's clear that this is more than just a trade dispute. It has become a larger fight over who will lead the world economy and technology in the future. The U.S. wants to slow down China's rise by adding trade and tech restrictions. China, in return, is trying to protect its economy through stimulus plans, finding new markets, and becoming more self-reliant.

In the second quarter of 2025, China's economy grew by 5.2%. This was slower than the 5.4% in the first quarter, but still better than experts expected. However, some signs of weakness are showing: exports are slowing down, consumer confidence is dropping, and prices are falling. The U.S. economy, meanwhile, grew by 3.0% in the same quarter, supported by strong consumer spending and a steady job market.

Since 2018, the U.S. has put tariffs on hundreds of billions of dollars' worth of Chinese goods. China hit back with its own tariffs, especially on American farm products. As a result, many companies started moving their factories to other countries such as Vietnam, India, and Mexico. China responded by boosting its own industries, especially in important areas like semiconductors and green energy, through a strategy called 'Made in China 2025.'

The trade war has affected many industries. In the U.S., companies that rely on Chinese parts, such as tech and car firms, faced higher costs. Farmers lost business because of China's tariffs, which forced the U.S. government to offer billions of dollars in financial help. China also suffered from weaker demand for its goods abroad, but tried to make up for it by building more infrastructure, cutting interest rates, and giving tax breaks to encourage innovation.

Now, both countries are trying to reduce their economic ties to each other. For example, China's share of U.S. imports dropped from 21% in 2017 to less than 14% by mid-2025. This shows that U.S. efforts to reduce dependence on China are working to some extent.

Looking to the future, there are three possible paths. First, tensions could rise, which would hurt both economies and the global market. Second, the two sides might reach short-term deals without fully solving the problem. Third, they could continue slowly separating their economies, which could reshape global trade and create new chances for other countries.

Right now, there's no clear winner. The U.S. has strong financial tools to handle economic pressure, and China is showing flexibility by adjusting its policies. But the conflict is expensive for both, and the effects are being felt around the world. This trade war has become a long-term strategic competition that will help decide who leads the global economy and technology in the years ahead.
