
Latest news with #QwenChat

Alibaba unveils new open source AI that creates images with perfect text

India Today

5 days ago

  • Business
  • India Today

Alibaba unveils new open source AI that creates images with perfect text

Alibaba has released a new open-source image generation model called Qwen-Image that sets itself apart by accurately rendering complex and multilingual text within images, a task where many other AI tools still struggle. Developed by Alibaba's Qwen Team, Qwen-Image is designed to handle everything from handwritten poetry and bilingual posters to e-commerce product labels and classroom diagrams, all while maintaining high-quality, readable text. The model supports both alphabetic scripts, like English, and logographic ones, like Chinese, making it especially useful in multilingual settings.

Users can try out Qwen-Image via the Qwen Chat website by switching to the "Image Generation" mode. The model has also been released under the Apache 2.0 licence, meaning businesses and developers can use, modify, and distribute it, even for commercial purposes, as long as they include the proper licence attribution.

The training data includes billions of image-text pairs sourced from natural scenes, human portraits, artistic posters, and synthetically generated text data. Interestingly, all the synthetic data used for training was generated in-house by Alibaba, and no AI-generated images from other models were included. This approach helped the model learn to handle rare or complex characters, especially in Chinese. The model was trained in stages, starting with simple captioned images and gradually moving to more complex layouts and dense multilingual text. This curriculum-style training, according to Alibaba, helped Qwen-Image generalise better across varied use cases.

Under the hood, Qwen-Image combines three main components:

  • Qwen2.5-VL, a multimodal language model for understanding context
  • A VAE encoder/decoder, optimised for high-resolution layouts
  • MMDiT, a diffusion model with a special encoding system for spatial alignment

These elements work together to produce images that are not only visually appealing but also accurate in terms of text placement and layout. Alibaba claims that Qwen-Image has been tested against several industry benchmarks for text clarity, layout precision, and prompt-following ability. On the AI Arena public leaderboard, which uses human evaluations to rank AI image models, Qwen-Image reportedly holds third place overall at the time of writing and is the highest-ranked open-source model.
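Because the weights are Apache 2.0 licensed, developers can pull the model directly rather than going through Qwen Chat. The snippet below is a minimal sketch of what that might look like with Hugging Face's diffusers library; the repo id "Qwen/Qwen-Image", the generic DiffusionPipeline loading path, and the generation arguments are assumptions, so treat the official model card as the source of truth.

```python
# Minimal sketch, NOT an official example: assumes Qwen-Image is published on
# Hugging Face under the repo id "Qwen/Qwen-Image" and loads through the
# generic diffusers DiffusionPipeline interface. Exact pipeline class and
# generation kwargs may differ; check the model card.
# Requires: pip install diffusers transformers accelerate torch
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "Qwen/Qwen-Image",           # assumed repo id
    torch_dtype=torch.bfloat16,  # reduced precision to fit on a single GPU
)
pipe.to("cuda")

# Prompts with embedded text are where the model claims to stand out.
prompt = 'A bilingual shop poster that reads "Grand Opening / 盛大开业" in bold lettering'
image = pipe(prompt=prompt, num_inference_steps=50).images[0]
image.save("qwen_image_poster.png")
```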

Alibaba Cloud Releases Qwen2.5-Omni-7B: An End-to-end Multimodal AI Model

Mid East Info

10-04-2025

  • Business
  • Mid East Info

Alibaba Cloud Releases Qwen2.5-Omni-7B: An End-to-end Multimodal AI Model

Alibaba Cloud has launched Qwen2.5-Omni-7B, a unified end-to-end multimodal model in the Qwen series. Designed for comprehensive multimodal perception, it can process diverse inputs, including text, images, audio, and video, while generating real-time text and natural speech responses. This sets a new standard for deployable multimodal AI on edge devices like mobile phones and laptops. Despite its compact 7B-parameter design, Qwen2.5-Omni-7B delivers uncompromised performance and powerful multimodal capabilities, making it a strong foundation for developing agile, cost-effective AI agents that deliver tangible value, especially intelligent voice applications. For example, the model could help visually impaired users navigate environments through real-time audio descriptions, offer step-by-step cooking guidance by analyzing ingredients in a video, or power intelligent customer-service dialogues that genuinely understand customer needs. The model is now open-sourced on Hugging Face and GitHub, with additional access via Qwen Chat and Alibaba Cloud's open-source community ModelScope. Over the past few years, Alibaba Cloud has open-sourced more than 200 generative AI models.

High performance driven by innovative architecture: Qwen2.5-Omni-7B delivers remarkable performance across all modalities, rivaling specialized single-modality models of comparable size. Notably, it sets a new benchmark in real-time voice interaction, natural and robust speech generation, and end-to-end speech instruction following. Its efficiency and high performance stem from its innovative architecture, including:

  • Thinker-Talker Architecture, which separates text generation (through the Thinker) and speech synthesis (through the Talker) to minimize interference among modalities and produce high-quality output;
  • TMRoPE (Time-aligned Multimodal RoPE), a position-embedding technique that better synchronizes video inputs with audio for coherent content generation; and
  • Block-wise Streaming Processing, which enables low-latency audio responses for seamless voice interactions.

Outstanding performance despite compact size: Qwen2.5-Omni-7B was pre-trained on a vast, diverse dataset, including image-text, video-text, video-audio, audio-text, and text data, ensuring robust performance across tasks. With this architecture and a high-quality pre-training dataset, the model excels at following voice commands, achieving performance levels comparable to pure text input. For tasks that involve integrating multiple modalities, such as those evaluated in OmniBench – a benchmark that assesses models' ability to recognize, interpret, and reason across visual, acoustic, and textual inputs – Qwen2.5-Omni achieves state-of-the-art performance. Qwen2.5-Omni-7B also demonstrates robust speech understanding and generation capabilities through in-context learning (ICL). Additionally, after reinforcement learning (RL) optimization, Qwen2.5-Omni-7B showed significant improvements in generation stability, with marked reductions in attention misalignment, pronunciation errors, and inappropriate pauses during speech responses.
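To make the Thinker-Talker split and block-wise streaming concrete, the sketch below shows the control flow in plain Python. It is a schematic illustration only, not Alibaba's implementation or the Hugging Face API: a stand-in "Thinker" emits text tokens and hidden states, and a stand-in "Talker" converts each small block of hidden states into an audio chunk, so speech can start playing before the full reply has been generated.

```python
# Schematic sketch of the Thinker-Talker idea with block-wise streaming.
# All functions here are stand-ins for illustration, not real model code.
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass
class ThinkerStep:
    token: str           # next text token of the reply
    hidden: List[float]  # hidden state the Talker conditions on
    is_last: bool        # True once the Thinker has finished the reply


def thinker(prompt: str) -> Iterator[ThinkerStep]:
    """Stand-in for the 'Thinker' (the multimodal language model)."""
    words = f"Echoing: {prompt}".split()
    for i, word in enumerate(words):
        yield ThinkerStep(token=word, hidden=[float(i)], is_last=(i == len(words) - 1))


def talker(hidden_block: List[List[float]]) -> bytes:
    """Stand-in for the 'Talker' (the speech decoder): one audio chunk per block."""
    return bytes([len(hidden_block)])  # placeholder audio payload


def stream_reply(prompt: str, block_size: int = 2) -> Iterator[Tuple[str, bytes]]:
    """Block-wise streaming: flush text and audio every `block_size` Thinker steps,
    so the listener hears speech before the full reply is complete."""
    text_buf: List[str] = []
    hidden_buf: List[List[float]] = []
    for step in thinker(prompt):
        text_buf.append(step.token)
        hidden_buf.append(step.hidden)
        if len(hidden_buf) == block_size or step.is_last:
            yield " ".join(text_buf), talker(hidden_buf)
            text_buf, hidden_buf = [], []


if __name__ == "__main__":
    for text_chunk, audio_chunk in stream_reply("how long should I boil pasta"):
        print(f"text: {text_chunk!r}  audio bytes: {len(audio_chunk)}")
```

The design point the sketch tries to capture is that the Talker never waits for the whole textual answer: it consumes the Thinker's hidden states in small blocks, which is what keeps voice latency low.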

Alibaba shares surge after it unveils reasoning model

Yahoo

06-03-2025

  • Business
  • Yahoo

Alibaba shares surge after it unveils reasoning model

By Che Pan and Brenda Goh
BEIJING (Reuters) - The Hong Kong-listed shares of Alibaba Group surged more than 8% on Thursday following the release of a new reasoning model that the company said was on par with global hit DeepSeek's R1. Qwen, the e-commerce leader's artificial intelligence unit, said on X that its QwQ-32B, with 32 billion parameters, can achieve performance comparable to DeepSeek's R1 model, which boasts 671 billion parameters. The announcement came a day after the Chinese government pledged increased support for industries including artificial intelligence, humanoid robots and 6G telecom, as AI models see increasing adoption by government agencies and smaller companies in China. Alibaba said the new model is accessible via its chatbot service, Qwen Chat, through which users can choose from various Qwen models, including Qwen2.5-Max, the most powerful language model in the Qwen series. The firm said QwQ-32B demonstrated capabilities in mathematical reasoning, coding and general problem-solving in benchmark tests, performing close to top models such as OpenAI's o1 mini and DeepSeek's R1. DeepSeek has emerged as the new poster child of China's AI prowess, rivaling top models from OpenAI at a fraction of the cost and with less powerful computing. Analysts said initiatives laid out by the government will expand usage of AI in China. "China is rapidly building an application-driven AI ecosystem that isn't just about research — it's about immediate, tangible economic impact," said Sun Wei, principal AI analyst at research company Counterpoint. Alibaba's shares surged 8% to HK$140.5 a share.
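Beyond Qwen Chat, QwQ-32B was also released with open weights on Hugging Face. The snippet below is a minimal sketch of how a developer might try it with the standard transformers chat workflow; the repo id "Qwen/QwQ-32B", the precision, and the generation settings are assumptions to verify against the model card, and a 32-billion-parameter model needs substantial GPU memory.

```python
# Minimal sketch, assuming the open weights live at "Qwen/QwQ-32B" and follow
# the standard transformers chat-template workflow; check the model card for
# the recommended sampling settings. device_map="auto" (requires accelerate)
# spreads the 32B model across available GPUs.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/QwQ-32B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "How many prime numbers are there below 30?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Reasoning models tend to emit a long chain of thought before the final answer,
# so allow a generous token budget.
output = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```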
