
Latest news with #Qwen3-235B-A22B

Alibaba launches Qwen3, open-source AI for global developers

Techday NZ

05-05-2025



Alibaba has introduced Qwen3, the latest generation of its open-sourced large language model series. The Qwen3 series includes six dense models and two Mixture-of-Experts (MoE) models, which aim to give developers the flexibility to build advanced applications across mobile devices, smart glasses, autonomous vehicles, and robotics. All models in the Qwen3 family, spanning dense models with 0.6 billion to 32 billion parameters and MoE models with 30 billion (3 billion active) and 235 billion (22 billion active) parameters, are now open-sourced and accessible globally.

Qwen3 is Alibaba's first release of hybrid reasoning models, which blend conventional large language model capabilities with more advanced and dynamic reasoning. Qwen3 can transition between "thinking mode" for complex multi-step tasks such as mathematics, coding, and logical deduction, and "non-thinking mode" for rapid, more general-purpose responses. For developers using the Qwen3 API, the model provides control over the duration of its "thinking mode," which can extend up to 38,000 tokens. This is intended to enable a tailored balance between intelligence and computational efficiency. The Qwen3-235B-A22B MoE model is designed to lower deployment costs compared to other models in its class.

Qwen3 has been trained on a dataset comprising 36 trillion tokens, double the size of the dataset used to train its predecessor, Qwen2.5. Alibaba reports that this expanded training has improved reasoning, instruction following, tool use, and performance on multilingual tasks. Among Qwen3's features is support for 119 languages and dialects; the model is said to deliver high performance in translation and multilingual instruction-following. Advanced agent integration is supported through native compatibility with the Model Context Protocol (MCP) and robust function-calling capabilities. These features place Qwen3 among the leading open-source models targeting complex agent-based tasks.
Regarding benchmarking, Alibaba states that Qwen3 surpasses previous Qwen models, including QwQ in thinking mode and Qwen2.5 in non-thinking mode, on mathematics, coding, and logical reasoning tests. The model also aims to provide more natural experiences in creative writing, role-playing, and multi-turn dialogue, supporting more engaging conversations. Alibaba reports strong performance by Qwen3 models across several benchmarks, including AIME25 for mathematical reasoning, LiveCodeBench for coding proficiency, BFCL for tool and function-calling, and Arena-Hard for instruction-tuned large language models.

The development of Qwen3's hybrid reasoning capacity involved a four-stage training process: long chain-of-thought cold start, reasoning-based reinforcement learning, thinking mode fusion, and general reinforcement learning.

Qwen3 models are now freely available on digital platforms including Hugging Face, GitHub, and ModelScope. An API is scheduled for release via Alibaba's Model Studio, the company's development platform for AI models. Qwen3 is also integrated into Quark, Alibaba's AI super assistant application. The Qwen model family has attracted over 300 million downloads globally, and developers have produced over 100,000 derivative models based on Qwen on Hugging Face, which Alibaba claims ranks the series among the most widely adopted open-source AI models worldwide.
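To make the "thinking budget" idea concrete, here is a purely illustrative sketch of how a caller might route requests between the two modes under a capped reasoning budget. This is not Alibaba's actual API; every name here (`choose_mode`, `thinking_budget`, the task labels) is an assumption made up for illustration, and only the 38,000-token ceiling comes from the article.

```python
# Conceptual sketch only: NOT the real Qwen3 API. Function and parameter
# names are invented for illustration. The 38,000-token cap is the limit
# the article reports for Qwen3's "thinking mode" via the API.

MAX_THINKING_BUDGET = 38_000


def choose_mode(task_type: str, thinking_budget: int) -> dict:
    """Pick a generation mode and clamp the reasoning-token budget."""
    complex_tasks = {"math", "coding", "logic"}
    if task_type in complex_tasks and thinking_budget > 0:
        # Complex multi-step work: spend reasoning tokens, up to the cap.
        return {
            "mode": "thinking",
            "budget": min(thinking_budget, MAX_THINKING_BUDGET),
        }
    # General-purpose queries: fast path, no reasoning tokens spent.
    return {"mode": "non-thinking", "budget": 0}


print(choose_mode("math", 50_000))      # budget clamped to 38000
print(choose_mode("chitchat", 50_000))  # non-thinking, budget 0
```

The point of such a switch, per the article, is letting the caller trade latency and cost against reasoning depth per request rather than per model.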

Alibaba Introduces Qwen3, Setting New Benchmark in Open-Source AI with Hybrid Reasoning

Mid East Info

30-04-2025



April 2025 – Alibaba has launched Qwen3, the latest generation of its open-sourced large language model (LLM) family, setting a new benchmark for AI innovation. The Qwen3 series features six dense models and two Mixture-of-Experts (MoE) models, offering developers flexibility to build next-generation applications across mobile devices, smart glasses, autonomous vehicles, robotics and beyond. All Qwen3 models, including dense models (0.6B, 1.7B, 4B, 8B, 14B, and 32B parameters) and MoE models (30B with 3B active, and 235B with 22B active), are now open sourced and available globally.

Hybrid Reasoning Combining Thinking and Non-thinking Modes

Qwen3 marks Alibaba's debut of hybrid reasoning models, combining traditional LLM capabilities with advanced, dynamic reasoning. Qwen3 models can seamlessly switch between thinking mode, for complex, multi-step tasks such as mathematics, coding, and logical deduction, and non-thinking mode, for fast, general-purpose responses. For developers accessing Qwen3 through the API, the model offers granular control over thinking duration (up to 38K tokens), enabling an optimized balance between intelligent performance and compute efficiency. Notably, the Qwen3-235B-A22B MoE model significantly lowers deployment costs compared to other state-of-the-art models, reinforcing Alibaba's commitment to accessible, high-performance AI.

Breakthroughs in Multilingual Skills, Agent Capabilities, Reasoning and Human Alignment

Trained on a massive dataset of 36 trillion tokens, double that of its predecessor Qwen2.5, Qwen3 delivers significant advances in reasoning, instruction following, tool use and multilingual tasks. Key capabilities include:

  • Multilingual Mastery: Supports 119 languages and dialects, with leading performance in translation and multilingual instruction-following.
  • Advanced Agent Integration: Natively supports the Model Context Protocol (MCP) and robust function-calling, leading open-source models in complex agent-based tasks.
  • Superior Reasoning: Surpasses previous Qwen models (QwQ in thinking mode and Qwen2.5 in non-thinking mode) in mathematics, coding, and logical reasoning benchmarks.
  • Enhanced Human Alignment: Delivers more natural creative writing, role-playing, and multi-turn dialogue for more engaging conversations.

Qwen3 Models Achieve Top-tier Results Across Industry Benchmarks

Thanks to advancements in model architecture, an increase in training data, and more effective training methods, Qwen3 models achieve top-tier results across industry benchmarks such as AIME25 (mathematical reasoning), LiveCodeBench (coding proficiency), BFCL (tool and function-calling capabilities), and Arena-Hard (a benchmark for instruction-tuned LLMs). Additionally, to develop the hybrid reasoning model, a four-stage training process was implemented: long chain-of-thought (CoT) cold start, reasoning-based reinforcement learning (RL), thinking mode fusion, and general RL.

Open Access to Drive Innovation

Qwen3 models are now freely available for download on Hugging Face, GitHub, and ModelScope. API access will soon be available through Model Studio, Alibaba's AI model development platform. Qwen3 also powers Quark, Alibaba's flagship AI super assistant application. Since its debut, the Qwen model family has attracted over 300 million downloads worldwide. Developers have created more than 100,000 Qwen-based derivative models on Hugging Face, making Qwen one of the world's most widely adopted open-source AI model series.

About Alibaba Group

Alibaba Group's mission is to make it easy to do business anywhere. The company aims to build the future infrastructure of commerce. It envisions that its customers will meet, work and live at Alibaba, and that it will be a good company that lasts for 102 years.

Alibaba's Qwen3 Challenges AI Giants with Hybrid Reasoning Breakthrough

Arabian Post

30-04-2025



Alibaba has unveiled Qwen3, a new family of open-source large language models that the company describes as a significant milestone on the journey toward artificial general intelligence and artificial superintelligence. The flagship model, Qwen3-235B-A22B, has demonstrated superior performance in several benchmarks, surpassing models from OpenAI and DeepSeek.

The Qwen3 series comprises six dense models and two Mixture-of-Experts models, ranging from 0.6 billion to 235 billion parameters. Notably, the Qwen3-235B-A22B model, with 22 billion active parameters, has achieved leading scores in various evaluations. On the Codeforces platform, it attained an Elo rating of 2056, outperforming DeepSeek-R1 and Google's Gemini 2.5 Pro. In the AIME'24 and AIME'25 math benchmarks, it scored 85.7 and 81.4, respectively, indicating strong mathematical reasoning capabilities.

A distinctive feature of Qwen3 is its hybrid reasoning capability, allowing dynamic switching between 'thinking' and 'non-thinking' modes. The 'thinking' mode is designed for complex, multi-step tasks such as mathematics, coding, and logical deduction, while the 'non-thinking' mode facilitates fast, general-purpose responses. This dual-mode operation aims to optimize performance across a broad range of applications.

The models have been trained on 36 trillion tokens across 119 languages and dialects, enhancing their multilingual capabilities. This extensive training dataset contributes to Qwen3's proficiency in handling diverse linguistic contexts, making it a versatile tool for global applications.

In terms of deployment efficiency, Qwen3-235B-A22B offers significant advantages. Despite its large parameter count, the model activates only 22 billion parameters during inference, reducing computational requirements and associated costs. This efficiency is particularly beneficial for developers seeking high-performance models without prohibitive resource demands.
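The cost advantage of sparse activation can be made concrete with a back-of-the-envelope calculation: per-token compute roughly tracks the 22 billion active parameters rather than the full 235 billion. This is a simplification (real serving costs also depend on memory footprint, batching, and hardware), but it shows the order of magnitude involved:

```python
# Back-of-the-envelope: fraction of parameters active per token in the
# Qwen3-235B-A22B MoE model, versus a hypothetical dense model of the
# same total size. Parameter counts are the ones reported in the article;
# the equal-compute comparison is a simplifying assumption.
total_params = 235e9   # total parameters
active_params = 22e9   # parameters activated per token

active_fraction = active_params / total_params
dense_cost_multiple = total_params / active_params

print(f"active fraction per token: {active_fraction:.1%}")              # ~9.4%
print(f"dense model of equal size: ~{dense_cost_multiple:.1f}x compute")  # ~10.7x
```

In other words, under this rough model, only about one parameter in ten participates in each forward pass, which is where the claimed deployment savings come from.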
The Qwen3 series is available under the Apache 2.0 license, promoting accessibility and collaboration within the AI research community. Developers can access the models through platforms such as Hugging Face and GitHub, facilitating integration into various applications, including mobile devices, smart glasses, autonomous vehicles, and robotics. Alibaba's release of Qwen3 represents a strategic move to position itself as a leader in the AI domain, challenging established players like OpenAI and Google. The company's focus on hybrid reasoning and multilingual support reflects a commitment to advancing AI capabilities in a manner that is both innovative and inclusive.

Alibaba Launches Qwen3 AI Models, Taking Bold Aim at ChatGPT and Gemini

Hans India

29-04-2025



Alibaba has made another major move in the AI race by launching a new series of large language models called Qwen3, designed to compete directly with OpenAI's ChatGPT and Google's Gemini. The announcement came on Monday through a detailed post on X, where the Chinese tech giant outlined the capabilities of its latest AI models.

'We are excited to announce the release of Qwen3, the latest addition to the Qwen family of large language models,' Alibaba stated in an official blog post. The flagship model, Qwen3-235B-A22B, reportedly delivers impressive results in areas such as mathematics, coding, and general reasoning. According to Alibaba, its performance rivals or even surpasses top-tier models like DeepSeek-R1, o1, o3-mini, Grok-3, and Gemini-2.5-Pro.

'Introducing Qwen3! We release and open-weight Qwen3, our latest large language models, including 2 MoE models and 6 dense models, ranging from 0.6B to 235B. Our flagship model, Qwen3-235B-A22B, achieves competitive results in benchmark evaluations of coding, math, general…' — Qwen (@Alibaba_Qwen) April 28, 2025

One standout feature of Qwen3 is its multilingual support. The models are equipped to understand and generate content in 119 languages, including Indian languages like Hindi, Gujarati, Marathi, Punjabi, Bengali, Sindhi, and even region-specific dialects such as Chhattisgarhi, Maithili, and Awadhi.

The Qwen3 family comprises eight models, with sizes ranging from 0.6 billion to 235 billion parameters. This includes both dense models and Mixture of Experts (MoE) architectures, providing options that cater to different performance needs and computational constraints.

Alibaba highlights the power of its top-tier models: 'The small MoE model, Qwen3-30B-A3B, outcompetes QwQ-32B with 10 times of activated parameters, and even a tiny model like Qwen3-4B can rival the performance of Qwen2.5-72B-Instruct,' the company noted.

'We also evaluated the preliminary performance of Qwen3-235B-A22B on the open-source coding agent Openhands. It achieved 34.4% on Swebench-verified, achieving competitive results with fewer parameters! Thanks to @allhands_ai for providing an easy-to-use agent. Both open models and…' — Qwen (@Alibaba_Qwen) April 29, 2025

To promote open research and development, Alibaba is releasing open-weight versions of both large and compact models. These include the 235B and 30B parameter MoE models, along with six dense models under the Apache 2.0 license: Qwen3-32B, 14B, 8B, 4B, 1.7B, and 0.6B.

The models are readily available on platforms like Hugging Face, ModelScope, and Kaggle, with both pre-trained and post-trained variants. For deployment, Alibaba recommends tools like SGLang and vLLM, while local setups can use Ollama, LMStudio, MLX, and KTransformers.

'Qwen3 exhibits scalable and smooth performance improvements that are directly correlated with the computational reasoning budget allocated. This design enables users to configure task-specific budgets with greater ease, achieving a more optimal balance between cost efficiency and…' — Qwen (@Alibaba_Qwen) April 28, 2025

One of the most innovative aspects of Qwen3 is its scalable performance architecture. Users can customise the AI's output quality based on available compute resources, striking a balance between speed, cost, and depth of understanding. This is particularly useful for coding and complex, multi-step reasoning tasks.

Alibaba launches Qwen3 AI, again challenges ChatGPT and Google Gemini

India Today

29-04-2025



Alibaba, the Chinese tech company behind AliExpress, announced on Monday that it has launched a new family of AI models called Qwen3, which in some cases outperforms OpenAI's ChatGPT and Google's Gemini AI models. The company shared a long post on X revealing its new AI models. 'We are excited to announce the release of Qwen3, the latest addition to the Qwen family of large language models. Our flagship model, Qwen3-235B-A22B, achieves competitive results in benchmark evaluations of coding, math, general capabilities, etc., when compared to other top-tier models such as DeepSeek-R1, o1, o3-mini, Grok-3, and Gemini-2.5-Pro,' the company wrote in an official blog post.

Alibaba says the Qwen3 AI models support 119 languages, including Hindi, Gujarati, Marathi, Chhattisgarhi, Awadhi, Maithili, Bhojpuri, Sindhi, Punjabi, Bengali, Oriya, Magahi and Urdu.

Qwen3 features eight models ranging from 0.6B to 235B parameters. These include both dense and Mixture of Experts (MoE) architectures, designed to cater to various performance and efficiency needs. The top-performing model, Qwen3-235B-A22B, according to Alibaba, delivers strong results across key benchmarks such as math, coding, and general reasoning. 'The small MoE model, Qwen3-30B-A3B, outcompetes QwQ-32B with 10 times of activated parameters, and even a tiny model like Qwen3-4B can rival the performance of Qwen2.5-72B-Instruct,' the company claims.

[Screenshot from the Qwen blog post: the compact Qwen3-4B rivals the much larger Qwen2.5-72B-Instruct.]

'We are open-weighting two MoE models: Qwen3-235B-A22B, a large model with 235 billion total parameters and 22 billion activated parameters, and Qwen3-30B-A3B, a smaller MoE model with 30 billion total parameters and 3 billion activated parameters. Additionally, six dense models are also open-weighted, including Qwen3-32B, Qwen3-14B, Qwen3-8B, Qwen3-4B, Qwen3-1.7B, and Qwen3-0.6B, under the Apache 2.0 license,' the company writes in its blog post.

The models are available on Hugging Face, ModelScope, and Kaggle, with both pre-trained and post-trained versions (e.g., Qwen3-30B-A3B and its base variant). For deployment, Alibaba recommends SGLang and vLLM, while local use is supported through tools like Ollama, LMStudio, MLX, and KTransformers.

Alibaba says the Qwen3 models offer scalable performance, which means that they can adjust the quality of their responses based on the compute budget, in turn enabling an optimal balance between speed, cost and capability. They're especially well-suited for coding tasks and agent-based interactions, with improved multi-step reasoning.

Alibaba says the Qwen3 models also come with something called hybrid thinking. There is a thinking mode, which processes information step-by-step, taking the time to deliberate before delivering a final answer. Then there is a non-thinking mode, which allows the model to generate immediate responses, prioritising speed over depth. This dual-mode system gives users control over the depth of computation depending on the task. 'This flexibility allows users to control how much "thinking" the model performs based on the task at hand,' says Alibaba. 'This design enables users to configure task-specific budgets with greater ease, achieving a more optimal balance between cost efficiency and inference quality.'
