
Latest news with #languageModel

OpenAI's open language model is imminent

The Verge

09-07-2025


Microsoft's complicated relationship with OpenAI is about to take an interesting turn. As the pair continue to renegotiate a contract to allow OpenAI to restructure into a for-profit company, OpenAI is preparing to release an open language AI model that could drive even more of a wedge between the two companies.
Sources familiar with OpenAI's plans tell me that CEO Sam Altman's AI lab is readying an open-weight model that will debut as soon as next week, hosted by providers beyond OpenAI itself and Microsoft's Azure servers. OpenAI's models are typically closed-weight, meaning the weights (the numerical parameters a model learns during training) aren't available publicly.
The open nature of OpenAI's upcoming language model means companies and governments will be able to run the model themselves, much like how Microsoft and other cloud providers quickly onboarded DeepSeek's R1 model earlier this year. I understand this new open language model will be available on Azure, Hugging Face, and other large cloud providers. Sources describe the model as 'similar to o3 mini,' complete with the reasoning capabilities that have made OpenAI's latest models so powerful.
OpenAI has been demoing this open model to developers and researchers in recent months, and it has been openly soliciting feedback from the broader AI community. I reached out to OpenAI to comment on the imminent arrival of its open model, but the company did not respond in time for publication.
It will be the first open-weight model from OpenAI since the release of GPT-2 in 2019, and the first open language model from the company since it signed an exclusive cloud provider agreement with Microsoft in 2023. That deal means Microsoft has access to most of OpenAI's models, alongside exclusive rights to sell them directly to businesses through its own Azure OpenAI services. But with an open model, there's nothing to stop rival cloud operators from hosting a version of it.
As I revealed in Notepad last month, there's a complicated revenue-sharing relationship between Microsoft and OpenAI that involves Microsoft receiving 20 percent of the revenue that OpenAI earns for ChatGPT and the AI startup's API platform. Microsoft also shares 20 percent of its Azure OpenAI revenue directly with OpenAI.
This new open model from OpenAI will likely have an impact on Microsoft's own AI business. The open model could mean some Azure customers won't need pricier options, or they could even move to rival cloud providers.
Microsoft's lucrative exclusivity deal with OpenAI has already been tested in recent months. Microsoft 'evolved' its OpenAI deal earlier this year to allow the AI lab to get its own AI compute from rivals like Oracle. While that was limited to the servers used for building AI models, this new open model will extend far beyond the boundaries of ChatGPT and Azure OpenAI. Microsoft still has the right of first refusal to provide computing resources for OpenAI, but it has no control over an open language model.
OpenAI is preparing to announce the language model as an 'open model,' but that terminology, which often gets confused with open-source, is bound to generate a lot of debate around just how open it is. That will all come down to what license is attached to it and whether OpenAI is willing to provide full access to the model's code and training details, so that other researchers can fully replicate it.
Altman said in March that this open-weight language model would arrive 'in the coming months.' I understand it's now due next week, but OpenAI's release dates often change like the wind, in response to development challenges, server capacity, rival AI announcements, and even leaks. Still, I'd expect it to debut this month if all goes well.

Huawei's AI lab denies that one of its Pangu models copied Alibaba's Qwen

Yahoo

07-07-2025


BEIJING/SHANGHAI (Reuters) - Huawei's artificial intelligence research division has rejected claims that a version of its Pangu Pro large language model copied elements from an Alibaba model, saying that it was independently developed and trained.
The division, called Noah's Ark Lab, issued the statement on Saturday, a day after an entity called HonestAGI posted an English-language paper on code-sharing platform GitHub, saying Huawei's Pangu Pro MoE (Mixture of Experts) model showed "extraordinary correlation" with Alibaba's Qwen 2.5 14B. This suggests that Huawei's model was derived through "upcycling" and was not trained from scratch, the paper said, prompting widespread discussion in AI circles online and in Chinese tech-focused media. The paper added that its findings indicated potential copyright violation, the fabrication of information in technical reports and false claims about Huawei's investment in training the model.
Noah's Ark Lab said in its statement that the model was "not based on incremental training of other manufacturers' models" and that it had "made key innovations in architecture design and technical features." It is the first large-scale model built entirely on Huawei's Ascend chips, it added. It also said that its development team had strictly adhered to open-source license requirements for any third-party code used, without specifying which open-source models it had referenced.
Alibaba did not immediately respond to a Reuters request for comment. Reuters was unable to contact HonestAGI or learn who is behind the entity.
The release of Chinese startup DeepSeek's open-source model R1 in January this year shocked Silicon Valley with its low cost and sparked intense competition between China's tech giants to offer competitive products. Qwen 2.5-14B was released in May 2024 and is one of the smaller models in Alibaba's Qwen 2.5 family, which can be deployed on PCs and smartphones.
While Huawei entered the large language model arena early with its original Pangu release in 2021, it has since been perceived as lagging behind rivals. It open-sourced its Pangu Pro MoE models on Chinese developer platform GitCode in late June, seeking to boost the adoption of its AI tech by providing free access to developers. While Qwen is more consumer-facing and has chatbot services like ChatGPT, Huawei's Pangu models tend to be used more in government as well as the finance and manufacturing sectors.

A 'Sputnik' moment in the global AI race

Japan Times

03-07-2025


When Chinese AI startup DeepSeek unveiled the open-source large language model DeepSeek-R1 in January, many referred to it as the "AI Sputnik shock," a reference to the monumental significance of the Soviet Union's 1957 launch of the first satellite into orbit. Much remains uncertain about DeepSeek's LLM, and its capabilities should not be overestimated, but its release has nevertheless sparked intense discussion about its advantages, especially in terms of cost.
DeepSeek claims that its model possesses reasoning abilities on par with or even superior to OpenAI's leading models, with training costs of less than one-tenth of OpenAI's (reportedly just $5.6 million), largely due to the use of NVIDIA's lower-cost H800 GPUs rather than the more powerful H200 or H100 models. Tech giants like Meta and Google have spent billions of dollars on high-performance GPUs to develop cutting-edge AI models. However, DeepSeek's ability to produce a high-performance AI model at a significantly lower cost challenges the prevailing belief that computational power, determined by the number and quality of GPUs, is the primary driver of AI performance.

Microsoft Launches 'Mu,' New On-Device AI Model for Copilot+ PCs

Yahoo

27-06-2025


On June 23, Microsoft officially launched a new small language model called Mu. The model is designed to run efficiently and locally on personal computers, particularly the new Copilot+ PCs. Unlike larger AI models that rely on cloud processing, Mu operates entirely on a device's Neural Processing Unit (NPU), which enables rapid responses while consuming less power and memory.
Mu is a 330 million parameter encoder-decoder language model optimized for small-scale deployment on NPUs. Its design was tuned to fit the hardware's parallelism and memory limits. The model's development drew on insights gained from Microsoft's earlier Phi models, and it was pre-trained on hundreds of billions of high-quality educational tokens.
To enhance its performance despite having fewer parameters, Mu was fine-tuned using techniques such as distillation and low-rank adaptation, and it also incorporates transformer upgrades like Dual LayerNorm, Rotary Positional Embeddings (RoPE), and Grouped-Query Attention (GQA). Initially, the Mu model will be applied to the Settings function within Windows, using natural language processing to convert user inputs into system commands.
Microsoft Corporation (NASDAQ:MSFT) develops and supports global software, services, devices, and solutions.
Disclosure: None. This article was originally published at Insider Monkey.
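For readers curious why low-rank adaptation, one of the fine-tuning techniques named above, suits small models, the following is a minimal sketch of the parameter arithmetic behind it. Rather than updating a full weight matrix, low-rank adaptation trains two thin matrices whose product is added to the frozen weights. The layer dimensions and rank here are hypothetical values chosen for illustration; the article does not give Mu's actual configuration.

```python
def full_finetune_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole weight matrix W (d_out x d_in) is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for a low-rank update W + B @ A,
    where A has shape (rank x d_in) and B has shape (d_out x rank)."""
    return rank * d_in + d_out * rank


# Hypothetical layer size and rank, chosen only for illustration
# (not taken from Mu or from the article).
D_IN, D_OUT, RANK = 1024, 1024, 8

full = full_finetune_params(D_IN, D_OUT)
lora = lora_params(D_IN, D_OUT, RANK)

print(f"full fine-tuning:   {full:,} trainable parameters")
print(f"low-rank (r={RANK}):     {lora:,} trainable parameters")
print(f"fraction trained:   {lora / full:.2%}")
```

Under these assumed sizes, the low-rank update trains only a small fraction of the weights in the layer, which is the general reason the technique is attractive when memory is limited.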
