OpenAI GPT-OSS Models Optimized for NVIDIA RTX GPUs


Geeky Gadgets · 3 days ago
NVIDIA and OpenAI have collaborated to release the gpt-oss family of open-weight AI models, optimized for NVIDIA RTX GPUs. These models, gpt-oss-20b and gpt-oss-120b, bring advanced AI capabilities to consumer PCs and workstations, enabling faster and more efficient on-device AI performance.
OpenAI has unveiled its gpt-oss family of open-weight AI models, specifically optimized for NVIDIA RTX GPUs. These models, gpt-oss-20b and gpt-oss-120b, are designed to deliver advanced AI capabilities to both consumer-grade PCs and professional workstations. By using NVIDIA's GPU technology, the models provide faster on-device performance, enhanced efficiency, and greater accessibility for developers and AI enthusiasts. The models feature a cutting-edge architecture, extended context lengths, and support for a range of AI applications, and are accessible to developers and enthusiasts through tools like Ollama, llama.cpp, and Microsoft AI Foundry Local.

Key Highlights of GPT-OSS Models

Two Models, Tailored for Performance
The easiest way to test these models on RTX AI PCs, on GPUs with at least 24GB of VRAM, is the new Ollama app. Ollama is fully optimized for RTX, making it ideal for consumers looking to experience personal AI on their PC or workstation.

The gpt-oss family consists of two models, each tailored to specific hardware requirements and performance needs:

gpt-oss-20b: Designed for consumer-grade NVIDIA RTX GPUs with at least 16GB of VRAM, such as the RTX 5090. The model achieves processing speeds of up to 250 tokens per second, making it suitable for individual developers and small-scale projects.

gpt-oss-120b: Optimized for professional-grade RTX PRO GPUs, this model caters to enterprise and research environments requiring higher computational power and scalability.
Both models support extended context lengths of up to 131,072 tokens, allowing them to handle complex reasoning tasks and process large-scale documents. This capability is particularly advantageous for applications such as legal document analysis, academic research, and other tasks requiring long-form comprehension and detailed analysis.

Technological Innovations Driving Efficiency
The gpt-oss models incorporate several technological advancements that enhance their performance and functionality:

MXFP4 Precision: The gpt-oss models are the first to support this precision format on NVIDIA RTX GPUs. MXFP4 improves computational efficiency while maintaining output accuracy, reducing resource consumption without compromising performance.

Mixture-of-Experts (MoE) Architecture: This architecture activates only the components of the model needed for a given task, minimizing computational overhead while maintaining high performance. The design ensures efficient resource utilization, particularly for complex or specialized tasks.

Chain-of-Thought Reasoning: This feature enables the models to perform step-by-step logical analysis, improving their ability to follow instructions and solve intricate problems in real-world applications such as troubleshooting, decision-making, and problem-solving.
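The routing idea behind MoE can be sketched in a few lines. This is a simplified illustration with made-up toy experts and scores, not the actual gpt-oss routing code (real MoE layers use learned gating networks over vectors and route per token):

```python
import math

# Simplified Mixture-of-Experts routing: a gate scores every expert for
# the input, but only the top-k experts are actually run, and their
# outputs are combined by normalized gate weight.

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k=2):
    """Return (expert index, weight) pairs for the top-k scoring experts."""
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([gate_scores[i] for i in chosen])
    return list(zip(chosen, weights))

def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts; the rest cost nothing."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))

# Four toy "experts"; only two execute per input.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x / 2]
print(moe_forward(10.0, experts, gate_scores=[0.1, 2.0, 0.3, 1.5], k=2))  # ≈ 14.34
```

The point of the design shows up in `moe_forward`: however many experts exist, only `k` of them run, which is why a very large MoE model can answer with far less compute than a dense model of the same parameter count.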
These innovations collectively contribute to the models' ability to deliver fast, accurate results across a variety of use cases, making them versatile tools for developers and organizations alike.

Versatile Applications and Use Cases
The gpt-oss models are designed to support a wide range of applications and industries. Key use cases include:

Web Search and Information Retrieval: The models can process and summarize vast amounts of information, making them well suited to search engines and knowledge management systems.

Coding Assistance: Developers can use the models for code generation, debugging, and optimization, streamlining software development workflows.

Document Comprehension: With their extended context lengths, the models excel at analyzing lengthy documents, such as legal contracts, research papers, and technical manuals.

Multimodal Input Processing: The ability to handle both text and image inputs broadens their applicability, allowing tasks like image captioning, data analysis, and content generation.
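For the document-comprehension case, a quick pre-check of whether a long document is likely to fit the 131,072-token context window can be useful. A minimal sketch, assuming the common rule of thumb of roughly four characters per English token (actual counts depend on the tokenizer):

```python
# Rough fit check for a 131,072-token context window. The
# four-characters-per-token ratio is a heuristic assumption for English
# text, not an exact tokenizer count.

CONTEXT_WINDOW = 131_072
CHARS_PER_TOKEN = 4  # heuristic

def estimated_tokens(text: str) -> int:
    """Estimate token count from character length."""
    return len(text) // CHARS_PER_TOKEN

def fits_in_context(text: str, reserve_for_output: int = 2048) -> bool:
    """Check that the prompt leaves room for a reserved output budget."""
    return estimated_tokens(text) + reserve_for_output <= CONTEXT_WINDOW

print(fits_in_context("word " * 50_000))  # ~250,000 chars ≈ 62,500 tokens → True
```

For real workloads, replace the heuristic with the model's actual tokenizer before trusting the budget.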
The customizable context lengths allow users to tailor the models to specific requirements, whether summarizing extensive documents or generating detailed responses to complex queries. This adaptability makes the gpt-oss models suitable for both general-purpose use and specialized applications, from enterprise workflows to individual projects.

Developer Tools for Seamless Integration
To ease adoption and integration, OpenAI and NVIDIA have provided a comprehensive suite of developer tools. These resources simplify deployment and testing of the gpt-oss models, ensuring accessibility for developers of varying expertise levels. Key tools include:

Ollama App: An intuitive interface for running and testing the models on NVIDIA RTX GPUs, allowing quick experimentation and deployment.

llama.cpp Framework: An open-source framework that supports collaboration and optimization, allowing developers to fine-tune the models for specific hardware configurations.

Microsoft AI Foundry Local: A set of command-line tools and software development kits (SDKs) designed for Windows developers, allowing seamless integration into existing workflows.
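As an illustration of how little glue code local deployment needs, a locally running Ollama server exposes an HTTP API (on port 11434 by default) that can be called with nothing but the standard library. A hedged sketch: the `gpt-oss:20b` model tag and the `/api/generate` endpoint follow Ollama's documented conventions, but verify them against your installed version:

```python
import json
import urllib.request

def build_payload(model: str, prompt: str) -> dict:
    """Request body for Ollama's /api/generate endpoint.
    "stream": False asks for one JSON response instead of chunks."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str, model: str = "gpt-oss:20b",
             host: str = "http://localhost:11434") -> str:
    """Send a prompt to a locally running Ollama server and return the reply."""
    data = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{host}/api/generate", data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    # Requires the Ollama app running and the model pulled first
    # (e.g. `ollama pull gpt-oss:20b`).
    print(generate("Summarize MXFP4 precision in one sentence."))
```

The same payload shape works from any language with an HTTP client, which is what makes the local API convenient for integrating on-device models into existing tools.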
These tools empower developers to experiment with advanced AI solutions without requiring deep expertise in AI infrastructure, fostering innovation and accessibility.

NVIDIA's Role in Advancing AI
The gpt-oss models were trained on NVIDIA H100 GPUs, using NVIDIA's state-of-the-art AI training infrastructure. Once trained, the models are optimized for inference on NVIDIA RTX GPUs, showcasing NVIDIA's leadership in end-to-end AI technology. This approach ensures high-performance AI capabilities on both cloud-based and local devices, making advanced AI more accessible to a broader audience.
Additionally, the models use CUDA Graphs, a feature that minimizes computational overhead and enhances performance. This optimization is particularly valuable for real-time applications, where speed and efficiency are critical.

Open-Source Collaboration and Community Impact
The gpt-oss models are open-weight, allowing developers to customize and extend their capabilities. This openness encourages innovation and collaboration within the AI community, enabling the development of tailored solutions for specific use cases.
NVIDIA has also contributed to open-source frameworks such as GGML and llama.cpp, further enhancing the accessibility and performance of the gpt-oss models. These frameworks provide developers with the tools needed to optimize AI models for a variety of hardware configurations, from consumer-grade PCs to enterprise-level systems.

Empowering the Future of AI Development
The release of the gpt-oss models marks a pivotal moment in the evolution of AI technology. By harnessing the power of NVIDIA RTX GPUs, these models deliver strong performance, flexibility, and accessibility. Their open-weight nature, combined with robust developer tools, positions them as valuable assets for driving innovation across a wide range of applications. Whether for individual developers or large organizations, the gpt-oss models offer a practical and efficient path for advancing AI-driven projects.
Filed Under: AI, Technology News, Top News

Related Articles

White House crypto adviser Bo Hines announces departure

Reuters · 2 hours ago

WASHINGTON, Aug 9 (Reuters) - Bo Hines, who headed Republican President Donald Trump's Council of Advisers on Digital Assets, said on Saturday he was leaving his current role and returning to the private sector. Late last month, a cryptocurrency working group led by Hines and including several administration officials outlined the Trump administration's stance on market-defining crypto legislation and called on the U.S. securities regulator to create new rules specific to digital assets. Shortly after taking office in January, Trump had ordered the creation of the crypto working group and tasked it with proposing new regulations, making good on his campaign promise to overhaul U.S. crypto policy. "Serving in President Trump's administration and working alongside our brilliant AI & Crypto Czar @DavidSacks as Executive Director of the White House Crypto Council has been the honor of a lifetime," Hines said in a post on X on Saturday. Sacks, the White House AI czar, praised Hines in response to the post announcing his departure. Hines has twice unsuccessfully run for Congress in North Carolina. Trump last month signed a law to create a regulatory regime for dollar-pegged cryptocurrencies known as stablecoins, a milestone that could pave the way for the digital assets to become an everyday way to make payments and move money. Hines was a backer of that legislation, dubbed the GENIUS Act.

It shocked the US market but has China's DeepSeek changed AI?

BBC News · 3 hours ago

US President Donald Trump had been in office scarcely a week when a new Chinese artificial intelligence (AI) app called DeepSeek jolted Silicon Valley. DeepSeek-R1 shot to the top of the Apple charts as the most downloaded free app in the US. The firm said at the time its new chatbot rivalled ChatGPT. Not only that. They asserted it had cost a mere fraction to build. The claims – and the app's sudden surge in popularity – wiped $600bn (£446bn) or 17% off the market value of chip giant Nvidia, marking the largest one-day loss for a single stock in the history of the US stock market. Many other tech stocks with exposure to AI were caught in the downdraft, which also cast doubt on American AI dominance. Up until then, China had been seen as having fallen behind the US. Now, it seemed as though China had catapulted to the forefront. Venture capitalist Marc Andreessen referred to the arrival of DeepSeek-R1 as "AI's Sputnik moment," a reference to the Soviet satellite that had kicked off the space race between the US and the USSR more than a half century earlier.

Still relevant

It has now been six months since DeepSeek stunned the world, yet China's breakthrough app has largely dropped out of the headlines. It's no longer the hot topic at happy hour here in San Francisco.

But DeepSeek challenged certain key assumptions about AI that had been championed by American executives like Sam Altman, CEO of ChatGPT-maker OpenAI. "We were on a path where bigger was considered better," according to Sid Sheth, CEO of AI chip startup d-Matrix. Perhaps maxing out on data centres, servers, chips, and the electricity to run it all wasn't the way forward after DeepSeek. Despite DeepSeek ostensibly not having access to the most powerful tech available at the time, Sheth told the BBC, it showed that "with smarter engineering, you actually can build a capable model". The surge of interest in DeepSeek took hold over a weekend in late January, before corporate IT personnel could move to stop employees from flocking to it. When organisations caught on the following Monday, many scrambled to ban workers from using the app as worries set in about whether user data was potentially being shared with the People's Republic of China, where DeepSeek is based. And while exact numbers aren't available, plenty of Americans still use DeepSeek. Some Silicon Valley start-ups have opted to stick with DeepSeek in lieu of more expensive AI models from US firms in a bid to cut down on costs. One investor told me that for cash-strapped firms, funds saved by continuing to use DeepSeek are helping to pay for critical needs such as additional headcount. They are, however, being careful. In online forums, users explain how to run DeepSeek-R1 on their own devices rather than online using DeepSeek's servers in China - a workaround they believe can protect their data from being shared surreptitiously. "It's a good way to use the model without being concerned about what it's exfiltrating" to China, said Christopher Caen, CEO of Mill Pond Research.

US-China rivalry

DeepSeek's arrival also marked a turning point in the US-China AI rivalry, some experts say.

"China was seen as playing catch-up in large language models until this point, with competitive models but always trailing the best western ones," policy analyst Wendy Chang of the Mercator Institute for China Studies told the BBC. A large language model (LLM) is a reasoning system trained to predict the next word in a given sentence or phrase. DeepSeek changed perceptions when it claimed to have achieved a leading model for a fraction of the computational resources and costs common among its American rivals. OpenAI had spent $5bn (£3.7bn) in 2024 alone. By contrast, DeepSeek researchers said they had developed DeepSeek-R1 – which came out on top of OpenAI's o1 model across multiple benchmarks – for just $5.6m (£4.2m). "DeepSeek revealed the competitiveness of China's AI landscape to the world," Chang said. American AI developers have managed to capitalize on this shift. AI-related deals and other announcements trumpeted by the Trump administration and major American tech companies are often framed as critical to staying ahead of China. White House AI czar David Sacks noted the technology would have "profound ramifications for both the economy and national security" when the administration unveiled its AI Action Plan last month. "It's just very important that America continues to be the dominant power in AI," Sacks said. DeepSeek has never managed to quell concerns over the security implications of its Chinese origins. The US government has been assessing the company's links to Beijing, as first reported by Reuters in June. A senior US State Department official told the BBC they understood "DeepSeek has willingly provided, and will likely continue to provide, support to China's military and intelligence operations". DeepSeek did not respond to the BBC's request for comment but the company's privacy policy states that its servers are located in the People's Republic of China. "When you access our services, your Personal Data may be processed and stored in our servers in the People's Republic of China," the policy says. "This may be a direct provision of your Personal Data to us or a transfer that we or a third-party make."

A new approach?

Earlier this week, OpenAI reignited talk about DeepSeek after releasing a pair of AI models. They were the first free and open versions – meaning they can be downloaded and modified – released by the American AI giant in five years, since well before ChatGPT ushered in the consumer AI era. "You can draw a straight line from DeepSeek to what OpenAI announced this week," said d-Matrix's Sheth. "DeepSeek proved that smaller, more efficient models could still deliver impressive performance – and that changed the industry's mindset," Sheth told the BBC. "What we're seeing now is the next wave of that thinking: a shift toward right-sized models that are faster, cheaper, and ready to deploy at scale." But to others, for the major American players in AI, the old approach appears to be alive and well. Days after releasing the free models, OpenAI unveiled GPT-5. In the run-up, the company said it significantly ramped up its computing capacity and AI infrastructure. A slew of announcements about new data centre clusters needed for AI has come as American tech companies have been competing for top-tier AI talent. Meta CEO Mark Zuckerberg has ploughed billions of dollars into his AI ambitions, and tried to lure staff from rivals with $100m pay packages. The fortunes of the tech giants seemed more tethered than ever to their commitment to AI spending, as evidenced by the series of blowout results revealed this past tech earnings season. And shares of Nvidia, which plunged just after DeepSeek's arrival, have rebounded – touching new highs that have made it the world's most valuable company in history. "The initial narrative has proven a bit of a red herring," said Mill Pond Research's Caen. We are back to a future in which AI will ostensibly depend on more data centres, more chips, and more power.

In other words, DeepSeek's shake-up of the status quo hasn't lasted. But what about DeepSeek itself? "DeepSeek now faces challenges sustaining its momentum," said Marina Zhang, an associate professor at the University of Technology Sydney. That is due in part to operational setbacks but also to intense competition from companies in the US and China, she says. She notes that the company's next product, DeepSeek-R2, has reportedly been delayed. One reason? A shortage of high-end chips.

Marjorie Taylor Greene up 142% in stocks she bought days before ICE awarded company massive contract: 'Laughable'

The Independent · 4 hours ago

Rep. Marjorie Taylor Greene has seen her stock in Palantir Technologies surge 142 percent since she invested in April, just days before Immigration and Customs Enforcement handed the company a $30 million contract. The Georgia representative is a member of the House Homeland Security Committee which oversees ICE, and since her investment on April 8, the stock has rocketed, research platform Quiver Quantitative, which tracks politicians' investments, noted. 'Marjorie Taylor Greene bought stock in Palantir on April 8th,' the platform said in a post on X. 'We reported on this right away, because Greene sits on the House Committee on Homeland Security. $PLTR has now risen 142% since her purchase.' On April 11, the artificial intelligence software company was awarded a contract by ICE to support the Trump administration's sweeping anti-immigration agenda, including designing a system to track self-deportation and identify individuals for deportation. Greene has previously shrugged off criticism as 'laughable' and clarified that her financial adviser controls her investments. 'After many successful years of running my own business, I ran for Congress to bring that mindset to Washington. Now that I'm proudly serving the people of Northwest Georgia, I have signed a fiduciary agreement to allow my financial advisor to control my investments,' Greene said in a statement to Snopes when her Palantir investment came to light in May. 'All of my investments are reported with full transparency. I refuse to hide my stock trades in a blind trust like many others do,' she added. 'I learned about my Palantir trades when I saw it in the media.' Greene's comments are reminiscent of former House Speaker Nancy Pelosi, who, along with her husband, has long been dogged by allegations of insider trading, which she denies. There is overwhelming public support to ban the trading of stocks of individual companies by members of Congress. 
It emerged last month that White House deputy chief of staff Dan Scavino sold up to $5 million worth of Trump Media stock the day before the president's 'Liberation Day' tariffs were announced. Scavino sold stock worth between $1 million and $5 million on April 1, according to financial disclosure reports first obtained by USA Today. Trump Media is the parent company of Trump's TruthSocial social media platform. Scavino sold the day before the president officially announced reciprocal tariffs on U.S. trading partners. The announcement caused the markets to plummet and prompted Trump to put a 90-day pause on the tariffs on April 9, by which point the markets had slumped 12 percent. Stocks for Trump Media fell by about 11 percent. The White House said the sales had 'nothing to do with the tariff announcement' when approached by The Independent at the time.
