
Latest news with #inference

SambaNova launches its AI Platform in AWS Marketplace

Zawya

3 days ago

  • Business
  • Zawya

SambaNova launches its AI Platform in AWS Marketplace

Dubai, United Arab Emirates — SambaNova, the AI inference company delivering fast, efficient AI chips and high-performance models, today announced that its AI platform is now available in AWS Marketplace, a digital catalog that helps organizations find, buy, deploy, and manage software, data products, and professional services from thousands of vendors. This availability allows organizations to seamlessly purchase and deploy SambaNova's fast inference services alongside their existing infrastructure in AWS, and it marks a significant milestone in SambaNova's mission to make private, production-grade AI more accessible to enterprises by removing traditional barriers like vendor onboarding and procurement delays. By leveraging existing AWS relationships, organizations can now begin using SambaNova's advanced inference solutions with a few simple clicks, accelerating time to value while maintaining trusted billing and infrastructure practices.

'Enterprises face significant pressure to move rapidly from AI experimentation to full-scale production, yet procurement and integration challenges often stand in the way,' said Rodrigo Liang, CEO and co-founder of SambaNova. 'By offering SambaNova's platform in AWS Marketplace, we remove those obstacles, enabling organizations to access our industry-leading inference solutions instantly, using the procurement processes and cloud environment they already trust.'

Accelerating Access to High-Performance Inference

SambaNova's listing in AWS Marketplace gives customers the ability to:

  • Procure through existing AWS billing arrangements, with no new vendor setup required.
  • Leverage SambaNova's fast, efficient inference performance, running open-source models like Llama 4 Maverick and DeepSeek R1 671B.
  • Engage securely via private connectivity through AWS PrivateLink, for low-latency, secure integration between AWS workloads and SambaNova Cloud.
'With the SambaNova platform running in AWS Marketplace, organizations gain access to secure, high-speed inference from the largest open-source models. Solutions like this will help businesses move from experimentation to full production with AI,' said Michele Rosen, Research Manager, Open GenAI, LLMs, and the Evolving Open Source, IDC.

This tight integration enables customers to deploy high-performance, multi-tenant inference solutions without the need to purchase or manage custom hardware, expanding SambaNova's reach into enterprise environments where time-to-value and IT friction have historically limited adoption.

Making High-Performance Inference More Accessible

With this listing in AWS Marketplace, SambaNova is meeting enterprise customers where they already are: within their trusted cloud environments and procurement frameworks. By removing onboarding friction and offering seamless integration, SambaNova makes it easier than ever for organizations to evaluate, deploy, and scale high-performance inference solutions. 'This makes it dramatically easier for customers to start using SambaNova: no new contracts, no long onboarding, just click and go,' said Liang.

Availability

SambaNova's inference platform is available immediately in AWS Marketplace. Enterprise customers can visit the SambaNova listing in AWS Marketplace to get started.

About SambaNova

Customers turn to SambaNova to quickly deploy state-of-the-art generative AI capabilities within the enterprise. Our purpose-built, enterprise-scale AI platform is the technology backbone for the next generation of AI computing. Headquartered in Palo Alto, California, SambaNova Systems was founded in 2017 by industry luminaries and hardware and software design experts from Sun/Oracle and Stanford University. Investors include SoftBank Vision Fund 2, funds and accounts managed by BlackRock, Intel Capital, GV, Walden International, Temasek, GIC, Redline Capital, Atlantic Bridge Ventures, Celesta, and several others.
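As a practical footnote to the listing above: SambaNova Cloud exposes an OpenAI-style chat-completions API, so a first integration test is essentially one HTTPS POST. The sketch below only constructs the request so it can be inspected without sending anything; the endpoint URL, model identifier, and API key are illustrative assumptions rather than values from this announcement (check the AWS Marketplace listing for the real ones).

```python
import json
import urllib.request

def build_chat_request(base_url: str, api_key: str, model: str, prompt: str):
    """Construct an OpenAI-style chat-completions request (no network I/O)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Hypothetical endpoint, model name, and key -- substitute your own.
req = build_chat_request(
    "https://api.sambanova.ai/v1",
    "YOUR_API_KEY",
    "Llama-4-Maverick",
    "Summarize AWS PrivateLink in one sentence.",
)
print(req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

Routing the same call over an AWS PrivateLink endpoint would only change the hostname; the request shape stays the same.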

World's Largest Chip Sets AI Speed Record, Beating NVIDIA

Forbes

4 days ago

  • Business
  • Forbes

World's Largest Chip Sets AI Speed Record, Beating NVIDIA

Today I held the world's largest computer chip in my hands. And while its size is impressive, its speed is much more impressive, and of course much more important.

Most computer chips are tiny, the size of a postage stamp or smaller. By comparison, the Cerebras WSE (Wafer Scale Engine) is a massive square, 8.5 inches or 22 centimeters on each side, and the latest model boasts a staggering four trillion transistors on a single chip. All those transistors let the WSE set a world speed record for AI inference operations: about 2.5 times faster than a roughly equivalent NVIDIA cluster. 'It's the fastest inference in the world,' Cerebras chief information security officer Naor Penso told me today at Web Summit in Vancouver. 'Last week NVIDIA announced hitting 1,000 tokens per second on Llama 4, which is impressive. We just released a benchmark today of 2,500 tokens per second.'

In case all this is Greek to you, think of 'inference' as thinking or acting: building sentences, images, or videos in response to your inputs, or prompts. Think of 'tokens' as basic units of thought: a word, character, or symbol. The more tokens an AI engine can process per second, the faster it can get you results.

And speed matters. Maybe not so much for you, but when enterprise clients want to add an AI engine to a grocery shopping cart so they can tell you that just one more ingredient will give you everything you need for Korean-Style BBQ Beef Tacos, they want to be able to do so instantly for potentially thousands of people.

Interestingly, speed is about to get even more critical. We're entering an agentic age, where we have AIs that can perform complex multi-step projects for us, like planning and booking a weekend trip to Austin for a Formula 1 race. Agents aren't magic: they eat an elephant the exact same way you would … one bite at a time. That means exploding a big overall task into 40, 50, or 100 sub-tasks. Which means much more work.
'AI agents require way more jobs, and the various jobs need to communicate with each other,' Penso told me. 'You can't have slow inference.'

The WSE's four trillion transistors are part of what enables that speed. For comparison, the Intel Core i9 has just 33.5 billion transistors, and an Apple M2 Max chip offers just 67 billion. But it's more than sheer transistor count that builds a compute speed demon. It's also co-location: putting everything together on one chip, along with 44 gigabytes of the fastest RAM (memory) available. 'AI compute likes a lot of memory,' Penso says. 'NVIDIA needs to go off-chip, but with Cerebras, you don't need to go off-chip.'

Independent agency Artificial Analysis corroborates the speed claims, saying they've tested the chip on Llama 4 and achieved 2,522 tokens per second, compared to NVIDIA Blackwell's 1,038 tokens per second. 'We've tested dozens of vendors, and Cerebras is the only inference solution that outperforms Blackwell for Meta's flagship model,' says Artificial Analysis CEO Micah Hill-Smith.

The WSE chip is an interesting evolution in computer chip design. While we've been making integrated circuits since the 1950s and microprocessors since the 1960s, the CPU was the dominant force in computing for decades. Relatively recently, the GPU, or graphical processing unit, shifted from being an aide for graphics and games to being the critical processing component of choice for AI development. The WSE is not an x86 or ARM architecture but something entirely new, Cerebras chief marketing officer Julie Shin told me. 'This is not an incremental technology,' she added. 'This is another leapfrog moment for chips.'
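The throughput figures above translate directly into user-visible latency. A back-of-the-envelope check, using the 2,522 and 1,038 tokens-per-second numbers reported by Artificial Analysis and an assumed 500-token answer length:

```python
# Time to generate a 500-token answer at each reported throughput.
cerebras_tps = 2522    # tokens/sec, Cerebras WSE on Llama 4 (Artificial Analysis)
blackwell_tps = 1038   # tokens/sec, NVIDIA Blackwell on Llama 4
answer_tokens = 500    # a typical medium-length response (assumed)

t_cerebras = answer_tokens / cerebras_tps    # ~0.20 s
t_blackwell = answer_tokens / blackwell_tps  # ~0.48 s
speedup = cerebras_tps / blackwell_tps       # ~2.43x

print(f"Cerebras: {t_cerebras:.2f}s  Blackwell: {t_blackwell:.2f}s  "
      f"speedup: {speedup:.2f}x")
```

For a 50-step agent pipeline at these rates, the gap compounds: roughly 10 seconds of pure generation time versus about 24, which is the point Penso is making about agents.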

Chalk Raises $50M Series A to Power AI Inference

Globe and Mail

4 days ago

  • Business
  • Globe and Mail

Chalk Raises $50M Series A to Power AI Inference

Chalk, the data platform for AI inference, announced today that it has raised a $50 million Series A at a $500 million valuation. The round was led by Felicis with participation from Triatomic Capital and existing investors General Catalyst, Unusual Ventures, and Xfund. Aydin Senkut, Founder and Managing Partner at Felicis, will join Chalk's board. The capital will be used to accelerate development of Chalk's platform, onboard new customers, and grow its engineering and go-to-market hubs in San Francisco and New York.

As AI adoption accelerates, compute is shifting from training to inference to improve predictions, transform customer experiences, and reduce costs. Existing solutions like Databricks and Snowflake solve training data pipelines, and feature stores provide low-latency access to pre-computed data. But these incumbents don't provide a solution for applications that require fresh data, with complex computation, at inference time. Chalk fills a critical gap in the market: inference data pipelines. Chalk's real-time data platform enables customers to make predictions with fresh data at inference time to prevent identity theft, issue instant loans, increase clean energy efficiency, and moderate harmful content.

Senkut shared, 'Chalk is poised to become the Databricks of the AI era. It's one of the fastest-growing data companies we've ever seen. The team has fundamentally redefined how data moves through the AI stack, a crucial advancement for chain-of-reasoning models. What's even more remarkable is Chalk's ability to deliver 5-millisecond data pipelines at massive scale, something that, until now, was considered out of reach. We couldn't be more excited to partner with Marc, Elliot, and Andy, who are all repeat technical founders passionate about building infrastructure that delivers an incredible developer experience.'
Marc Freed-Finnegan, Chalk Co-Founder and CEO, added, 'We feel incredibly fortunate to have Aydin and Felicis as our partners for the next phase of our growth. We have a shared vision of the future, and we're honored to be part of the cohort of companies they have invested in.'

Chalk powers real-time ML across industries including fintech, identity, healthcare, and e-commerce. Companies like Whatnot, Found, Medely, and Iwoca use Chalk as a core infrastructure layer across their business. 'Chalk helps us deliver financial products that are more responsive, more personalized, and more secure for millions of users. It's a direct line from infrastructure to impact,' said Meng Xin Loh, Senior Technical Product Manager, MoneyLion.

Chalk has become critical infrastructure for its customers by enabling teams to rapidly operationalize machine learning and AI. At its core, Chalk's Compute Engine empowers teams to write features in pure Python, automatically translating them into high-performance C++ and Rust pipelines to deliver real-time data without complex ETL. Additionally, Chalk's LLM Toolchain unifies structured and unstructured data, offering native vector storage, automated evaluations, and seamless integrations with major LLM providers.

Rahul Madduluri, CTO at Doppel, said, 'Chalk powers our LLM pipeline, turning complex inputs (HTML, URLs, screenshots) into structured, auditable features. It lets us serve lightweight heuristics up front and rich LLM reasoning deeper in the stack, so we detect threats others miss without compromising speed or precision.'

Chalk was co-founded by Freed-Finnegan, Elliot Marx, and Andrew Moreland, veterans of fintech and data infrastructure. After meeting at Stanford, Marx and Moreland solved large-scale data problems at Affirm and Palantir before co-founding Haven Money, acquired by Credit Karma. Before Chalk, Freed-Finnegan helped launch Google Wallet and started Index, acquired by Stripe (it's now called Stripe Terminal).
Across these ventures, the team saw how real-time data pipelines enabled entirely new product categories and business models. Fast forward to today: real-time decisions at inference are essential for all modern applications, and Chalk makes that possible.

About Chalk

Chalk is the data platform for inference, providing critical infrastructure that empowers teams to rapidly operationalize machine learning and AI. The developer-friendly platform consists of a Compute Engine that automatically compiles features into high-performance Rust pipelines without complex ETL, and an LLM Toolchain that seamlessly unifies structured and unstructured data. Chalk powers real-time, low-latency machine learning for the world's leading companies, enabling instant loans, fraud prevention, personalized recommendations, and even clean energy optimization. Founded in 2022 and headquartered in San Francisco, Chalk has raised over $60M from Felicis, General Catalyst, Triatomic Capital, Unusual Ventures, and Xfund.
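The "fresh data at inference time" idea the release keeps returning to can be sketched in a few lines. This is a generic illustration, not Chalk's actual API: a pre-computed feature-store snapshot (the kind a nightly batch job produces) is merged with events that arrived after the last batch run, so the model sees up-to-date features at request time.

```python
# Nightly batch output: a stale feature snapshot per user.
feature_store = {"user_42": {"txn_count_7d": 18}}

# Events that arrived after the last batch run.
recent_events = {"user_42": [{"amount": 250.0}, {"amount": 75.0}]}

def features_at_inference(user_id: str) -> dict:
    """Merge the stale snapshot with fresh events at request time."""
    feats = dict(feature_store.get(user_id, {}))
    events = recent_events.get(user_id, [])
    feats["txn_count_7d"] = feats.get("txn_count_7d", 0) + len(events)
    feats["recent_spend"] = sum(e["amount"] for e in events)
    return feats

feats = features_at_inference("user_42")
print(feats)  # {'txn_count_7d': 20, 'recent_spend': 325.0}
```

A batch-only pipeline would have returned the stale count of 18 and missed the $325 of fresh spend, which is exactly the gap between training-oriented data platforms and inference data pipelines that the article describes.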


Ex-Apple Engineers Behind $200M Xnor Deal Launch ElastixAI, Secure $16M To Revolutionize AI Inference Across Devices

Yahoo

24-05-2025

  • Business
  • Yahoo

Ex-Apple Engineers Behind $200M Xnor Deal Launch ElastixAI, Secure $16M To Revolutionize AI Inference Across Devices

Seattle-based ElastixAI, founded just months ago by veteran engineers behind Apple's (NASDAQ:AAPL) $200 million acquisition of Xnor, has raised $16 million from top-tier investors, including Bellevue-based venture capital firm FUSE. The stealth-mode startup with elite Apple pedigree is quietly tackling one of the most expensive pain points in artificial intelligence deployment: inference, GeekWire reports.

The founding team behind ElastixAI is no stranger to cutting-edge AI. According to GeekWire, CEO Mohammad Rastegari was co-founder and chief technology officer of Xnor, which was acquired by Apple in 2020 for its groundbreaking edge-based AI tools. He spent four years at Apple following the acquisition and most recently served as a distinguished scientist at Meta (NASDAQ:META). Rastegari is also an affiliate assistant professor at the University of Washington and spent five years at the Allen Institute for AI, co-founded by the late Microsoft (NASDAQ:MSFT) visionary Paul Allen, GeekWire says. Chief technology officer Saman Naderiparizi, who led hardware engineering at Xnor, was also a senior engineering manager at Apple. He's joined by third co-founder Mahyar Najibi, a former Apple engineer who also spent time at Waymo, Google's self-driving car project, GeekWire reports.

While training AI models gets most of the headlines, inference is where the real-world costs pile up. According to TechTarget, every time a chatbot generates or replies to a question, a recommendation system suggests a new item, or a smart device reacts to a real-world prompt, the model is performing inference.
These post-deployment processes happen at scale and in real time, often thousands or millions of times a day. That volume drives up compute costs, latency concerns, and energy consumption.

GeekWire says that ElastixAI is focused on flexibility and configurability, giving enterprises and hyperscalers the ability to tune their inference infrastructure to specific needs. Whether running on edge devices or in large cloud environments, the company's software-centric platform is designed to reduce both compute load and operational cost. 'We saw a gap when it comes to delivering AI inference at scale and at low cost,' Rastegari told GeekWire. The startup remains in stealth, but its positioning puts it in conversation with major players like Nvidia (NASDAQ:NVDA), Coreweave (NASDAQ:CRWV), and other inference-focused startups.

The $16 million round includes participation from Catapult, Tyche Partners, Liquid 2 Ventures, and DNX Ventures, according to GeekWire. Cameron Borumand, general partner at FUSE, told GeekWire the firm was 'thrilled to back the elite technical founders at Elastix to solve the hardest problems around scaling compute and infrastructure in the fast-growing AI inference market.'

The company is also well-placed within Seattle's booming AI scene. With Apple's growing presence in the region and continued investment in foundational technologies, ElastixAI has access to both talent and infrastructure, according to GeekWire. As AI demand continues to surge, inference platforms are becoming critical infrastructure. ElastixAI's unique combination of flexibility, leadership, and investor confidence may position it as one of the most closely watched stealth startups in the space.
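The cost pressure described above is easy to quantify. With illustrative, assumed numbers (the request volume and per-token price below are not from the article), steady-state inference spend scales linearly with traffic:

```python
requests_per_day = 1_000_000      # assumed daily inference calls
tokens_per_request = 400          # assumed prompt + completion tokens
usd_per_million_tokens = 0.60     # assumed blended price (illustrative)

daily_tokens = requests_per_day * tokens_per_request
daily_cost = daily_tokens / 1_000_000 * usd_per_million_tokens
print(f"{daily_tokens:,} tokens/day -> ${daily_cost:,.2f}/day, "
      f"${daily_cost * 365:,.2f}/year")
```

Even a modest per-token efficiency gain compounds across that volume, which is why inference, not training, tends to dominate steady-state AI spend and why platforms targeting it are drawing investment.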
This article originally appeared on Benzinga. © 2025 Benzinga. Benzinga does not provide investment advice. All rights reserved.
