
Latest news with #inference

SambaNova launches its AI Platform in AWS Marketplace

Zawya

3 days ago

  • Business
  • Zawya

SambaNova launches its AI Platform in AWS Marketplace

Dubai, United Arab Emirates — SambaNova, the AI inference company delivering fast, efficient AI chips and high-performance models, today announced that its AI platform is now available in AWS Marketplace, a digital catalog that helps organizations find, buy, deploy, and manage software, data products, and professional services from thousands of vendors. This availability allows organizations to seamlessly purchase and deploy SambaNova's fast inference services alongside their existing infrastructure in AWS, and it marks a significant milestone in SambaNova's mission to make private, production-grade AI more accessible to enterprises by removing traditional barriers like vendor onboarding and procurement delays. By leveraging existing AWS relationships, organizations can now begin using SambaNova's advanced inference solutions with a few simple clicks, accelerating time to value while maintaining trusted billing and infrastructure practices.

'Enterprises face significant pressure to move rapidly from AI experimentation to full-scale production, yet procurement and integration challenges often stand in the way,' said Rodrigo Liang, CEO and co-founder of SambaNova. 'By offering SambaNova's platform in AWS Marketplace, we remove those obstacles, enabling organizations to access our industry-leading inference solutions instantly, using the procurement processes and cloud environment they already trust.'

Accelerating Access to High-Performance Inference

SambaNova's listing in AWS Marketplace gives customers the ability to:

  • Procure through existing AWS billing arrangements, with no new vendor setup required.
  • Leverage SambaNova's fast, efficient inference performance, running open-source models like Llama 4 Maverick and DeepSeek R1 671B.
  • Engage securely via private connectivity through AWS PrivateLink, for low-latency, secure integration between AWS workloads and SambaNova Cloud.
'With the SambaNova platform running in AWS Marketplace, organizations gain access to secure, high-speed inference from the largest open-source models. Solutions like this will help businesses move from experimentation to full production with AI,' said Michele Rosen, Research Manager, Open GenAI, LLMs, and the Evolving Open Source, IDC.

This tight integration enables customers to deploy high-performance, multi-tenant inference solutions without the need to purchase or manage custom hardware, expanding SambaNova's reach into enterprise environments where time-to-value and IT friction have historically limited adoption.

Making High-Performance Inference More Accessible

With this listing in AWS Marketplace, SambaNova is meeting enterprise customers where they already are: within their trusted cloud environments and procurement frameworks. By removing onboarding friction and offering seamless integration, SambaNova makes it easier than ever for organizations to evaluate, deploy, and scale high-performance inference solutions. 'This makes it dramatically easier for customers to start using SambaNova: no new contracts, no long onboarding, just click and go,' said Liang.

Availability

SambaNova's inference platform is available immediately in AWS Marketplace. Enterprise customers can visit the SambaNova listing in AWS Marketplace to get started.

About SambaNova

Customers turn to SambaNova to quickly deploy state-of-the-art generative AI capabilities within the enterprise. Our purpose-built, enterprise-scale AI platform is the technology backbone for the next generation of AI computing. Headquartered in Palo Alto, California, SambaNova Systems was founded in 2017 by industry luminaries and hardware and software design experts from Sun/Oracle and Stanford University. Investors include SoftBank Vision Fund 2, funds and accounts managed by BlackRock, Intel Capital, GV, Walden International, Temasek, GIC, Redline Capital, Atlantic Bridge Ventures, Celesta, and several others.
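As a practical footnote to the listing above: SambaNova Cloud exposes an OpenAI-style chat-completions API, so a first integration test is essentially one HTTPS POST. The sketch below only constructs the request so it can be inspected without sending anything; the endpoint URL, model identifier, and API key are illustrative assumptions rather than values from this announcement (check the AWS Marketplace listing for the real ones).

```python
import json
import urllib.request

def build_chat_request(base_url: str, api_key: str, model: str, prompt: str):
    """Construct an OpenAI-style chat-completions request (no network I/O)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Hypothetical endpoint, model name, and key -- substitute your own.
req = build_chat_request(
    "https://api.sambanova.ai/v1",
    "YOUR_API_KEY",
    "Llama-4-Maverick",
    "Summarize AWS PrivateLink in one sentence.",
)
print(req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

Routing the same call over an AWS PrivateLink endpoint would only change the hostname; the request shape stays the same.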

World's Largest Chip Sets AI Speed Record, Beating NVIDIA

Forbes

4 days ago

  • Business
  • Forbes

World's Largest Chip Sets AI Speed Record, Beating NVIDIA

Today I held the world's largest computer chip in my hands. And while its size is impressive, its speed is much more impressive, and of course much more important.

Most computer chips are tiny, the size of a postage stamp or smaller. By comparison, the Cerebras WSE (Wafer Scale Engine) is a massive square, 8.5 inches or 22 centimeters on each side, and the latest model boasts a staggering four trillion transistors on a single chip. All those transistors let the WSE set a world speed record for AI inference operations: about 2.5 times faster than a roughly equivalent NVIDIA cluster. 'It's the fastest inference in the world,' Cerebras chief information security officer Naor Penso told me today at Web Summit in Vancouver. 'Last week NVIDIA announced hitting 1,000 tokens per second on Llama 4, which is impressive. We just released a benchmark today of 2,500 tokens per second.'

In case all this is Greek to you, think of 'inference' as thinking or acting: building sentences, images, or videos in response to your inputs, or prompts. Think of 'tokens' as basic units of thought: a word, character, or symbol. The more tokens an AI engine can process per second, the faster it can get you results.

And speed matters. Maybe not so much for you, but when enterprise clients want to add an AI engine to a grocery shopping cart so they can tell you that just one more ingredient will give you everything you need for Korean-Style BBQ Beef Tacos, they want to be able to do so instantly for potentially thousands of people.

Interestingly, speed is about to get even more critical. We're entering an agentic age, where we have AIs that can perform complex multi-step projects for us, like planning and booking a weekend trip to Austin for a Formula 1 race. Agents aren't magic: they eat an elephant the exact same way you would … one bite at a time. That means exploding a big overall task into 40, 50, or 100 sub-tasks. Which means much more work.
'AI agents require way more jobs, and the various jobs need to communicate with each other,' Penso told me. 'You can't have slow inference.'

The WSE's four trillion transistors are part of what enables that speed. For comparison, the Intel Core i9 has just 33.5 billion transistors, and an Apple M2 Max chip offers just 67 billion. But it's more than sheer transistor count that builds a compute speed demon. It's also co-location: putting everything together on one chip, along with 44 gigabytes of the fastest RAM (memory) available. 'AI compute likes a lot of memory,' Penso says. 'NVIDIA needs to go off-chip, but with Cerebras, you don't need to go off-chip.'

Independent agency Artificial Analysis corroborates the speed claims, saying they've tested the chip on Llama 4 and achieved 2,522 tokens per second, compared to NVIDIA Blackwell's 1,038 tokens per second. 'We've tested dozens of vendors, and Cerebras is the only inference solution that outperforms Blackwell for Meta's flagship model,' says Artificial Analysis CEO Micah Hill-Smith.

The WSE chip is an interesting evolution in computer chip design. While we've been making integrated circuits since the 1950s and microprocessors since the 1960s, the CPU was the dominant force in computing for decades. Relatively recently, the GPU, or graphical processing unit, shifted from being an aide for graphics and games to being the critical processing component of choice for AI development. The WSE is not an x86 or ARM architecture but something entirely new, Cerebras chief marketing officer Julie Shin told me. 'This is not an incremental technology,' she added. 'This is another leapfrog moment for chips.'
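The throughput figures above translate directly into user-visible latency. A back-of-the-envelope check, using the 2,522 and 1,038 tokens-per-second numbers reported by Artificial Analysis and an assumed 500-token answer length:

```python
# Time to generate a 500-token answer at each reported throughput.
cerebras_tps = 2522    # tokens/sec, Cerebras WSE on Llama 4 (Artificial Analysis)
blackwell_tps = 1038   # tokens/sec, NVIDIA Blackwell on Llama 4
answer_tokens = 500    # a typical medium-length response (assumed)

t_cerebras = answer_tokens / cerebras_tps    # ~0.20 s
t_blackwell = answer_tokens / blackwell_tps  # ~0.48 s
speedup = cerebras_tps / blackwell_tps       # ~2.43x

print(f"Cerebras: {t_cerebras:.2f}s  Blackwell: {t_blackwell:.2f}s  "
      f"speedup: {speedup:.2f}x")
```

For a 50-step agent pipeline at these rates, the gap compounds: roughly 10 seconds of pure generation time versus about 24, which is the point Penso is making about agents.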

Chalk Raises $50M Series A to Power AI Inference

Globe and Mail

4 days ago

  • Business
  • Globe and Mail

Chalk Raises $50M Series A to Power AI Inference

Chalk, the data platform for AI inference, announced today that it has raised a $50 million Series A at a $500 million valuation. The round was led by Felicis with participation from Triatomic Capital and existing investors General Catalyst, Unusual Ventures, and Xfund. Aydin Senkut, Founder and Managing Partner at Felicis, will join Chalk's board. The capital will be used to accelerate development of Chalk's platform, onboard new customers, and grow its engineering and go-to-market hubs in San Francisco and New York.

As AI adoption accelerates, compute is shifting from training to inference to improve predictions, transform customer experiences, and reduce costs. Existing solutions like Databricks and Snowflake solve training data pipelines, and feature stores provide low-latency access to pre-computed data. But these incumbents don't provide a solution for applications that require fresh data, with complex computation, at inference time. Chalk fills a critical gap in the market: inference data pipelines. Chalk's real-time data platform enables customers to make predictions with fresh data at inference time to prevent identity theft, issue instant loans, increase clean energy efficiency, and moderate harmful content.

Senkut shared, 'Chalk is poised to become the Databricks of the AI era. It's one of the fastest-growing data companies we've ever seen. The team has fundamentally redefined how data moves through the AI stack, a crucial advancement for chain-of-reasoning models. What's even more remarkable is Chalk's ability to deliver 5-millisecond data pipelines at massive scale, something that, until now, was considered out of reach. We couldn't be more excited to partner with Marc, Elliot, and Andy, who are all repeat technical founders passionate about building infrastructure that delivers an incredible developer experience.'
Marc Freed-Finnegan, Chalk Co-Founder and CEO, added, 'We feel incredibly fortunate to have Aydin and Felicis as our partners for the next phase of our growth. We have a shared vision of the future, and we're honored to be part of the cohort of companies they have invested in.'

Chalk powers real-time ML across industries including fintech, identity, healthcare, and e-commerce. Companies like Whatnot, Found, Medely, and Iwoca use Chalk as a core infrastructure layer across their business. 'Chalk helps us deliver financial products that are more responsive, more personalized, and more secure for millions of users. It's a direct line from infrastructure to impact,' said Meng Xin Loh, Senior Technical Product Manager, MoneyLion.

Chalk has become critical infrastructure for its customers by enabling teams to rapidly operationalize machine learning and AI. At its core, Chalk's Compute Engine empowers teams to write features in pure Python, automatically translating them into high-performance C++ and Rust pipelines to deliver real-time data without complex ETL. Additionally, Chalk's LLM Toolchain unifies structured and unstructured data, offering native vector storage, automated evaluations, and seamless integrations with major LLM providers.

Rahul Madduluri, CTO at Doppel, said, 'Chalk powers our LLM pipeline, turning complex inputs (HTML, URLs, screenshots) into structured, auditable features. It lets us serve lightweight heuristics up front and rich LLM reasoning deeper in the stack, so we detect threats others miss without compromising speed or precision.'

Chalk was co-founded by Freed-Finnegan, Elliot Marx, and Andrew Moreland, veterans of fintech and data infrastructure. After meeting at Stanford, Marx and Moreland solved large-scale data problems at Affirm and Palantir before co-founding Haven Money, acquired by Credit Karma. Before Chalk, Freed-Finnegan helped launch Google Wallet and started Index, acquired by Stripe (it's now called Stripe Terminal).
Across these ventures, the team saw how real-time data pipelines enabled entirely new product categories and business models. Fast forward to today: real-time decisions at inference are essential for all modern applications, and Chalk makes that possible.

About Chalk

Chalk is the data platform for inference, providing critical infrastructure that empowers teams to rapidly operationalize machine learning and AI. The developer-friendly platform consists of a Compute Engine that automatically compiles features into high-performance Rust pipelines without complex ETL, and an LLM Toolchain that seamlessly unifies structured and unstructured data. Chalk powers real-time, low-latency machine learning for the world's leading companies, enabling instant loans, fraud prevention, personalized recommendations, and even clean energy optimization. Founded in 2022 and headquartered in San Francisco, Chalk has raised over $60M from Felicis, General Catalyst, Triatomic Capital, Unusual Ventures, and Xfund.
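The "fresh data at inference time" idea the release keeps returning to can be sketched in a few lines. This is a generic illustration, not Chalk's actual API: a pre-computed feature-store snapshot (the kind a nightly batch job produces) is merged with events that arrived after the last batch run, so the model sees up-to-date features at request time.

```python
# Nightly batch output: a stale feature snapshot per user.
feature_store = {"user_42": {"txn_count_7d": 18}}

# Events that arrived after the last batch run.
recent_events = {"user_42": [{"amount": 250.0}, {"amount": 75.0}]}

def features_at_inference(user_id: str) -> dict:
    """Merge the stale snapshot with fresh events at request time."""
    feats = dict(feature_store.get(user_id, {}))
    events = recent_events.get(user_id, [])
    feats["txn_count_7d"] = feats.get("txn_count_7d", 0) + len(events)
    feats["recent_spend"] = sum(e["amount"] for e in events)
    return feats

feats = features_at_inference("user_42")
print(feats)  # {'txn_count_7d': 20, 'recent_spend': 325.0}
```

A batch-only pipeline would have returned the stale count of 18 and missed the $325 of fresh spend, which is exactly the gap between training-oriented data platforms and inference data pipelines that the article describes.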


Ex-Apple Engineers Behind $200M Xnor Deal Launch ElastixAI, Secure $16M To Revolutionize AI Inference Across Devices

Yahoo

24-05-2025

  • Business
  • Yahoo

Ex-Apple Engineers Behind $200M Xnor Deal Launch ElastixAI, Secure $16M To Revolutionize AI Inference Across Devices

Seattle-based ElastixAI, founded just months ago by veteran engineers behind Apple's (NASDAQ:AAPL) $200 million acquisition of Xnor, has raised $16 million from top-tier investors, including Bellevue-based venture capital firm FUSE. The stealth-mode startup with elite Apple pedigree is quietly tackling one of the most expensive pain points in artificial intelligence deployment: inference, GeekWire reports.

The founding team behind ElastixAI is no stranger to cutting-edge AI. According to GeekWire, CEO Mohammad Rastegari was co-founder and chief technology officer of Xnor, which was acquired by Apple in 2020 for its groundbreaking edge-based AI tools. He spent four years at Apple following the acquisition and most recently served as a distinguished scientist at Meta (NASDAQ:META). Rastegari is also an affiliate assistant professor at the University of Washington and spent five years at the Allen Institute for AI, co-founded by the late Microsoft (NASDAQ:MSFT) visionary Paul Allen, GeekWire says. Chief technology officer Saman Naderiparizi, who led hardware engineering at Xnor, was also a senior engineering manager at Apple. He's joined by third co-founder Mahyar Najibi, a former Apple engineer who also spent time at Waymo, Google's self-driving car project, GeekWire reports.

While training AI models gets most of the headlines, inference is where the real-world costs pile up. According to TechTarget, every time a chatbot generates or replies to a question, a recommendation system suggests a new item, or a smart device reacts to a real-world prompt, the model is performing inference.
These post-deployment processes happen at scale and in real time, often thousands or millions of times a day. That volume drives up compute costs, latency concerns, and energy consumption.

GeekWire says that ElastixAI is focused on flexibility and configurability, giving enterprises and hyperscalers the ability to tune their inference infrastructure to specific needs. Whether running on edge devices or in large cloud environments, the company's software-centric platform is designed to reduce both compute load and operational cost. 'We saw a gap when it comes to delivering AI inference at scale and at low cost,' Rastegari told GeekWire. The startup remains in stealth, but its positioning puts it in conversation with major players like Nvidia (NASDAQ:NVDA), Coreweave (NASDAQ:CRWV), and other inference-focused startups.

The $16 million round includes participation from Catapult, Tyche Partners, Liquid 2 Ventures, and DNX Ventures, according to GeekWire. Cameron Borumand, general partner at FUSE, told GeekWire the firm was 'thrilled to back the elite technical founders at Elastix to solve the hardest problems around scaling compute and infrastructure in the fast-growing AI inference market.'

The company is also well-placed within Seattle's booming AI scene. With Apple's growing presence in the region and continued investment in foundational technologies, ElastixAI has access to both talent and infrastructure, according to GeekWire. As AI demand continues to surge, inference platforms are becoming critical infrastructure. ElastixAI's unique combination of flexibility, leadership, and investor confidence may position it as one of the most closely watched stealth startups in the space.
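The cost pressure described above is easy to quantify. With illustrative, assumed numbers (the request volume and per-token price below are not from the article), steady-state inference spend scales linearly with traffic:

```python
requests_per_day = 1_000_000      # assumed daily inference calls
tokens_per_request = 400          # assumed prompt + completion tokens
usd_per_million_tokens = 0.60     # assumed blended price (illustrative)

daily_tokens = requests_per_day * tokens_per_request
daily_cost = daily_tokens / 1_000_000 * usd_per_million_tokens
print(f"{daily_tokens:,} tokens/day -> ${daily_cost:,.2f}/day, "
      f"${daily_cost * 365:,.2f}/year")
```

Even a modest per-token efficiency gain compounds across that volume, which is why inference, not training, tends to dominate steady-state AI spend and why platforms targeting it are drawing investment.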
This article originally appeared on Benzinga. © 2025 Benzinga. Benzinga does not provide investment advice. All rights reserved.
