Latest news with #Baseten


Business Wire · 21-05-2025 · Business
Baseten Launches New Inference Products to Accelerate MVPs into Production Applications
SAN FRANCISCO--(BUSINESS WIRE)--Baseten, the leader in mission-critical inference, today announced the public launch of Baseten Model APIs and the closed beta of Baseten Training. Built on Baseten's proprietary inference stack, the new products let AI teams move seamlessly from rapid prototyping to scaling in production.

In recent months, new releases of DeepSeek, Llama, and Qwen models have erased the quality gap between open and closed models, giving organizations more incentive than ever to use open models in their products. Yet many AI teams have been limited to testing open models at low scale because model endpoint providers offer insufficient performance, reliability, and economics. While easy to get started with, these shared model endpoints have fundamentally gated enterprises' ability to convert prototypes into high-functioning products.

Baseten's new products, Model APIs and Training, solve two critical bottlenecks in the AI lifecycle. Both are built on Baseten's Inference Stack and Inference-optimized Infrastructure, which power inference at scale in production for leading AI companies like Writer, Descript, and Abridge. Using Model APIs, developers can instantly access open-source models optimized for maximum inference performance and cost-efficiency to rapidly create production-ready minimum viable products (MVPs) or test new workloads.

"In the AI market, your number one differentiator is how fast you can move," said Tuhin Srivastava, co-founder and CEO of Baseten. "Model APIs give developers the speed and confidence to ship AI features knowing that we've handled the heavy lifting on performance and scale."

Baseten Model APIs enable AI engineers to test open models with a confident scaling story in place from day one. As inference volume grows, Model APIs customers can easily move to Dedicated Deployments that provide greater reliability, performance, and economics at scale.
"With Baseten, we now support open-source models like DeepSeek and Llama in Retool, giving users more flexibility for what they can build,' said DJ Zappegos, Engineering Manager at Retool. 'Our customers are creating AI apps and workflows, and Baseten's Model APIs deliver the enterprise-grade performance and reliability they need to ship to production." Customers can also use Baseten's new Training product to rapidly train and tune models, which will result in superior inference performance, quality, and cost-efficiency to further optimize inference workloads. Unlike traditional training solutions that operate in siloed research environments, Baseten Training runs on the same production-optimized infrastructure that powers its inference. This coherence ensures that models trained or fine-tuned on Baseten will behave consistently in production, with no last-minute refactoring. Together, the latest offerings enable customers to get products to market more rapidly, improve performance and quality, and reduce costs for mission-critical inference workloads These launches reinforce Baseten's belief that product-focused AI teams must care deeply about inference performance, cost, and quality. 'Speed, reliability, and cost-efficiency are non-negotiables, and that's where we devote 100 percent of our focus,' said Amir Haghighat, co-founder and CTO of Baseten. 'Our Baseten Inference Stack is purpose-built for production AI because you can't just have one piece work well. It takes everything working well together, which is why we ensure that each layer of the Inference Stack is optimized to work with the other pieces.' 'Having lifelike text-to-speech requires models to operate with very low latency and very high quality,' said Amu Varma, co-founder of Canopy Labs. 'We chose Baseten as our preferred inference provider for Orpheus TTS because we want our customers to have the best performance possible. 
Baseten's Inference Stack allows our customers to create voice applications that sound as close to human as possible.' Teams can start with a quick MVP and seamlessly scale it to a dedicated, production-grade deployment when needed, without changing platforms. An enterprise can prototype a feature on Baseten Cloud, then graduate to its own private clusters or on-prem deployment (via Baseten's hybrid and self-hosted options) for greater control, performance tuning, and cost optimization, all with the same code and tooling. This 'develop once, deploy anywhere' capability directly results from Baseten's Inference-optimized Infrastructure, which abstracts the complexity of multi-cloud and on-premise orchestration for the user. The news follows on a year of considerable growth for the company. In February, Baseten announced the close of a series C funding round co-led by IVP and Spark and which moved its total amount of venture capital funding to $135 million. It was recently named to Forbes AI 50 2025, a list of the pre-eminent privately held tech companies in AI which also featured a number of companies that Baseten powers 100 percent of the inference for, like Writer and Abridge. About Baseten Baseten is the leader in infrastructure software for high-scale AI products, offering the industry's most powerful AI inference platform. Committed to delivering exceptional performance, reliability, and cost-efficiency, Baseten is on a mission to help the next great AI products scale. Top-tier investors, including IVP, Spark, Greylock, Conviction, Base Case, and South Park Commons back Baseten. Learn more at
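As a rough illustration of the workflow the release describes, model-API providers of this kind typically expose an OpenAI-compatible chat-completions endpoint. The sketch below builds such a request body; the base URL and model slug are placeholders for illustration, not documented Baseten values.

```python
import json

# Hypothetical sketch of calling an OpenAI-compatible model endpoint.
# BASE_URL and MODEL are placeholders, not documented Baseten values.
BASE_URL = "https://chat.example.com/v1/chat/completions"
MODEL = "deepseek-r1"  # placeholder slug for an open-source model


def build_chat_request(prompt: str, model: str = MODEL, max_tokens: int = 256) -> dict:
    """Build the JSON body for an OpenAI-style chat-completion request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


if __name__ == "__main__":
    body = build_chat_request("Summarize this launch in one sentence.")
    # The same body would be POSTed to BASE_URL with an Authorization header;
    # moving to a dedicated deployment would, in principle, only change the URL.
    print(json.dumps(body, indent=2))
```

Because the request shape is the standard chat-completions format, graduating from a shared endpoint to a dedicated deployment is largely a matter of pointing the client at a different URL rather than rewriting application code, which matches the prototype-to-production framing above.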

Business Insider · 27-04-2025 · Business
'Burn the boats': To stay at the bleeding edge, AI developers are trashing old tech fast
It's not uncommon for AI companies to fear that Nvidia will swoop in and make their work redundant. But when it happened to Tuhin Srivastava, he was perfectly calm. "This is the thing about AI — you gotta burn the boats," Srivastava, the cofounder of AI inference platform Baseten, told Business Insider. He hasn't burned his quite yet, but he's bought the kerosene.

The story goes back to when DeepSeek took the AI world by storm at the beginning of this year. Srivastava and his team had been working with the model for weeks, but it was a struggle. The problem was a tangle of AI jargon, but essentially, inference, the computing process that happens when AI generates outputs, needed to be scaled up to run these big, complicated reasoning models quickly. Multiple elements were hitting bottlenecks and slowing delivery of model responses, making the model far less useful for Baseten's customers, who were clamoring for access.

Srivastava's company had access to Nvidia's H200 chips — the best widely available chip that could handle the advanced model at the time — but Nvidia's inference platform was glitching. A software stack called Triton Inference Server was getting bogged down by all the inference required for DeepSeek's reasoning model, R1, Srivastava said. So Baseten built its own, which it still uses now.

Then, in March, Jensen Huang took the stage at Nvidia's massive GTC conference and launched a new inference platform: Dynamo, open-source software that helps Nvidia chips handle the intensive inference used for reasoning models at scale. "It is essentially the operating system of an AI factory," Huang said onstage.

"This was where the puck was going," Srivastava said. Nvidia's arrival wasn't a surprise. When the juggernaut inevitably surpasses Baseten's equivalent platform, the small team will abandon what it built and switch, Srivastava said. He expects it will take a couple of months at most. "Burn the boats."
And it's not just Nvidia, with its massive team and a research and development budget to match, making tools. Machine learning is constantly evolving. Models get more complex and require more computing power and engineering genius to work at scale, and then they shrink again when those engineers find new efficiencies and the math changes. Researchers and developers are balancing cost, time, accuracy, and hardware inputs, and every change reshuffles the deck.

"You cannot get married to a particular framework or a way of doing things," said Karl Mozurkewich, principal architect at cloud firm Valdi.

"This is my favorite thing about AI," said Theo Browne, a YouTuber and developer whose company, Ping, builds AI software for other developers. "It takes these things that the industry has historically treated as super valuable and holy, and just makes them incredibly cheap and easy to throw away," he told BI.

Browne spent the early years of his career coding for big companies like Twitch. When he saw a reason to start over on a coding project instead of building on top of it, he faced resistance, even when starting over would save time or money. The sunk-cost fallacy reigned. "I had to learn that rather than waiting for them to say, 'No,' do it so fast they don't have the time to block you," Browne said.

That's the mindset of many bleeding-edge builders in AI. It's also often what sets startups apart from large enterprises. Quinn Slack, CEO of AI coding platform Sourcegraph, frequently explains this to his customers when he meets with Fortune 500 companies that may have built their first AI round on shaky foundations. "I would say 80% of them get there in an hourlong meeting," he said.

The firmer ground is up the stack

Ben Miller, CEO of real estate investment platform Fundrise, is building an AI product for the industry, and he doesn't worry too much about the latest model.
If a model works for its purpose, it works, and moving up to the latest innovation is unlikely to be worth the engineer's hours. "I'm sticking with what works well enough for as long as I can," he said. That's partly because Miller runs a large organization, but it's also because he's building farther up the stack.

That stack consists of hardware at the bottom, usually Nvidia's GPUs, and then layers upon layers of software. Baseten is a few layers up from Nvidia. The AI models, like R1 and GPT-4o, are a few layers up from Baseten. And Miller is just about at the top, where consumers are.

"There's no guarantee you're going to grow your customer base or your revenue just because you're releasing the latest bleeding-edge feature," Mozurkewich said. "When you're in front of the end-user, there are diminishing returns to moving fast and breaking things."
Yahoo · 22-02-2025 · Business
Tempus AI, Inc. (TEM) Secures $300 Million to Fuel Ambry Genetics Acquisition
In this article, we take a look at where Tempus AI, Inc. (NASDAQ:TEM) stands against the other AI stocks in news and ratings you should not miss.

On February 19, CNBC reported significant funding rounds for multiple AI-focused startups, highlighting the rapid expansion of AI innovation across industries like healthcare, cloud computing, and model deployment.

Baseten, a San Francisco-based company founded in 2019, raised $75 million at an $825 million valuation to improve its AI model deployment services. Using cloud infrastructure from Amazon and Google, Baseten helps clients access GPUs for AI inference, reducing costs by over 40% while supporting the cost-effective DeepSeek-R1 reasoning model. Its revenue increased sixfold in the last fiscal year, and its more than 100 enterprise clients include companies such as Descript, Patreon, and Writer.

OpenEvidence, an AI health-tech startup in Cambridge founded by Daniel Nadler, raised $75 million from Sequoia, bringing its valuation to $1 billion, per CNBC. The company's AI chatbot, trained on data from The New England Journal of Medicine and other peer-reviewed journals, assists doctors with clinical decisions and is already used by a quarter of U.S. physicians. The chatbot avoids inaccuracies through tailored training and has grown rapidly on word-of-mouth recommendations among doctors. OpenEvidence will also use its new funding to establish partnerships, including one with NEJM Group, and Nadler views the company as a solution to doctor burnout and the projected physician shortfall.

Lambda, a cloud computing firm specializing in AI development, raised $480 million in a Series D round co-led by Andra Capital and SGW, reaching a $2.5 billion valuation and total funding of $863 million. Lambda rents out Nvidia GPU-powered servers and offers software to train and deploy AI models, including open-source ones like DeepSeek-R1.
CEO Stephen Balaban highlighted Lambda's ability to repurpose its 25,000 GPUs for open-source AI models, which has fueled demand for H200 chips. The company will use the funding to expand its GPU inventory and further develop its software, including its Model Inference API and Chat AI Assistant. Lambda is positioned to meet the surging demand for AI infrastructure and serves over 5,000 customers across industries such as manufacturing and finance.

For this article, we selected AI stocks by reviewing news articles, stock analysis, and press releases. We listed the stocks in ascending order of their hedge fund sentiment, taken from Insider Monkey's Q4 database of over 1,000 hedge funds. Why are we interested in the stocks that hedge funds pile into? The reason is simple: our research has shown that we can outperform the market by imitating the top stock picks of the best hedge funds. Our quarterly newsletter's strategy selects 14 small-cap and large-cap stocks every quarter and has returned 275% since May 2014, beating its benchmark by 150 percentage points.

Number of Hedge Fund Holders: 17

Tempus AI, Inc. (NASDAQ:TEM) provides healthcare technology solutions, including diagnostic testing, clinical trial matching, data analytics, and AI-driven platforms for the healthcare and pharmaceutical industries. On February 19, Tempus AI, Inc. (NASDAQ:TEM) secured $300 million in additional debt financing from Ares Management's Credit funds to support its acquisition of Ambry Genetics, finalized on February 3, 2025. The financing brings Ares' total funding for Tempus to approximately $560 million since 2022. Tempus uses AI and data-driven technology to improve clinical care and research, and management emphasized that this investment will help drive innovation in precision medicine.
The company views the funding as crucial for advancing its technological solutions and improving patient outcomes across oncology, cardiology, and other medical fields. Douglas Dieter, Dr.P.H., Partner in the Ares Credit Group, commented: "Over the last two years, we've been impressed by the Tempus team's execution of its growth strategy and complementary acquisition of Ambry, and we look forward to further supporting their efforts in AI-enabled solutions that help advancements in medicine."

Overall, TEM ranks 7th on our list of the AI stocks in news and ratings you should not miss. While we acknowledge the potential of TEM as an investment, our conviction lies in the belief that some AI stocks hold greater promise for delivering higher returns within a shorter timeframe. If you are looking for an AI stock that is more promising than TEM but trades at less than 5 times its earnings, check out our report about the cheapest AI stock.

READ NEXT: 20 Best AI Stocks To Buy Now and Complete List of 59 AI Companies Under $2 Billion in Market Cap

Disclosure: None. This article was originally published at Insider Monkey.