Cutting AI Costs: Smart Strategies for Small Business Savings

Forbes, 21 hours ago
If you run a small business, you might already feel the AI pinch: your customer support runs on ChatGPT, your marketing automation uses Claude, and you're paying for Grok's research capabilities and real-time updates. For the average company (or user), those subscriptions can easily hit $300 a month, especially if you're integrating multiple tools into your workflow. That's a serious line item for what's supposed to be affordable technology.
What most don't realize is that these ballooning costs have more to do with the hardware on the backend than with the software running their workflows. Every time an AI model responds, it triggers a process called inference: the act of generating output from a trained model. Unlike training, which costs a fortune but only happens once, inference occurs billions of times each day and scales with usage. It has become one of the largest ongoing expenses in AI, driving massive, sustained energy demand that fuels the industry's growing power crisis.
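The difference between a one-time training bill and a usage-driven inference bill can be sketched in a few lines of Python. This is a minimal illustration, not a pricing model: the function names and every dollar figure in the example are hypothetical placeholders chosen for round arithmetic, not numbers from any vendor or from this article.

```python
def cumulative_cost(training_cost, cost_per_request, requests_per_day, days):
    """Total spend after `days`: one-time training plus per-request inference.

    All inputs are hypothetical; the point is only that the first term is
    fixed while the second grows with usage.
    """
    return training_cost + cost_per_request * requests_per_day * days


def days_until_inference_matches_training(training_cost, cost_per_request,
                                          requests_per_day):
    """Days of serving after which inference spend equals the training bill."""
    daily_inference = cost_per_request * requests_per_day
    if daily_inference <= 0:
        raise ValueError("daily inference cost must be positive")
    return training_cost / daily_inference


if __name__ == "__main__":
    # Hypothetical placeholders: $1M to train, $0.002 per request,
    # 5 million requests per day.
    days = days_until_inference_matches_training(1_000_000, 0.002, 5_000_000)
    print(f"Inference spend matches training after about {days:.0f} days")
    print(f"One-year total: ${cumulative_cost(1_000_000, 0.002, 5_000_000, 365):,.0f}")
```

With these placeholder inputs, inference overtakes the training bill within months, and doubling the request volume halves that time. That is the sense in which inference behaves like a recurring bill rather than a one-time purchase.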
For individuals and small business owners, this hidden cost means AI remains incredibly expensive. But that might be about to change. A new cohort of hardware startups, including Positron AI, Groq, Cerebras Systems, and SambaNova Systems, is racing to make inference radically cheaper. If they succeed, AI tools could drop from $300-a-month luxuries to accessible everyday infrastructure for freelancers, educators, retailers, and entrepreneurs.
Among these, Positron has emerged as a favorite among some of the world's dominant neocloud providers and has drawn investor attention for its unique approach.
'The early benefits of AI are coming at a very high cost – it is expensive and energy-intensive to train AI models and to deliver curated results, or inference, to end users,' DFJ Growth co-founder Randy Glein said. 'Improving the cost and energy efficiency of AI inference is where the greatest market opportunity lies, and this is where Positron is focused.'
Inference Is The New Electricity Bill
In the world of AI economics, inference is like your utility bill: it grows as you grow, and it's never just a one-time fee. Whether you're sending AI-generated emails or running a support chatbot, inference is what keeps the lights on—and right now, that light is powered by Nvidia's premium-priced GPUs.
'Nvidia GPUs have become the backbone of AI infrastructure today, powering nearly every major inference workload at every major cloud provider. The downside to this, beyond having one $4 trillion company own the entire inference market, is that they weren't designed with efficiency in mind. They're built for flexibility and optimized for training complex models that require general-purpose chips for multifaceted tasks. And yet, the majority of inference today still runs on Nvidia hardware, leaving the industry with high power usage, steep cloud bills, and limited options for smaller players,' said Mitesh Agrawal, CEO of Positron.
The Race To Make AI Affordable
Those are exactly the problems Positron, Groq, Cerebras, and SambaNova are solving by building alternatives to the Nvidia tax. And while they all share a common goal of delivering inference infrastructure that slashes energy consumption, improves performance per dollar, and gives developers more control, Positron is arguably the most technically ambitious and commercially mature contender in this race.
Founded by systems engineer Thomas Sohmers and compiler expert Edward Kmett, Positron has taken a radically different path from its peers. Instead of building application-specific chips or chasing general-purpose GPUs, Positron bet on field-programmable gate arrays (FPGAs), reconfigurable chips optimized for memory efficiency, and used them to build Atlas, an inference-first system designed from the ground up for performance and energy savings.
Atlas delivers 93 percent memory bandwidth utilization (vs. about 30 percent for GPUs), uses 66 percent less energy, and offers 3.5 times better performance per dollar—all while supporting seamless deployment with no code changes. That kind of out-of-the-box compatibility makes it a practical swap for existing cloud or local systems without forcing teams to rewrite their infrastructure from scratch. These gains have landed it major enterprise deployments with Cloudflare, Crusoe, and Parasail.
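A short Python sketch shows what the reported Atlas figures would imply if taken at face value. The four constants are the numbers quoted above. Everything derived from them is illustrative and rests on assumptions the article does not state: that both systems have the same peak memory bandwidth, and that "66 percent less energy" refers to the same amount of work. The function and key names are hypothetical.

```python
# Figures as reported in the article (not independently measured here).
ATLAS_BW_UTILIZATION = 0.93   # share of peak memory bandwidth actually used
GPU_BW_UTILIZATION = 0.30     # "about 30 percent" for GPUs
ENERGY_REDUCTION = 0.66       # "66 percent less energy"
PERF_PER_DOLLAR_GAIN = 3.5    # "3.5 times better performance per dollar"


def effective_bandwidth(peak_gb_per_s, utilization):
    """Bandwidth actually delivered: peak bandwidth scaled by utilization."""
    return peak_gb_per_s * utilization


def implied_ratios():
    """Ratios implied by the reported figures under the stated assumptions."""
    return {
        # Assumes equal peak bandwidth on both systems (not stated in the article).
        "bandwidth_advantage_at_equal_peak": ATLAS_BW_UTILIZATION / GPU_BW_UTILIZATION,
        # Assumes the energy saving is for the same amount of work.
        "work_per_joule_gain": 1 / (1 - ENERGY_REDUCTION),
        # Fraction of the spend needed to buy the same throughput.
        "cost_fraction_for_same_throughput": 1 / PERF_PER_DOLLAR_GAIN,
    }


if __name__ == "__main__":
    for name, value in implied_ratios().items():
        print(f"{name}: {value:.2f}")
```

Under those assumptions, the utilization gap alone works out to roughly 3.1 times more delivered bandwidth, the energy figure to roughly 2.9 times more work per unit of energy, and the performance-per-dollar figure to about 29 percent of the spend for the same throughput. Real deployments would depend on peak bandwidth, model size, and workload, none of which the article specifies.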
The company recently raised a $51.6 million Series A led by Valor Equity Partners, Atreides, and DFJ Growth—the very firms that bankrolled SpaceX, Tesla, X, and xAI, some of the world's largest buyers of AI hardware.
Positron is already working on its next-generation system, Titan, built on custom 'Asimov' silicon, which is expected to support models up to 16 trillion parameters with two terabytes of memory per chip—all while running on standard air-cooled racks. That could make high-throughput inference viable in a wider range of environments, from enterprise data centers to sovereign cloud infrastructure.
While others in the field are exploring niche optimizations, Positron is staking a claim to general-purpose inference acceleration—solving for cost and compatibility at scale. But it's not alone.
Other Challengers Redefining The Stack
While Positron is focused on general-purpose inference acceleration, other challengers are tackling different bottlenecks. Groq is optimizing ultra-low-latency inference for large language models (LLMs). Its Tensor Streaming Processor (TSP) delivers consistent, repeatable latency with sub-millisecond response times. That enables a new class of AI tools that respond instantly without incurring massive cloud costs, and it lays the groundwork for local, responsive AI that could eventually be accessible to small businesses.
Cerebras brings an edge-native, security-first perspective. Its modular AI appliances can run powerful models entirely on-site—ideal for defense, critical infrastructure, or industries where cloud deployment isn't an option. Cerebras makes it possible for organizations to deploy advanced AI with a small footprint—something previously only achievable by hyperscalers.
SambaNova is taking a full-stack approach, combining hardware and software to deliver vertically optimized AI systems. Rather than asking businesses to build training pipelines and inference clusters from scratch, it offers a turnkey platform with pre-trained models, essentially packaging AI as an appliance for organizations without a dedicated machine learning (ML) team.
All of these players are on a mission to unlock high-performance inference that doesn't require hyperscaler infrastructure or ballooning cloud costs—opening the door to entirely new economic possibilities.
Why This Matters For Your Bottom Line
If inference becomes cheaper, everything changes. A Shopify seller could train and run a private AI model locally, without relying on costly cloud infrastructure. A solopreneur could fine-tune a sales assistant on years of customer emails, then run it on a $10 chip instead of a $30,000 graphics processing unit (GPU). A tutoring platform could deploy personalized lesson-plan generators without needing a full-time infrastructure team.
This is already happening. Smaller teams are building domain-specific copilots that live inside their own companies' firewalls. Independent consultants are running multi-agent AI workflows from their laptops. Inference costs are more than a technical problem: they're the gatekeeper to who gets to build with AI.
If Positron and its peers succeed, the $300-a-month AI stack could shrink to $30. It could also be replaced entirely by tools you run yourself, privately and affordably. And that changes who gets to participate in the future of AI.
Nvidia's Grip Might Finally Be Loosening
Today, Nvidia holds a near-monopoly over AI infrastructure—and by extension, who gets to play. Its chips power the vast majority of generative AI systems worldwide, and its ecosystem (CUDA, TensorRT, etc.) makes switching difficult. The result is a pay-to-play system where cost determines access.
But that grip may not hold if companies like Positron, Groq, Cerebras, and SambaNova continue to gain traction and change the economics of AI. By lowering the cost of inference, they're making it possible for smaller teams and individual users to run powerful models without relying on expensive cloud infrastructure.
This shift could have broad implications. Instead of paying hundreds of dollars a month for AI-powered tools, users may soon be able to run custom assistants, automations, and workflows locally—on hardware they control. For small businesses, freelancers, educators, and startups, that means more control, more customization, and a lower barrier to entry—for a fraction of today's cost.
If inference becomes affordable, innovation stops being a privilege and starts becoming infrastructure. Because when you democratize cost, you decentralize control. The next chapter of AI won't be written by whoever builds the biggest model, but by whoever makes it cheap enough to run—and that's how you break a $4 trillion monopoly.