'Inference whales' are eating into AI coding startups' business model

The AI coding sector has a problem.
Heavy users of AI coding services have been racking up huge costs, forcing some leading startups to overhaul their pricing structures and offerings to avoid big losses.
"Inference whales," as some in the business call these customers, are making industry insiders question whether AI products that are just "reselling inference" can survive long-term.
Inference refers to how AI models are run. Newer reasoning models break user requests down into multiple steps, which increases inference costs. When applied to AI coding services, where developers set automated agents to longer-term tasks, expenses can soar quickly.
This is a problem for AI coding services because they're often offered through monthly subscription plans. Many plans allow unlimited use for a fixed monthly fee, and a few users have taken advantage by bombarding the services with huge projects.
These startups must still pay for the underlying AI models, so they're getting squeezed between a relatively fixed revenue stream and rapidly rising backend costs.
"If you're purely reselling AI inference, your business could be very fragile and vulnerable, because the winds can shift violently," said Eric Simons, CEO of StackBlitz, a startup that offers a popular AI coding service called Bolt.new.
Claude Code whales
Anthropic offered its popular Claude Code service through a $200 a month unlimited plan earlier this year. Some subscribers went berserk, using thousands of dollars' worth of AI inference over a few weeks or months.
Someone even built a website to rank these AI coding whales. The Claude Code Leaderboard lists one developer at the top who's burned through almost 11 billion tokens.
Tokens are how AI models break queries down into digestible data chunks. Industry pricing is based on how many tokens are processed. That top-ranked developer's token usage costs almost $35,000, according to this leaderboard.
That compares to the $200 a month he's been charged. Even if that's over a whole year, Anthropic would be getting about $2,400, while incurring much higher inference costs.
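The gap can be worked through with the figures quoted above. The sketch below is illustrative only: it treats "almost 11 billion tokens" and "almost $35,000" as round numbers, and the function and variable names are made up for this example rather than taken from Anthropic or the leaderboard.

```python
# Figures quoted in the article, rounded; treat them as approximations.
TOKENS_USED = 11_000_000_000      # "almost 11 billion tokens"
INFERENCE_COST_USD = 35_000       # "almost $35,000", per the leaderboard
PLAN_PRICE_USD_PER_MONTH = 200    # the Claude Code plan price cited above


def subscription_revenue(months: int) -> int:
    """Revenue from one subscriber over a number of months."""
    return PLAN_PRICE_USD_PER_MONTH * months


def shortfall(months: int) -> int:
    """Inference cost minus subscription revenue for the same period."""
    return INFERENCE_COST_USD - subscription_revenue(months)


# Blended price implied by the leaderboard numbers, in dollars per million tokens.
implied_price_per_million = INFERENCE_COST_USD / (TOKENS_USED / 1_000_000)

if __name__ == "__main__":
    print(f"Revenue over 12 months: ${subscription_revenue(12):,}")
    print(f"Shortfall over 12 months: ${shortfall(12):,}")
    print(f"Implied price: ${implied_price_per_million:.2f} per million tokens")
```

With these rounded inputs, a full year of subscription fees covers about $2,400 of a roughly $35,000 bill, leaving a shortfall of about $32,600.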
Anthropic is changing its pricing
That's clearly unsustainable, so Anthropic plans to change its pricing. The $200 a month plan will stay, but the startup will introduce weekly rate limits, starting August 28.
If users blow through these new weekly rate limits, they will have to buy additional capacity.
"We've identified extreme usage by a small number of customers that impacts capacity for our broader community," an Anthropic spokesperson told Business Insider.
The startup said it's also seen "policy violations," such as account sharing and reselling access.
"We're committed to supporting advanced use cases long-term, but need to ensure consistent performance for all developers in the meantime," the Anthropic spokesperson added.
A Swedish whale
I tracked down one of the whales near the top of the Claude Code Leaderboard.
Albert Örwall, a developer based in Sweden, said he's been using the $200 a month Claude Code subscription to build his own vibe-coding platform, along with some open-source agentic tools.
"I was probably running 3 to 4 fairly long-running tasks in parallel constantly while I was working, and that's when it really took off," he said of his Claude Code usage.
Even excluding these big projects, Örwall said his regular workflow in Claude Code likely racks up inference costs of $500 per day, under a subscription that costs only $200 a month.
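To put that daily estimate in monthly terms, here is a minimal sketch. The number of active days per month is an assumption made for illustration; the article does not say how many days Örwall works, and the function names are hypothetical.

```python
PLAN_PRICE_USD_PER_MONTH = 200   # subscription price cited in the article
DAILY_INFERENCE_COST_USD = 500   # Örwall's own rough estimate


def monthly_inference_cost(active_days: int) -> int:
    """Estimated inference cost for a month with the given number of active days."""
    return DAILY_INFERENCE_COST_USD * active_days


def cost_to_price_ratio(active_days: int) -> float:
    """How many times over the subscription price the estimated cost runs."""
    return monthly_inference_cost(active_days) / PLAN_PRICE_USD_PER_MONTH


def break_even_days() -> float:
    """Days of use at the daily estimate that the monthly fee would cover."""
    return PLAN_PRICE_USD_PER_MONTH / DAILY_INFERENCE_COST_USD


if __name__ == "__main__":
    days = 20  # assumed number of active days per month
    print(f"Estimated monthly cost: ${monthly_inference_cost(days):,}")
    print(f"Cost is {cost_to_price_ratio(days):.0f}x the subscription price")
    print(f"Fee covers {break_even_days():.1f} days of use")
```

Under the assumed 20 active days, the estimate comes to $10,000 a month, or 50 times the subscription price; the fee itself covers less than half a day at that rate.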
"So I'm guessing my workflow might not be sustainable for Anthropic," he added.
Cursor responded, too
When Anthropic's new pricing kicks in, Örwall said he'll keep the $200 a month subscription for a while to get a feel for what the weekly limits actually mean for his budget.
"I'll avoid paying anything beyond the $200 subscription," he said, noting that he can change how he writes code and develops projects to avoid breaching the new rate limits.
"The reason I originally switched from Cursor to Claude Code was because usage-based pricing became too expensive in Cursor," Örwall added.
Cursor is another popular AI coding service, which often uses Anthropic's AI models as the underlying intelligence powering its product.
Cursor recently switched its $20 a month Pro plan from unlimited requests to a tiered system with usage-based pricing for "fast" requests, meaning users are charged extra for exceeding a certain limit.
This change, coupled with a lack of clear communication, caused confusion and frustration among some users who expected unlimited usage.
Cursor announced the initial change in mid-June. Then it updated with more details about 2 weeks later, and then again in early July.
"New models can spend more tokens per request on longer-horizon tasks," the startup wrote in a blog post, apologizing for surprising users with unexpected new bills.
"Though most users' costs have stayed fairly constant, the hardest requests cost an order of magnitude more than simple ones."
Inference costs aren't falling
The assumption across the industry has been that inference costs will drop dramatically, making these AI coding services more financially viable.
However, in practice, this hasn't happened thus far. Instead, when a new top AI model comes out, all the AI coding services integrate it — along with its higher prices.
"This is the first faulty pillar of the 'costs will drop' strategy," Ethan Ding, CEO of startup TextQL, wrote in a recent blog. "Demand exists for 'the best language model,' period. And the best model always costs about the same, because that's what the edge of inference costs today."
Developers and other AI users usually want the best, not last month's leading intelligence.
"Nobody opens Claude and thinks, 'you know what? let me use the shitty version to save my boss some money.' We're cognitively greedy creatures," Ding wrote. "We want the best brain we can get."
Even when inference costs do fall, the rise of agentic AI workflows means that developers set up longer, automated projects that generate a lot more tokens.
If a project uses 100 million tokens, rather than 1 million, the initiative's cost remains high, even if per-token prices may have fallen.
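A rough illustration of that arithmetic, using the token counts above. The per-token prices are hypothetical, chosen only to show a tenfold price drop; they are not quotes from any provider.

```python
def project_cost(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of a project given its token count and a per-million-token price."""
    return tokens / 1_000_000 * usd_per_million_tokens


# Hypothetical prices: a tenfold drop from $10 to $1 per million tokens.
OLD_PRICE = 10.0
NEW_PRICE = 1.0

# 1 million tokens at the old price versus 100 million tokens at the new price.
small_project = project_cost(1_000_000, OLD_PRICE)
large_project = project_cost(100_000_000, NEW_PRICE)

if __name__ == "__main__":
    print(f"1M tokens at ${OLD_PRICE}/M: ${small_project:,.2f}")
    print(f"100M tokens at ${NEW_PRICE}/M: ${large_project:,.2f}")
    print(f"Cost multiple: {large_project / small_project:.0f}x")
```

With these made-up prices, the project that uses 100 times as many tokens still costs 10 times as much, despite the tenfold price cut.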
"A $20/month subscription cannot even support a user making a single $1 deep research run a day," Ding said. "But that's exactly what we're racing toward. Every improvement in model capability is an improvement in how much compute they can meaningfully consume."
"There's no way to offer unlimited usage in this new world under any subscription model," he added. "The math has fundamentally broken."