Hidden costs in AI deployment: Why Claude models may be 20-30% more expensive than GPT in enterprise settings

Business Mayor | 01-05-2025

It is a well-known fact that different model families can use different tokenizers. However, there has been limited analysis on how the process of 'tokenization' itself varies across these tokenizers. Do all tokenizers result in the same number of tokens for a given input text? If not, how different are the generated tokens? How significant are the differences?
In this article, we explore these questions and examine the practical implications of tokenization variability. We present a comparative story of two frontier model families: OpenAI's ChatGPT vs Anthropic's Claude. Although their advertised 'cost-per-token' figures are highly competitive, experiments reveal that Anthropic models can be 20–30% more expensive than GPT models.
As of June 2024, the pricing structure for these two advanced frontier models is highly competitive. Both Anthropic's Claude 3.5 Sonnet and OpenAI's GPT-4o have identical costs for output tokens, while Claude 3.5 Sonnet offers a 40% lower cost for input tokens.
Source: Vantage
Despite the Anthropic model's lower input token rates, we observed that the total cost of running experiments on a given set of fixed prompts was much lower with GPT-4o than with Claude 3.5 Sonnet.
Why?
The Anthropic tokenizer tends to break the same input down into more tokens than OpenAI's tokenizer does. This means that, for identical prompts, Anthropic models produce considerably more tokens than their OpenAI counterparts. As a result, while the per-token cost of Claude 3.5 Sonnet's input may be lower, the inflated token count can offset these savings, leading to higher overall costs in practical use cases.
This hidden cost stems from the way Anthropic's tokenizer encodes information, often using more tokens to represent the same content. The token count inflation has a significant impact on costs and context window utilization.
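To make the cost impact concrete, here is a minimal sketch of the arithmetic. It assumes the June 2024 list prices of $5/$15 per million input/output tokens for GPT-4o and $3/$15 for Claude 3.5 Sonnet (consistent with the 40% input discount and identical output pricing noted above), plus an illustrative 30% token overhead, the code-domain figure measured in the next section:
```python
# Assumed June-2024 list prices (USD per million tokens); verify against current pricing.
GPT4O_INPUT, GPT4O_OUTPUT = 5.00, 15.00    # GPT-4o
CLAUDE_INPUT, CLAUDE_OUTPUT = 3.00, 15.00  # Claude 3.5 Sonnet (40% cheaper input)

OVERHEAD = 1.30  # assumed ~30% more tokens for code-heavy content

# An output-heavy coding request, measured in GPT-4o tokens:
in_tokens, out_tokens = 2_000, 10_000

gpt_cost = (in_tokens * GPT4O_INPUT + out_tokens * GPT4O_OUTPUT) / 1e6
# Assuming Claude needs ~30% more tokens to represent comparable input and output:
claude_cost = (in_tokens * OVERHEAD * CLAUDE_INPUT
               + out_tokens * OVERHEAD * CLAUDE_OUTPUT) / 1e6

print(f"GPT-4o: ${gpt_cost:.3f}  Claude: ${claude_cost:.3f} "
      f"(+{100 * (claude_cost / gpt_cost - 1):.0f}%)")
# GPT-4o: $0.160  Claude: $0.203 (+27%)
```
Because output pricing is identical, the token inflation passes straight through on the output side, which is where the overall 20-30% premium comes from on output-heavy workloads.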
Domain-dependent tokenization inefficiency
Anthropic's tokenizer handles different types of domain content differently, leading to varying degrees of token-count inflation relative to OpenAI's models. The AI research community has noted similar tokenization differences. We tested our findings on three popular domains: English articles, code (Python), and math.
Domain           | GPT Tokens | Claude Tokens | % Token Overhead
English articles | 77         | 89            | ~16%
Code (Python)    | 60         | 78            | ~30%
Math             | 114        | 138           | ~21%

% token overhead of the Claude 3.5 Sonnet tokenizer relative to GPT-4o. Source: Lavanya Gupta
When comparing Claude 3.5 Sonnet to GPT-4o, the degree of tokenizer inefficiency varies significantly across content domains. For English articles, Claude's tokenizer produces approximately 16% more tokens than GPT-4o for the same input text. This overhead increases sharply with more structured or technical content: for mathematical equations, the overhead stands at 21%, and for Python code, Claude generates 30% more tokens.
This variation arises because some content types, such as technical documents and code, often contain patterns and symbols that Anthropic's tokenizer fragments into smaller pieces, leading to a higher token count. In contrast, more natural language content tends to exhibit a lower token overhead.
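Only the GPT side of this fragmentation can be inspected locally, but a quick experiment with OpenAI's tiktoken library shows how symbol-heavy text splinters into more pieces than plain prose; a minimal sketch (the sample strings are arbitrary, and the Claude-side counts would need Anthropic's counting tools):
```python
import tiktoken  # pip install tiktoken

enc = tiktoken.encoding_for_model("gpt-4o")  # resolves to o200k_base

samples = {
    "english": "The quick brown fox jumps over the lazy dog.",
    "python":  "def f(x): return [i**2 for i in range(x) if i % 2 == 0]",
    "math":    "E = mc^2; \\int_0^1 x^2 dx = 1/3",
}

for name, text in samples.items():
    ids = enc.encode(text)
    # Decode each token id individually to see the surface fragments.
    pieces = [enc.decode([t]) for t in ids]
    print(f"{name}: {len(ids)} tokens -> {pieces}")
```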
Beyond the direct implication on costs, there is also an indirect impact on context window utilization. While Anthropic models advertise a larger context window of 200K tokens, versus OpenAI's 128K tokens, the tokenizer's verbosity means the effective usable token space may be smaller for Anthropic models. Hence, there can be a meaningful gap between the 'advertised' and the 'effective' context window sizes.
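A back-of-the-envelope conversion using the domain overheads measured above illustrates how the advertised advantage shrinks (a sketch; real overhead varies by workload):
```python
# GPT-equivalent capacity of Claude's 200K window under the measured overheads.
CLAUDE_WINDOW, GPT_WINDOW = 200_000, 128_000
overheads = {"english": 1.16, "math": 1.21, "python": 1.30}  # from the table above

for domain, factor in overheads.items():
    effective = CLAUDE_WINDOW / factor
    print(f"{domain}: ~{effective / 1000:.0f}K GPT-equivalent tokens "
          f"(vs GPT-4o's {GPT_WINDOW // 1000}K)")
# english: ~172K, math: ~165K, python: ~154K
```
The window remains nominally larger than OpenAI's 128K in each case, but the headroom is considerably smaller than the raw 200K figure suggests.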
GPT models use Byte Pair Encoding (BPE), which merges frequently co-occurring character pairs to form tokens. Specifically, the latest GPT models use the open-source o200k_base tokenizer. The model-to-encoding mapping is defined in the open-source tiktoken library:
```python
{
    # reasoning
    "o1-xxx": "o200k_base",
    "o3-xxx": "o200k_base",
    # chat
    "chatgpt-4o-": "o200k_base",
    "gpt-4o-xxx": "o200k_base",          # e.g., gpt-4o-2024-05-13
    "gpt-4-xxx": "cl100k_base",          # e.g., gpt-4-0314, etc., plus gpt-4-32k
    "gpt-3.5-turbo-xxx": "cl100k_base",  # e.g., gpt-3.5-turbo-0301, -0401, etc.
}
```
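The mapping can be verified directly with the tiktoken package; a minimal example:
```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4o")
print(enc.name)                          # 'o200k_base'
print(enc.n_vocab)                       # vocabulary size
print(len(enc.encode("Hello, world!")))  # token count for a sample string
```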
Unfortunately, not much can be said about Anthropic's tokenizer, as it is not as directly and easily available as GPT's. Anthropic released its Token Counting API in December 2024; however, it was discontinued in later 2025 versions.
Latenode reports that 'Anthropic uses a unique tokenizer with only 65,000 token variations, compared to OpenAI's 100,261 token variations for GPT-4.' This Colab notebook contains Python code to analyze the tokenization differences between GPT and Claude models. Another tool, which provides an interface to several common publicly available tokenizers, validates our findings.
The ability to proactively estimate token counts (without invoking the actual model API) and budget costs is crucial for AI enterprises.
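In that spirit, here is a sketch of a pre-flight cost estimator that never calls a model API: it counts GPT tokens locally with tiktoken and approximates Claude counts by applying the domain overhead multipliers measured above (the multipliers and prices are assumptions for illustration, not exact figures):
```python
import tiktoken

PRICES = {  # assumed June-2024 USD per million input tokens
    "gpt-4o": 5.00,
    "claude-3-5-sonnet": 3.00,
}
# Approximate Claude token inflation per domain (from the measurements above).
CLAUDE_OVERHEAD = {"english": 1.16, "math": 1.21, "python": 1.30}

enc = tiktoken.encoding_for_model("gpt-4o")

def estimate_input_cost(prompt: str, domain: str = "english") -> dict:
    """Estimate input cost (USD) per provider without invoking any model API."""
    gpt_tokens = len(enc.encode(prompt))
    claude_tokens = round(gpt_tokens * CLAUDE_OVERHEAD[domain])  # rough estimate
    return {
        "gpt-4o": gpt_tokens * PRICES["gpt-4o"] / 1e6,
        "claude-3-5-sonnet": claude_tokens * PRICES["claude-3-5-sonnet"] / 1e6,
    }

print(estimate_input_cost("def add(a, b):\n    return a + b", domain="python"))
```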
Anthropic's competitive pricing comes with hidden costs: While Anthropic's Claude 3.5 Sonnet offers 40% lower input token costs compared to OpenAI's GPT-4o, this apparent cost advantage can be misleading due to differences in how input text is tokenized.
Hidden 'tokenizer inefficiency': Anthropic models are inherently more verbose. For businesses that process large volumes of text, understanding this discrepancy is crucial when evaluating the true cost of deploying models.
Domain-dependent tokenizer inefficiency: When choosing between OpenAI and Anthropic models, evaluate the nature of your input text. For natural language tasks, the cost difference may be minimal, but technical or structured domains may lead to significantly higher costs with Anthropic models.
Effective context window: Due to the verbosity of Anthropic's tokenizer, its larger advertised 200K context window may offer less effective usable space than OpenAI's 128K, leading to a potential gap between advertised and actual context windows.
Anthropic did not respond to VentureBeat's requests for comment by press time. We'll update the story if they respond.
