
Optimizing AI apps in a million-token world
In recent months, models like GPT-4.1, LLaMA 4, and DeepSeek V3 have reached context windows ranging from hundreds of thousands to millions of tokens. We're entering a phase where entire documents, threads, and histories can fit into a single prompt. It marks real progress—but it also brings new questions about how we structure, pass, and prioritize information.
WHAT IS CONTEXT SIZE (AND WHY WAS IT A CHALLENGE)?
Context size defines how much text a model can process in one pass. It is measured in tokens, small chunks of text such as words or parts of words. For years, limited context shaped the way we worked with LLMs: splitting documents, engineering recursive prompts, summarizing inputs, anything to avoid truncation.
Now, models like LLaMA 4 Scout can handle up to 10 million tokens, while DeepSeek V3 supports more than 100K and GPT-4.1 reaches 1M. With those capabilities, many of the older workarounds can be rethought or even removed.
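To make the unit concrete, here is a minimal sketch of counting tokens with the open-source tiktoken tokenizer. Tokenizers differ between model families, so treat the count as an approximation rather than an exact figure for any specific provider.

```python
# A minimal sketch: estimating how much of a context window a document consumes.
# Assumes the open-source `tiktoken` library; other models use different
# tokenizers, so the count is an approximation.
import tiktoken

def estimate_token_usage(text: str, context_window: int = 1_000_000) -> int:
    enc = tiktoken.get_encoding("cl100k_base")  # a widely used encoding
    count = len(enc.encode(text))
    print(f"{count:,} tokens, ~{count / context_window:.2%} of a {context_window:,}-token window")
    return count

estimate_token_usage("Full contracts, Slack threads, or research papers go here...")
```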
FROM BOTTLENECK TO CAPABILITY
This progress unlocks new interaction patterns. We're seeing applications that can reason and navigate across entire contracts, full Slack threads, or complex research papers. These use cases were out of reach not long ago. However, just because models can read more does not mean they automatically make better use of that data.
The paper 'Why Does the Effective Context Length of LLMs Fall Short?' examines this gap. It shows that LLMs often attend to only part of the input, especially the more recent or emphasized sections, even when the prompt is long. Another study, 'Explaining Context Length Scaling and Bounds for Language Models,' explores why increasing the window size does not always lead to better reasoning. Both pieces suggest that the problem has shifted from managing how much context a model can take to guiding how it uses that context effectively.
Think of it this way: Just because you can read every book ever written about World War I doesn't mean you truly understand it. You might scan thousands of pages, but still fail to retain the key facts, connect the events, or explain the causes and consequences with clarity.
What we pass to the model, how we organize it, and how we guide its attention are now central to performance. These are the new levers of optimization.
CONTEXT WINDOW ≠ TRAINING TOKENS
A model's ability to accept a large context does not guarantee that it has been trained to handle it well. Some models were exposed only to shorter sequences during training. That means even if they accept 1M tokens, they may not make meaningful use of all that input.
This gap affects reliability. A model might slow down, hallucinate, or misinterpret input if it is overwhelmed with too much or poorly organized data. Developers need to verify whether a model was fine-tuned for long contexts or simply adapted to accept them.
WHAT CHANGES FOR ENGINEERS
With these new capabilities, developers can move past earlier limitations. Manual chunking, token trimming, and aggressive summarization become less critical. But this does not remove the need for data prioritization.
Prompt compression, token pruning, and retrieval pipelines remain relevant. Techniques like prompt caching help reuse portions of prompts to save costs. Mixture-of-experts (MoE) models, like those used in LLaMA 4 and DeepSeek V3, optimize compute by activating only relevant components.
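As one illustration of the prompt-caching idea, the sketch below keeps the large, unchanging material (system instructions plus reference documents) in a fixed prefix and appends only the short, varying question at the end. Provider-side caches generally match on identical prefixes, so this ordering is what makes reuse possible; call_model is a hypothetical placeholder for whatever client your application uses.

```python
# A sketch of ordering a prompt so provider-side prompt caching can reuse the
# stable prefix across requests. `call_model` is a hypothetical placeholder.

STABLE_PREFIX = [
    {"role": "system", "content": "Answer strictly from the supplied documents."},
    {"role": "user", "content": "Reference documents:\n" + "\n---\n".join([
        "Document A: full contract text ...",
        "Document B: related amendment ...",
    ])},
]

def build_messages(question: str) -> list[dict]:
    # Keep the bulky, unchanging content first and byte-for-byte identical
    # across calls; only the short question at the end varies.
    return STABLE_PREFIX + [{"role": "user", "content": question}]

def call_model(messages: list[dict]) -> str:
    raise NotImplementedError("Replace with your provider's chat API call.")

# Both of these calls share the same cacheable prefix; only the tail differs.
# call_model(build_messages("Which clauses cover termination?"))
# call_model(build_messages("Summarize the payment terms."))
```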
Engineers also need to track which parts of a prompt the model actually uses. Output quality alone does not guarantee effective context usage. Monitoring token relevance, attention distribution, and consistency over long prompts introduces new challenges that go beyond latency and throughput.
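One lightweight way to probe this is a needle-in-a-haystack check, sketched below: plant a known fact at different depths of a long prompt and measure whether the model still retrieves it. ask_model is a hypothetical stand-in for your inference call, and the filler text is purely illustrative.

```python
# A sketch of a needle-in-a-haystack probe for long-context usage.
# `ask_model` is a hypothetical placeholder for your inference call.
FILLER = "Background paragraph that pads the context with unrelated detail. " * 2000
NEEDLE = "The project codename is BLUE HERON."
QUESTION = "What is the project codename?"

def build_probe(depth: float) -> str:
    """Insert the needle at a relative depth (0.0 = start, 1.0 = end)."""
    cut = int(len(FILLER) * depth)
    return FILLER[:cut] + "\n" + NEEDLE + "\n" + FILLER[cut:] + "\n\n" + QUESTION

def ask_model(prompt: str) -> str:
    raise NotImplementedError("Replace with your provider's completion call.")

def run_probe() -> None:
    for depth in (0.0, 0.25, 0.5, 0.75, 1.0):
        answer = ask_model(build_probe(depth))
        print(f"depth={depth:.2f} recalled={'BLUE HERON' in answer.upper()}")
```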
IT IS ALSO A PRODUCT AND UX ISSUE
For end users, the shift to larger contexts introduces more freedom—and more ways to misuse the system. Many users drop long threads, reports, or chat logs into a prompt and expect perfect answers. They often do not realize that more data can sometimes cloud the model's reasoning.
Product design must help users focus. Interfaces should clarify what is helpful to include and what is not. This might mean offering previews of token usage, suggestions to refine inputs, or warnings when the prompt is too broad. Prompt design is no longer just a backend task, but rather part of the user journey.
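As a small illustration of that kind of guardrail, the sketch below previews a prompt's token count before submission and nudges the user to trim it when it crosses a soft budget. The threshold is an arbitrary illustrative value, and the tiktoken count is only an approximation of what a given provider will see.

```python
# A sketch of a pre-submit preview that warns when a prompt looks too broad.
# The soft budget is an arbitrary illustrative value, not a provider limit.
import tiktoken

def preview_prompt(text: str, soft_budget: int = 200_000) -> str:
    count = len(tiktoken.get_encoding("cl100k_base").encode(text))
    if count > soft_budget:
        return (f"This prompt is about {count:,} tokens. Consider removing "
                "unrelated threads or attachments so the model can focus.")
    return f"About {count:,} tokens. This looks focused enough to submit."
```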
THE ROAD AHEAD: STRUCTURE OVER SIZE
Larger context windows open important doors. We can now build systems that follow extended narratives, compare multiple documents, or process timelines that were previously out of reach.
But clarity still matters more than capacity. Models need structure to interpret, not just volume to consume. This changes how we design systems, how we shape user input, and how we evaluate performance.
The goal is not to give the model everything. It is to give it the right things, in the right order, with the right signals. That is the foundation of the next phase of progress in AI systems.
