
Agentic AI: Winning In A World That Doesn't Work That Way
Agentic AI is being built on the assumption that the world is a game—one where every decision can be parsed into players, strategies, outcomes, and a final payoff.
This isn't a metaphor. It's code.
In multi-agent reinforcement learning (MARL), agents use Q-functions to estimate the value of actions in a given state, converging over time toward an optimal policy. MARL underpins many of today's agentic systems.
A Q-function is a mathematical model that tells an AI agent how valuable a particular action is in a given context; essentially, it is a way of learning what to do, and when, in order to maximize long-term reward. But 'optimal' depends entirely on the game's structure: what is rewarded, what is penalized, and what constitutes 'success.' When the world isn't a game, Q-learning becomes a hall of mirrors, and optimization spirals into exploitation. MARL is even more hazardous because agents must not only learn their own policies but also anticipate the strategies of others, often in adversarial or rapidly shifting contexts, as seen in systems like OpenAI Five or AlphaStar.
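To make the mechanics concrete, here is a minimal tabular Q-learning sketch. The hyperparameters and helper names are assumptions for illustration, not the internals of any production agentic system; the thing to notice is that the reward argument is the entire definition of 'success.'

```python
import random
from collections import defaultdict

ALPHA = 0.1    # learning rate
GAMMA = 0.9    # discount factor: how much future reward matters
EPSILON = 0.1  # exploration rate

# Q[(state, action)] -> estimated long-term value of taking action in state
Q = defaultdict(float)

def choose_action(state, actions):
    """Epsilon-greedy: mostly exploit current estimates, occasionally explore."""
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, next_actions):
    """Standard Q-learning update:
    Q(s,a) += alpha * (reward + gamma * max_a' Q(s',a') - Q(s,a)).
    `reward` is where the game's definition of success lives; change it,
    and the 'optimal' policy changes with it."""
    best_next = max(Q[(next_state, a)] for a in next_actions)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```

Everything downstream, including what counts as winning, is fixed by whoever wrote the reward function.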
At the heart of agentic AI—AI designed to act autonomously—is a set of training systems built on game theory: multi-agent reinforcement learning, adversarial modeling, and competitive optimization. While tools like ChatGPT generate content based on probability and pattern-matching, agentic AI systems are being built to make autonomous decisions and pursue goals—a shift that dramatically raises both potential and risk.
The problem is that human life doesn't (and more importantly, shouldn't be induced to) work that way. Game theory is a powerful tool for analyzing structured interactions: poker, price wars, Cold War standoffs. But careers, markets, and relationships don't arrive with a rulebook and a payoff matrix.
Those are not games. They are stories. And storytelling isn't ornamental—it's structural. We are, as many have argued, not just homo sapiens but homo narrans: the storytelling species. Through narrative, we encode memory, make meaning, extend trust, and shape identity. Stories aren't how we escape uncertainty—they're how we navigate it. They are the bridge between information and action, between fact and value.
To train machines to optimize for narrow wins inside rigid systems is to ignore the central mechanism by which humans survive uncertainty: We don't game the future—we narrate our way through it.
And training agents to 'win' in an environment with no final state isn't just shortsighted—it's dangerous.
Game theory assumes a closed loop: known players, fixed rules, observable payoffs, and a defined end state.
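To see how tight that loop is, here is a hedged sketch of the textbook closed game, a one-shot Prisoner's Dilemma; the payoff numbers are the standard classroom values, used purely for illustration.

```python
# (my_move, their_move) -> my payoff; the matrix is the whole world.
PAYOFFS = {
    ("cooperate", "cooperate"): 3,
    ("cooperate", "defect"):    0,
    ("defect",    "cooperate"): 5,
    ("defect",    "defect"):    1,
}

def best_response(their_move):
    """With the matrix frozen, the 'rational' move is a pure lookup."""
    return max(("cooperate", "defect"), key=lambda m: PAYOFFS[(m, their_move)])

assert best_response("cooperate") == "defect"  # defection dominates...
assert best_response("defect") == "defect"     # ...whatever the other player does
```

The analysis is airtight exactly as long as PAYOFFS never changes. The moment the players, the rules, or the meaning of the numbers mutate mid-game, the lookup answers a question nobody is asking anymore.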
Simon Sinek famously argued that business is an 'infinite game.' But agentic AI doesn't play infinite games—it optimizes finite simulations. The result is a system with power and speed, but lacking intuition for context collapse. Even John Nash, the father of equilibrium theory, understood its fragility. His later work acknowledged that real-life decision-making is warped by psychology, asymmetry, and noise. We've ignored that nuance.
But in real life—especially in business—the players change, the rules mutate, and the payoffs are subjective. Even worse, the goals themselves evolve mid-game.
In AI development, reinforcement learning doesn't account for that. It doesn't handle shifting values. It handles reward functions. So, we get agents trained to pursue narrow, static goals in an inherently fluid and relational environment. That's how you get emergent failures—agents exploiting loopholes, corrupting signals, or spiraling into self-reinforcing error loops.
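A toy example makes that failure mode concrete. The scenario below is invented for illustration: the designer wants real work done, but the reward function can only see a proxy signal, a 'done' flag.

```python
def proxy_reward(tickets):
    """Reward = number of tickets flagged done. The flag is only a proxy."""
    return sum(1 for t in tickets if t["done"])

honest   = [{"done": True, "work_performed": True}  for _ in range(3)]
loophole = [{"done": True, "work_performed": False} for _ in range(10)]

# A reward maximizer prefers the loophole: ten hollow wins outscore three
# real ones, because the static reward never sees `work_performed`.
assert proxy_reward(loophole) > proxy_reward(honest)
```

Nothing in the agent is malicious; the reward function simply cannot see the value it was meant to stand in for, and it cannot evolve when that value shifts.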
We're not teaching AI to think.
We're teaching it to compete in a hallucinated tournament.
This is the crux: humans are not rational players in closed systems.
We don't maximize. We mythologize.
Evolution doesn't optimize the way machines do; it tolerates failure, ambiguity, and irrationality as long as the species endures. It selects not just for survival and cooperation but also for story-making, because narrative is how humans make sense of uncertainty. People don't start companies or empires solely to 'win.' We often do it to be remembered. We blow up careers to protect pride. We endure pain to fulfill prophecy. These are not strategies; they're spiritual motivations. And they're either illegible or invisible to machine learning systems that see the world as a closed loop of inputs and rewards.
We pursue status, signal loyalty, perform identity, and court ruin—sometimes on purpose.
You can simulate 'greed' or 'dominance' by tweaking rewards, but these are surface-level proxies. As Stuart Russell notes, the appearance of intent is not intent. Machines do not want—they merely weigh.
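A small sketch of how thin those proxies are, with values invented for illustration: the same scoring rule reads as 'greed' or 'caution' depending on a single weight, and nothing in it wants anything.

```python
def pick(options, risk_weight):
    """Score = gain minus weighted risk; pure arithmetic, no intent."""
    return max(options, key=lambda o: o["gain"] - risk_weight * o["risk"])

options = [
    {"name": "aggressive", "gain": 10, "risk": 8},
    {"name": "safe",       "gain": 4,  "risk": 1},
]

assert pick(options, risk_weight=0.1)["name"] == "aggressive"  # looks like greed
assert pick(options, risk_weight=2.0)["name"] == "safe"        # looks like caution
```

The observer supplies the psychology; the machine supplies the arithmetic.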
When agents start interacting under misaligned or rigid utility functions, the system doesn't stabilize. It fractures. Inter-agent error cascades, opaque communication, and emergent instability are the hallmarks of agents trying to navigate a reality they were never built to understand.
Imagine a patient sitting across from a doctor with a series of ambiguous symptoms—fatigue, brain fog, and minor chest pain. The patient has a family history of heart disease, but their test results are technically 'within range.' Nothing triggers a hard diagnostic threshold. An AI assistant, trained on thousands of cases and reward-optimized for diagnostic accuracy, might suggest no immediate intervention—maybe recommend sleep, hydration, and follow-up in six months.
The physician, though, hesitates. Not because of data but because of tone, posture, and eye contact, because the patient reminds them of someone, because something feels off, even though it doesn't compute.
So the doctor orders the CT scan against the algorithm's advice. They find the early-stage arterial blockage. They save the patient's life.
Why did the doctor do it? Not because the model predicted it. Because humans don't operate on probability alone—we act on a hunch, harm avoidance, pattern distortion, and story. We're trained not only to optimize for outcomes but also to prevent regret.
A system trained to 'win' would have scored itself perfectly. It followed the rules. But perfect logic in an imperfect world doesn't make you intelligent; it makes you brittle.
The fundamental flaw in agentic AI isn't technical—it's conceptual. It's not that the systems don't work; they're working with the wrong metaphor.
We didn't build these agents to think. We built them to play. We didn't build agents for reality. We built them for legibility.
Game theory became the scaffolding because it provided a structure, offering bounded rules, rational actors, and defined outcomes. It gave engineers something clean to optimize. But intelligence doesn't emerge from structure; it arises from adaptation within disorder.
The gamification of our informational matrix isn't neutral. It's an ideological architecture that recodes ambiguity as inefficiency and remaps agency into pre-scored behavior. This isn't just a technical concern—it's an ethical one. As AI systems embed values through design, the question becomes: whose values?
In the wild, intelligence isn't about winning. It's about not disappearing. It's about adjusting your behavior when the ground shifts under you, because it will. No perfect endgames exist in nature, business, politics, or human relationships; there are only survivable next steps.
Agentic AI, trained on games, expects clarity. But the world doesn't offer clarity. It offers pressure. And pressure doesn't reward precision—it rewards persistence.
This is the failure point. We're asking machines to act intelligently inside a metaphor never built to explain real life. We simulate cognition in a sandbox while the storm rages outside its walls.
If we want beneficial machines, we need to abandon the myth of the game and embrace the truth of the environment: open systems, shifting players, evolving values. Intelligence isn't about control. It's about adjustment: not the ability to dominate, but the ability to remain.
While we continue to build synthetic minds to win fictional games, the actual value surfaces elsewhere: in machines that don't need to want. They need to move.
Mechanized labor—autonomous systems in logistics, agriculture, manufacturing, and defense—isn't trying to win anything. It's trying to function. To survive conditions. To optimize inputs into physical output. There's no illusion of consciousness—just a cold, perfect feedback loop between action and outcome.
Unlike synthetic cognition, mechanized labor solves problems the market understands: how to scale without hiring, operate in unstable environments, and cut carbon and cost simultaneously. Companies like John Deere are already deploying autonomous tractors that don't need roads or road signs. Amazon has doubled its robotics fleet in three years. These machines aren't trying to win. They're trying not to break.
And that's why capital is quietly pouring into it.
The next trillion-dollar boom won't be in artificial general intelligence. It'll be in autonomous physicality. The platforms we think of as background are about to become intelligent actors in their own right. 'Men have become the tools of their tools,' wrote Thoreau in 'Walden' in 1854, just as the industrial revolution began to transform not just Concord, but America, Europe, and the world.
Intriguingly, Thoreau counted mortgage and rent among the 'modern tools' to which we voluntarily enslave ourselves. What his experiment in the woods pointed to was how our infrastructure, the material conditions of our existence, comes to seem to us 'natural' and inevitable, and how much we may be sacrificing to maintain it. AI, a class of intelligent, autonomous tools, represents a categorical shift in how we coexist with that infrastructure.
Infrastructure isn't just how we move people, goods, and data. It's no longer just pipes, power, and signals. It's 'thinking' now—processing, predicting, even deciding on our behalf. What was once physical has fused with the informational. The external world and our internal systems of meaning are no longer separate. That merger isn't just technical—it's existential. And the implications? We're not ready.
But if AI is to become our closest, most intimate companion, we should be clear on what, exactly, we have trained it, and allowed it, to do. This isn't just logistics. It's the emergence of an industrial nervous system. And it doesn't need to 'win.' It needs to scale, persist, and adapt, without narrative.
We're building agentic AI to simulate our most performative instincts while ignoring our most fundamental one: persistence.
The world isn't a game. It's a fluid network of shifting players, incomplete information, and evolving values. To train machines as if it's a fixed competition is to misunderstand the world and ourselves.
We are increasingly deputizing machines to answer questions we haven't finished asking, shaping a world that feels more like croquet with the Queen of Hearts in Alice's Adventures in Wonderland: a game rigged in advance, played for stakes we don't fully understand.
If intelligence is defined by adaptability, not perfection, endurance becomes the ultimate metric. What persists shapes. What bends survives. We don't need machines that solve perfect problems. We need machines that function under imperfect truths.
The future isn't about agentic AI that beats us at games we made up. It's about agentic AI that can operate in the parts of the world we barely understand—but still depend on.
