Google DeepMind CEO says one flaw is holding AI back from reaching full AGI

Business Insider, 9 hours ago
The one thing keeping AI from full AGI? Consistency, said Google DeepMind CEO Demis Hassabis.
Hassabis said on an episode of the "Google for Developers" podcast published Tuesday that advanced models like Google's Gemini still stumble over problems most schoolkids could solve.
"It shouldn't be that easy for the average person to just find a trivial flaw in the system," he said.
He pointed to Gemini models enhanced with DeepThink — a reasoning-boosting technique — that can win gold medals at the International Mathematical Olympiad, the world's most prestigious math competition.
But those same systems can "still make simple mistakes in high school maths," he said, calling them "uneven intelligences" or "jagged intelligences."
"Some dimensions, they're really good; other dimensions, their weaknesses can be exposed quite easily," he added.
Hassabis's position aligns with that of Google CEO Sundar Pichai, who has dubbed the current stage of development "AJI," or artificial jagged intelligence. Pichai used the term on an episode of Lex Fridman's podcast that aired in June to describe systems that excel in some areas but fail in others.
Hassabis said solving AI's issues with inconsistency will take more than scaling up data and computing. "Some missing capabilities in reasoning and planning in memory" still need to be cracked, he added.
He said the industry also needs better testing and "new, harder benchmarks" to determine precisely what the models excel at, and what they don't.
Hassabis and Google did not respond to a request for comment from Business Insider.
Big Tech hasn't cracked AGI
Big Tech players like Google and OpenAI are working toward achieving AGI, a theoretical threshold where AI can reason like humans.
Hassabis said in April that AGI will arrive "in the next five to 10 years."
AI systems remain prone to hallucinations, misinformation, and basic errors.
OpenAI CEO Sam Altman had a similar take ahead of last week's launch of GPT-5. While calling his firm's model a significant advancement, he told reporters it still falls short of true AGI.
"This is clearly a model that is generally intelligent, although I think in the way that most of us define AGI, we're still missing something quite important, or many things quite important," Altman said during a press call on Wednesday before the release of GPT-5.
Altman added that one of those missing elements is the model's ability to learn independently.
"One big one is, you know, this is not a model that continuously learns as it's deployed from the new things it finds, which is something that to me feels like AGI. But the level of intelligence here, the level of capability, it feels like a huge improvement," he said.
Related Articles

AI Coding Agents: Driving The Next Evolution In Software Development
Forbes

Vikas Mendhe is a solution architect and digital transformation expert specializing in API-driven solutions in financial technology.
As artificial intelligence continues to reshape industries, one of the most significant innovations in the software world is the rise of coding agents. They are reshaping how code is written, tested and maintained, marking a new era in software development.
What Are Coding Agents?
Coding agents are intelligent systems powered by large language models that write, debug and optimize code. They generate APIs, refactor legacy systems, write tests and even build apps with minimal input. Popular tools include GitHub Copilot, Amazon CodeWhisperer and Tabnine. New-generation assistants such as Cursor, Windsurf (recently acquired by OpenAI) and Cline focus on deeper IDE integration, context retention and developer autonomy.
Industry Adoption
The adoption of coding agents is gaining momentum across sectors. Tech companies are embedding coding assistants into their workflows, startups are exploring autonomous agents like AutoGPT and Devin for rapid prototyping and governments are integrating them cautiously for tasks like data transformation, compliance automation and internal tool development. While accuracy and oversight concerns remain, the shift toward AI-assisted development is well underway.
Language-Specific Strengths Of Popular Coding Agents
As coding agents continue to evolve, developers often look for tools that best support the languages they work in.
• GitHub Copilot thrives in Python, JavaScript and TypeScript, with robust IDE integration.
• Amazon CodeWhisperer specializes in Java, Python and JavaScript, featuring AWS-native tools and cloud focus.
• Cursor excels in TypeScript and Python, with built-in memory and pair programming.
• Tabnine supports Java, Python, C++ and Go with offline capability and customization.
• Claude Code optimizes Shell, Python and Bash for terminal-based tasks.
• Devin, a Python-based agent, enables complex, multi-step, end-to-end coding automation.
Real-World Case Studies
Let's just take a look at GitHub Copilot's applications in the real world. ANZ Bank's 2024 trial of GitHub Copilot showed engineers completing tasks 42% faster with improved code quality. Accenture's enterprise study found Copilot users coding 55% faster, with 90% reporting higher fulfillment. And a 2025 ZoomInfo case study involving over 400 developers reported a 33% code acceptance rate and 72% satisfaction. These findings show coding agents reduce repetitive work and free developers for higher-value tasks.
Impact On Software Development
Coding agents could transform software development from end to end. For developers, they act as smart copilots, automating repetitive tasks and simplifying complex workflows. Businesses gain faster delivery, lower costs and greater agility, turning ideas into prototypes in days instead of weeks. These tools also democratize development: Non-coders can build apps using natural language, and junior developers can produce better code with minimal oversight.
Educational studies confirm this potential. AI code completion tools enhance student productivity and engagement while preserving problem-solving and conceptual learning. Programs such as the Stanford Institute for Human-Centered AI are exploring how such tools support computer science education at scale.
Behind The Scenes Of Coding Agents
Most coding agents are built on transformer-based LLMs such as OpenAI's Codex and GPT-4. Popular tools like GitHub Copilot and Amazon CodeWhisperer operate through IDE plugins, sending prompts to remote model APIs. GPT-4o mini supports a 128K token context window, enabling broader file-level reasoning. Claude 3.7 Sonnet offers 200K tokens for extended reasoning workflows. Gemini 1.5 Pro surpasses both with a 2M token context, ideal for workflows spanning entire codebases.
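The context-window figures quoted above can be turned into a rough fit check. The sketch below is a minimal illustration, not any vendor's API: the window sizes are the ones cited in the article (reading K as 1,000 and M as 1,000,000), while the `estimate_tokens` and `models_that_fit` helpers and the 4-characters-per-token ratio are assumptions made for illustration. Real tokenizers will give different counts.

```python
# Context window sizes as quoted in the article, in tokens.
CONTEXT_WINDOWS = {
    "GPT-4o mini": 128_000,
    "Claude 3.7 Sonnet": 200_000,
    "Gemini 1.5 Pro": 2_000_000,
}

# Assumption: roughly 4 characters per token. Real tokenizers vary by model and language.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Hypothetical helper: crude token estimate from character count."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def models_that_fit(files: dict[str, str], reserve: int = 0) -> list[str]:
    """Return the models whose window holds all files plus `reserve` tokens for the reply."""
    total = sum(estimate_tokens(body) for body in files.values()) + reserve
    return [name for name, window in CONTEXT_WINDOWS.items() if total <= window]


if __name__ == "__main__":
    # Hypothetical codebase: file contents stand in for real source files.
    codebase = {"app.py": "x" * 600_000, "utils.py": "y" * 40_000}
    print(models_that_fit(codebase))
```

With these made-up file sizes the estimate comes to 160,000 tokens, which exceeds the smallest window but fits the two larger ones. Passing a `reserve` leaves headroom for the model's own output.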
More autonomous agents, such as AutoGPT and Devin, use frameworks like LangChain to chain prompts, memory and shell commands, completing multi-step engineering tasks with minimal human input.
Terminal-Based Coding Agents
In parallel, new terminal-based coding agents are emerging to support command-line workflows for professional developers. Tools like Claude Code, Codex CLI and Gemini CLI bring AI-powered development directly into the terminal environment, enabling agents to execute commands, write scripts and interact with live file systems, all while preserving developer autonomy.
Coding Agents As A Service
Despite advances, coding agents can still produce insecure or low-quality code. Safeguards like validation mechanisms and inline linting help, but human oversight remains essential. Rigorous testing, linting and code reviews should be part of every deployment pipeline.
Code Quality, Security And The Role Of Supervision
Despite advances, coding agents still generate insecure code and lack deep understanding of intent. Recent advancements have introduced better safeguards, validation mechanisms and inline linting. However, ongoing oversight remains essential. This underscores the need for rigorous testing, linting and human code review pipelines before production deployment.
Getting Started With AI Coding Agents
Before adopting AI coding agents, focus on clear, high-value use cases and choose tools suited to those needs instead of automating everything. Keep humans in the loop by ensuring AI-generated code undergoes rigorous testing, security scans and peer reviews. Research shows nearly half of developers don't fully trust AI output and often spend extra time debugging it. Be mindful of data privacy, intellectual property and licensing rules to avoid compliance issues, and set governance policies to prevent security blind spots and vendor lock-in.
To mitigate common pitfalls (such as inaccurate code, scope creep, security risks and hidden costs), start with structured pilot programs that have measurable outcomes. Enterprise case studies show that successful rollouts often begin with controlled experiments, formal risk assessments and well-defined change management plans. Strong guardrails, clear policies and an ongoing review process help organizations capture productivity gains while maintaining quality and security.
Conclusion
Coding agents are not meant to replace human developers; they are tools that help make their work faster and easier. As more companies start using them, it's important to find the right mix between automation and human control. When used responsibly, coding agents can help teams work more efficiently, come up with new ideas and change the way software is built in the AI era. Everyone, not just developers, should understand what coding agents can and can't do, especially those shaping the future.
Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.

How Will You Be Found When AI Replaces Google Search?
Forbes

Zac Brandenberg is Co-Founder and CEO of DRINKS, a leading AI-powered SaaS platform transforming the U.S. alcohol market.
AI platforms are reshaping how consumers discover content, products and brands and, in the process, reshaping how that content is surfaced. The brands that understand this shift now will be positioned to dominate tomorrow's discovery landscape.
An analysis of millions of AI citations reveals that each platform (ChatGPT, Perplexity and Google AI Overviews) operates with fundamentally different algorithms than traditional search engines. For example, Reddit drives 47% of Perplexity citations, whereas Wikipedia commands 48% of ChatGPT references.
The Authority Game Has Changed
Traditional search prioritized backlinks and domain authority. AI search prioritizes clarity, context and community validation. So, a single mention on Reddit would carry more weight in Perplexity than months of link-building campaigns.
ChatGPT is often influenced by first-page results (regardless of engine) because that's what's most available. Perplexity loves community content (like the aforementioned Reddit). Google AI Overviews balances traditional authority with user-generated content from platforms like Quora and LinkedIn.
So, where do brand leaders start? Most e-commerce leaders don't know where their brands appear in AI search results. Start with a simple audit across all major platforms, like ChatGPT, Perplexity and Google's AI Overviews. Document which sources get cited and how your brand is positioned.
Pay attention to citation patterns. If competitors dominate Wikipedia mentions, you need a presence there. If Reddit discussions drive industry conversations, deploy community managers authentically.
Each AI platform rewards different content strategies. Generic optimization won't work.
• For ChatGPT Success: Focus on neutral, reference-style content. Build a comprehensive Wikipedia presence. Get featured in established business publications like Forbes and TechCrunch.
When ChatGPT searches the web, it favors Bing results, so optimize there too. It's important to note that when ChatGPT does this, it uses Bing as a gateway, but its output is more curated than a basic list of search results.
• For Perplexity Visibility: Create short-form video content and engage authentically in relevant Reddit communities. Perplexity values Yahoo and MarketWatch, making expert commentary on these platforms valuable.
• For Google AI Overviews: Develop thought leadership on LinkedIn and engage strategically with Quora discussions. Balance authoritative content with community insights.
Wineries and alcoholic beverage producers face the challenges of most industries, often with increased intensity due to a hyper-competitive traditional search environment. Low discoverability, hyper-competition with distribution channels and massive competitive saturation in their categories make winning the search war challenging. As a result, moving now to address the AI-search opportunity is even more important.
Successful wineries now post harvest updates on r/wine and create YouTube Shorts showing vineyard operations. These authentic community contributions get cited when users search "sustainable winemaking 2025."
Lifestyle brands selling wine can engage in broader communities like r/fashion or r/gifting, reaching audiences that traditional wineries never access. A clothing brand discussing "wine and style pairings" captures citations in lifestyle search queries.
AI systems extract information differently than human readers do. Complex, creative language that performs well in traditional content marketing can hurt AI visibility. Structure content for immediate comprehension. Use clear statements, numbered lists and quotable information. AI platforms need content they can easily parse and excerpt.
We restructured our product descriptions and case studies using this approach.
Instead of creative wine metaphors, we use clear, factual statements about our technology capabilities. This improved both AI citations and customer understanding.
For a winery, this might mean instead of poetic descriptions like "sun-kissed grapes dancing in morning dew," writing "estate-grown Pinot Noir, 14.2% alcohol, aged 18 months in French oak barrels." AI platforms can extract and cite specific details.
The Zero-Click Future
AI-generated summaries are creating more zero-click searches, where users get answers without visiting websites. This doesn't eliminate the need for visibility; it makes citation more valuable than clicks. Being cited establishes authority even when users don't visit your site.
So, marketing budgets should shift from link-building to community engagement and structured content creation. Invest in Reddit community management, Wikipedia presence and platform-specific content optimization. Double down on short-form video.
The convergence is happening fast. Traditional search engines are integrating AI features while pure AI platforms gain market share. A study by Semrush projects that AI search traffic will exceed traditional search engine usage by 2028, a significant shift for industries like technology, where discovery drives business development.
The brands that adapt quickly to these new discovery rules will be the winners in an AI-first search world.
Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.
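The audit step described earlier in this article (documenting which sources each AI platform cites) lends itself to a simple tally. The sketch below is an illustration under stated assumptions: the records are made-up examples of the kind a team might collect by hand, and `citation_share` is a hypothetical helper, not part of any platform's API.

```python
from collections import Counter, defaultdict

# Hypothetical audit records: (platform, cited source) pairs gathered manually.
RECORDS = [
    ("Perplexity", "reddit.com"),
    ("Perplexity", "reddit.com"),
    ("Perplexity", "youtube.com"),
    ("ChatGPT", "wikipedia.org"),
    ("ChatGPT", "forbes.com"),
    ("ChatGPT", "wikipedia.org"),
    ("ChatGPT", "wikipedia.org"),
]


def citation_share(records):
    """Return {platform: {source: share of that platform's citations}}."""
    counts = defaultdict(Counter)
    for platform, source in records:
        counts[platform][source] += 1
    return {
        platform: {src: n / sum(c.values()) for src, n in c.items()}
        for platform, c in counts.items()
    }


if __name__ == "__main__":
    for platform, shares in citation_share(RECORDS).items():
        # Most-cited source first for each platform.
        top = max(shares, key=shares.get)
        print(f"{platform}: top source {top} ({shares[top]:.0%})")
```

Keeping the tally per platform, instead of pooling all citations, matches the article's point that each platform favors different sources.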

How AI Can Help Tackle Collective Decision-Making
Harvard Business Review

Collective decision-making is hardly a perfect science. Broken processes, data overload, information asymmetry, and other inequities only compound the challenges that come from large, disparate factions with different goals trying to work together. And the tools that often help with decision-making (data analysis, scenario planning, decision trees, and so on) can falter in the face of the scale and complexity of the biggest problems that groups and leaders face.
This is where AI can help, and is helping. With its ability to analyze vast troves of data about the status quo, understand group preferences, run sophisticated simulations to evaluate hundreds of future possibilities against those preferences, and facilitate consensus-building among participants, AI can be a powerful tool for all leaders facing complex decisions, especially those that must be made collaboratively.
One field already taking advantage of AI's collective decision-making support is city planning. Three years ago, we started working with the United States Conference of Mayors to understand how AI is helping cities solve their most pressing challenges. Along the way, we studied the story of the German city of Hamburg, which has addressed a housing crisis exacerbated by an influx of refugees. In 2016, Hamburg partnered with MIT Media Lab, the creator of an AI platform called CityScope. The platform allows urban planners to collect and digest the needs and preferences of swaths of residents, simulate hundreds of building scenarios, identify hidden opportunities, and find common ground among conflicting factions.
By demonstrating how CityScope is working in Hamburg, we hope to show leaders across governments, nonprofits, universities, and corporations how they can harness data and AI to democratize and improve their decision-making processes and outcomes.
The Crisis in Hamburg
In 2016, Germany decided to welcome 1 million refugees from the Middle East, and Hamburg was tasked with finding housing for an anticipated 80,000 families in a city of under 2 million people. At the time, the city had been stuck in unproductive conversations about zoning laws for decades, struggling to build enough houses for its own residents.
Three challenges tend to come to the fore in situations of collective decision-making, and Hamburg was no exception:
• Processes and incentives are broken. The traditional process to get things done (in this case, to get new housing built) involves dozens of steps and institutions, each with its own procedural logic and internal culture. A single technical impropriety can delay a project by months, if not years. Stakeholders have no incentive to come together. In Hamburg, the city struggled to get anything built because each project required the approval of a dozen bureaucracies, often sclerotic and opaque.
• Information is increasingly abundant and isn't equally distributed. Every decision (in this case, about specific developments or zoning laws) involves vast amounts of information across domains, from resident preferences and technical documents to traffic and usage metrics. Furthermore, processes are often expressed in long, detailed, technical documents that the average person cannot be expected to understand. Those with more resources have the time, money, and expertise to get the information they need to form an educated opinion while other community members do not.
• Other inequities. Dominant players possess a wide array of tools to block transformations. In this case, property owners can stop urban developments with historic preservation regulations, minimum lot size requirements, height restrictions, etc. In Hamburg, the only people who participated in decision-making about housing and zoning were wealthier, older homeowners.
Successive mayors had launched a few outreach campaigns to get other parts of the community engaged in zoning debates, but none had gained traction.
How CityScope Helped
AI can help to solve these challenges and improve the way that groups make decisions together. Ariel Noyman, one of the key engineers behind CityScope, told us that his team designed the platform to fulfill four key functions:
• Insight: Building a dynamic model of social, economic, and environmental conditions through comprehensive data collection, an environmental scan, transaction analytics, and sentiment analysis; visualization with feedback.
• Prediction: Identifying needs and simulating the impact of alternative interventions by evaluating thousands of 'what-if' scenarios.
• Transformation: Iterating possible interventions into validated paths of action.
• Consensus: Engaging stakeholders in a shared, facilitated decision-making process to reach a unified vision of the future.
Here's how these functions played out in the process of working with residents and planners in Hamburg to move the needle forward.
The first step involved gathering as much data as possible. CityScope drew on data about housing and zoning laws, but also economic development, purchasing patterns, city-wide events and amenities, transportation and infrastructure, employment opportunities, demographic diversity, environmental impact, safety, and more. It also administered surveys to residents to gather their preferences.
To deliver the first function, insight, the platform then correlated the core dimensions of housing (density and diversity of people) with performance indicators such as energy use, safety, resident preferences, and so on. With that data, the platform analyzed the relationship between housing and quality of life in the status quo.
Then the platform went on to make predictions about current trends and hypothetical transformations, identifying constraints that might be valuable to alter along the way.
CityScope highlighted the systematic underuse of commercial properties, for instance. It also highlighted the areas where public services were most likely to be strained and those with the most capacity to welcome new residents.
Once the analysis was done, CityScope displayed the results in a simple, easy-to-understand diagram to help citizens and other participants to easily see how potential changes would affect key performance indicators that reflect the common priorities of the city's residents.
The labels on the vertical bars indicate that the community cared about environmental impact, energy performance, infrastructure, innovation, and overall livability (each of these indicators aggregated hundreds of metrics). The height of the fill of each bar indicates the city's current performance on each priority (higher is better), and the horizontal lines on each bar indicate the performance for each of these priorities for a given future scenario. The bars with red fill indicate areas that would change for the better; those that are green show the city has already met the targets.
Through this chart, CityScope helps residents understand their situation (insight again), extrapolate current trends (prediction again), evaluate possible alternatives (transformation), and find common ground (consensus).
In Figure 2, these metrics are aggregated by street into a 3D map of the city. Positive changes are in green, negative ones in red, and alternatives can be modeled at the scale of the individual street, neighborhood, or the city as a whole.
In Hamburg, these representations allowed users to understand trade-offs and find ways to overcome them. For example, the visualization made starkly clear the differences between the preferences of affluent communities, where proposals that threatened lower-density zoning received the worst ratings, and the city as a whole, where building more houses in underdeveloped areas to welcome refugees scored well.
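The chart reading described above (current performance on each priority versus a scenario's projected performance) can be sketched in a few lines. This is a minimal illustration and not CityScope's actual model: the indicator names come from the article, but every score, the 0-to-1 scale, and the `compare_scenario` helper are hypothetical.

```python
# Hypothetical scores on a 0-1 scale (higher is better); not real Hamburg data.
BASELINE = {
    "environmental impact": 0.55,
    "energy performance": 0.60,
    "infrastructure": 0.70,
    "innovation": 0.40,
    "livability": 0.65,
}


def compare_scenario(baseline: dict[str, float], scenario: dict[str, float]) -> dict[str, str]:
    """Label each indicator as 'better', 'worse' or 'unchanged' under the scenario."""
    labels = {}
    for indicator, current in baseline.items():
        # An indicator missing from the scenario is assumed to stay at its current level.
        projected = scenario.get(indicator, current)
        if projected > current:
            labels[indicator] = "better"
        elif projected < current:
            labels[indicator] = "worse"
        else:
            labels[indicator] = "unchanged"
    return labels


if __name__ == "__main__":
    # Hypothetical scenario: more housing combined with a new transit line.
    scenario = {"infrastructure": 0.80, "livability": 0.70, "environmental impact": 0.50}
    for indicator, label in compare_scenario(BASELINE, scenario).items():
        print(f"{indicator}: {label}")
```

Reporting a label per indicator, instead of a single combined score, keeps the trade-offs visible, which is the role the article assigns to the bar chart.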
CityScope then helped the residents find the best way to accommodate these conflicting preferences. By evaluating competing proposals, it demonstrated that wealthier neighborhoods could benefit from more houses provided that a new metro line were also built in the process, thereby paving the way to consensus.
The visualizations also allow CityScope to yet again collect people's preferences, this time on the trade-offs and competing proposals. In Hamburg, the team administered surveys and organized workshops across the city with an augmented reality (AR) version of the platform (see Figure 3). The AR interface allowed participants to collaborate with CityScope to see the implications of their choices.
Hamburg residents would come into the room and rearrange the LEGO-like bricks representing residential units, office buildings, parks, and other urban amenities in a specific zone, redesigning the city one brick at a time (Figure 3). When participants made these changes, the digital projection updated in real time to show how the proposed changes would affect quality of life. The platform also connected these changes with the zoning laws that would make them possible, bridging the gap between the LEGO game and policymaking.
The interface also allowed participants to collaborate with each other in workshops across the city. That way, CityScope became a platform of direct community engagement, where technical and non-technical people, with different levels of understanding, gathered around the table to understand the impact that their common decisions would have on their city (the consensus function again).
In Hamburg, 5,000 residents participated in CityScope workshops in 2016, considerably more than at any conventional town hall.
What's more, by targeting diverse communities across the city, the CityScope team has managed to attract a representative sample of the population, rather than just the older, wealthier citizens who have historically participated in the city's urban planning.
Making Decision-Making More Democratic
Through its use of AI, CityScope addresses the key challenges we identified in city planning, and in group decision-making more generally:
First, it circumvents slow bureaucratic processes. By aggregating all the relevant data into a dynamic model, CityScope analyzes trade-offs better than city officials ever could on their own because it integrates all perspectives and tests thousands of alternatives. The result is a considerably streamlined process, and also one that takes more perspectives into account more accurately.
Second, it solves the problem of information overload and asymmetry. By intaking and processing vast troves of data and then providing clear visuals and methods for interacting with and sharing them, CityScope removes the informational barriers that favor those with more resources over those who lack money, time, or expertise, giving anyone the opportunity to understand and propose changes.
Third, CityScope enables the full community to find a path to consensus, not just the elite. Residents may not agree on this or that housing project, but they can find common ground around shared priorities. By shifting the focus of deliberation from specific projects or laws to broader priorities for the city, CityScope reframes the discussion towards the bigger picture. Larson calls the platform a 'consensus machine' for a reason.
Eighteen months after the partnership with CityScope began, Hamburg had not just housed thousands of refugees: It had strategically distributed them across the city to maximize social cohesion, economic opportunity, and community resilience.
Since then, the city has been integrating CityScope into its decision-making processes more broadly, for transportation, energy use, and environmental regulation. When Russia invaded Ukraine in 2022, Hamburg had the tools to welcome tens of thousands of refugees in a fraction of the usual time. And the United Nations now funds a project exporting CityScope to other cities that face an unexpected influx of refugees.
Beyond CityScope
Humans are not especially good at processing immense amounts of information and translating it into policy. They struggle to understand complexity and, left to their own devices, seldom find common ground on contentious issues. CityScope shows that AI can help.
However, by themselves, platforms like CityScope cannot solve contentious problems. Most group decisions remain inescapably prone to conflict, and while AI can help us understand and navigate trade-offs, it cannot make trade-offs disappear. No algorithm, however sophisticated, can replace a culture of healthy disagreement and mutual respect. In Hamburg, it was the residents and their leaders who made this conversation productive, not just the platform itself.
Further, these tools only matter if they are integrated into the right ecosystem. In Hamburg, the city could not take advantage of CityScope without also reforming bureaucratic hurdles that prevented certain spaces from being repurposed or built upon. CityScope helped identify and prioritize those reforms, but without them, the remainder of its functions would have been futile. While AI can streamline processes and help us make better decisions together, only with the right leadership can these tools translate into collective action.
Nevertheless, tools that combine sophisticated simulations with direct engagement can change how we make decisions in all kinds of institutions: cities, but also corporations, universities, or non-profits.
Platforms like AnyLogic, FlexSim, and Visual Components are already developing similar tools for corporations, a trend that is likely to accelerate in the years to come. Far from a substitute for human decision-making, these platforms will offer a powerful complement to it: a way to harness data at the service of common goals.
