Google's Gemini AI panics while playing Pokémon, takes 800 hours to finish game
Artificial intelligence has made remarkable strides, but Google's latest chatbot is showing that even the smartest machines can crumble under pressure. A recent report by Google DeepMind reveals that its flagship model, Gemini 2.5 Pro, displayed signs of panic while playing Pokémon Blue, an old-school video game many children breeze through with ease.
The findings came from a Twitch channel called Gemini_Plays_Pokemon, where independent engineer Joel Zhang put Gemini to the test. While Gemini is known for its advanced reasoning abilities and code-level understanding, its performance during this gaming challenge exposed unexpected behavioural quirks.
According to the DeepMind team, Gemini began to exhibit what they describe as 'Agent Panic.' The report states, 'Over the course of the playthrough Gemini 2.5 Pro gets into various situations which cause the model to simulate "panic". For example, when the Pokémon in the party's health or power points are low, the model's thoughts repeatedly reiterate the need to heal the party immediately or escape the current dungeon.'
This behaviour didn't go unnoticed. Viewers on Twitch began identifying when the AI was panicking, with DeepMind noting, 'This behaviour has occurred in enough separate instances that the members of the Twitch chat have actively noticed when it is occurring.'
Although AI doesn't experience stress or emotion as humans do, the model's erratic decision-making in high-pressure situations mirrors how people behave under stress, when they tend to make impulsive or inefficient choices.
In its first full run, Gemini took 813 hours to finish Pokémon Blue. After adjustments by Zhang, the AI completed a second playthrough in 406.5 hours, half the time of the first. Even so, this is far longer than a child would typically need to complete the same game.
Social media users were quick to mock the AI's anxious gameplay. 'If you read it's thoughts when reasoning it seems to panic just about any time you word something slightly off,' said one viewer. Another joked: 'LLANXIETY.'
A third chimed in with a broader reflection: 'I'm starting to think the "Pokémon index" might be one of our best indicators of AGI. Our best AIs still struggling with a child's game is one of the best indicators we have of how far we still have yet to go. And how far we've come.'
Interestingly, these revelations come just weeks after Apple released a study arguing that most AI reasoning models don't truly reason at all. Instead, they rely heavily on pattern recognition and tend to fall apart when the task is tweaked or made more complex.