Latest news with #Gemini_Plays_Pokemon

Hindustan Times
11 hours ago
Google's Gemini AI panics while playing Pokémon, takes 800 hours to finish game
Artificial intelligence has made remarkable strides, but Google's latest chatbot is showing that even the smartest machines can crumble under pressure. A recent report by Google DeepMind reveals that its flagship model, Gemini 2.5 Pro, displayed signs of panic while playing Pokémon Blue, an old-school video game many children breeze through with ease. The findings came from a Twitch channel called Gemini_Plays_Pokemon, where independent engineer Joel Zhang put Gemini to the test. While Gemini is known for its advanced reasoning abilities and code-level understanding, its performance during this gaming challenge exposed unexpected behavioural quirks.

According to the DeepMind team, Gemini began to exhibit what they describe as 'Agent Panic'. The report states: 'Over the course of the playthrough Gemini 2.5 Pro gets into various situations which cause the model to simulate "panic". For example, when the Pokémon in the party's health or power points are low, the model's thoughts repeatedly reiterate the need to heal the party immediately or escape the current dungeon.'

This behaviour didn't go unnoticed. Viewers on Twitch began identifying when the AI was panicking, with DeepMind noting, 'This behaviour has occurred in enough separate instances that the members of the Twitch chat have actively noticed when it is occurring.'

Although AI doesn't experience stress or emotion like humans, the model's erratic decision-making in high-pressure situations mirrors how people behave under stress, making impulsive or inefficient choices. In the first full game run, Gemini took 813 hours to finish Pokémon Blue. After adjustments by Zhang, the AI completed a second playthrough in 406.5 hours. Still, this was far from efficient, especially compared to the time a child would take to complete the same game. Social media users were quick to mock the AI's anxious gameplay.
'If you read its thoughts when reasoning it seems to panic just about any time you word something slightly off,' said one viewer. Another joked: 'LLANXIETY.' A third chimed in with a broader reflection: 'I'm starting to think the "Pokémon index" might be one of our best indicators of AGI. Our best AIs still struggling with a child's game is one of the best indicators we have of how far we still have yet to go. And how far we've come.'

Interestingly, these revelations come just weeks after Apple released a study arguing that most AI reasoning models don't truly reason at all. Instead, they rely heavily on pattern recognition and tend to fall apart when the task is tweaked or made more complex.

- Sakshi


NDTV
15 hours ago
Google's AI Chatbot Panics When Playing Video Game Meant For Children
Artificial intelligence (AI) chatbots might be smart, but they still sweat bullets while playing video games that young kids can seemingly ace. A new Google DeepMind report has found that its Gemini 2.5 Pro resorts to panic when playing Pokemon, especially when one of the fictional characters is close to death, causing a qualitative degradation in the model's reasoning capability.

Google highlighted a case study from a Twitch channel named Gemini_Plays_Pokemon, where Joel Zhang, an engineer unaffiliated with the tech company, plays Pokemon Blue using Gemini. During the two playthroughs, the Gemini team at DeepMind observed an interesting phenomenon they describe as 'Agent Panic'.

"Over the course of the playthrough, Gemini 2.5 Pro gets into various situations which cause the model to simulate 'panic'. For example, when the Pokemon in the party's health or power points are low, the model's thoughts repeatedly reiterate the need to heal the party immediately or escape the current dungeon," the report highlighted. "This behavior has occurred in enough separate instances that the members of the Twitch chat have actively noticed when it is occurring," the report says.

While AI models are trained on copious amounts of data and do not think or experience emotions like humans, their actions mimic the way in which a person might make poor, hasty decisions when under stress. In the first playthrough, the AI agent took 813 hours to finish the game. After some tweaking by Mr Zhang, the AI agent shaved off some hundreds of hours and finished the game in 406.5 hours. While the progress was impressive, the AI agent was still not good at playing Pokémon: it took Gemini hundreds of hours to reason through a game that a child could complete in significantly less time.
The chatbot displayed this erratic behaviour despite Gemini 2.5 Pro being Google's most intelligent thinking model, one that exhibits strong reasoning and codebase-level understanding while producing interactive web applications.

Social media reacts

Reacting to Gemini's panicky nature, social media users said such games could be a benchmark for the real thinking skills of AI tools. "If you read its thoughts when reasoning it seems to panic just about any time you word something slightly off," said one user, while another added: "LLANXIETY." A third commented: "I'm starting to think the 'Pokemon index' might be one of our best indicators of AGI. Our best AIs still struggling with a child's game is one of the best indicators we have of how far we still have yet to go. And how far we've come."

Earlier this month, Apple released a new study claiming that most reasoning models do not reason at all; instead, they simply memorise patterns really well. When questions are altered or the complexity is increased, they collapse altogether.