Latest news with #PlaysPokemon

Google's Gemini chatbot may have a Pokemon game 'problem'

Time of India

10 hours ago

Entertainment

Google's Gemini and other AI chatbots may have a "problem". New research indicates that these AI models can exhibit irregular behaviour, such as "panic," when confronted with challenges in Pokemon games. According to a report by DeepMind, Gemini 2.5 Pro experiences "qualitatively observable degradation in the model's reasoning capability" when its Pokemon are close to defeat. This observation comes as AI companies like Google and Anthropic are studying how their latest AI models navigate early Pokemon games. Researchers believe that observing AI models playing video games can provide useful insights into their capabilities.

How Google's Gemini and other chatbots reacted to older Pokemon games

In recent months, independent developers have launched Twitch streams like "Gemini Plays Pokemon" and "Claude Plays Pokemon," showcasing AI models playing the classic game in real time, the report mentions. This offers an alternative, more contextual way to benchmark AI performance beyond traditional testing methods. Each stream reveals how the AI reasons through problems, offering insight into its decision-making process. While these models have advanced quickly, they still struggle to play Pokemon efficiently, often taking hundreds of hours to finish the game. The real intrigue lies in observing the AI's behaviour and choices during gameplay rather than its speed.

"Over the course of the playthrough, Gemini 2.5 Pro gets into various situations which cause the model to simulate 'panic,'" the report noted. In this state, the model's performance declines and it stops using some of the tools available to it. While the AI doesn't experience emotions, this behaviour resembles how humans may make poor decisions under stress, making it both intriguing and unsettling. "This behaviour has occurred in enough separate instances that the members of the Twitch chat have actively noticed when it is occurring," the report added.

Apart from Gemini, Claude has also shown unusual behaviour during gameplay. At one point, it wrongly assumed that letting all its Pokemon faint would transport it forward in the game, so it intentionally lost battles. The strategy backfired: it was sent back to the last Pokemon Center it had used instead.

Despite such errors, Gemini 2.5 Pro has excelled at solving in-game puzzles. With some human help, it used task-specific AI tools to navigate boulder puzzles and plan efficient routes. "With only a prompt describing boulder physics and a description of how to verify a valid path, Gemini 2.5 Pro is able to one-shot some of these complex boulder puzzles, which are required to progress through Victory Road," the report highlighted. The report also suggests that since Gemini 2.5 Pro independently created many of its tools, upcoming versions may be able to do so without human help, potentially even developing a "don't panic" module on their own.
