Latest news with #PokémonBlue

Google's Gemini AI panics while playing Pokémon, takes 800 hours to finish game

Hindustan Times

11 hours ago

Artificial intelligence has made remarkable strides, but Google's latest chatbot is showing that even the smartest machines can crumble under pressure. A recent report by Google DeepMind reveals that its flagship model, Gemini 2.5 Pro, displayed signs of panic while playing Pokémon Blue, an old-school video game many children breeze through with ease. The findings came from a Twitch channel called Gemini_Plays_Pokemon, where independent engineer Joel Zhang put Gemini to the test. While Gemini is known for its advanced reasoning abilities and code-level understanding, its performance during this gaming challenge exposed unexpected behavioural quirks.

According to the DeepMind team, Gemini began to exhibit what they describe as 'Agent Panic.' The report states, 'Over the course of the playthrough Gemini 2.5 Pro gets into various situations which cause the model to simulate "panic". For example, when the Pokémon in the party's health or power points are low, the model's thoughts repeatedly reiterate the need to heal the party immediately or escape the current dungeon.'

This behaviour didn't go unnoticed. Viewers on Twitch began identifying when the AI was panicking, with DeepMind noting, 'This behaviour has occurred in enough separate instances that the members of the Twitch chat have actively noticed when it is occurring.' Although AI doesn't experience stress or emotion like humans, the model's erratic decision-making in high-pressure situations mirrors how people behave under stress, making impulsive or inefficient choices.

In the first full game run, Gemini took 813 hours to finish Pokémon Blue. After adjustments by Zhang, the AI completed a second playthrough in 406.5 hours. Still, this was far from efficient, especially compared to the time a child would take to complete the same game.

Social media users were quick to mock the AI's anxious gameplay. 'If you read it's thoughts when reasoning it seems to panic just about any time you word something slightly off,' said one viewer. Another joked: 'LLANXIETY.' A third chimed in with a broader reflection: 'I'm starting to think the "Pokémon index" might be one of our best indicators of AGI. Our best AIs still struggling with a child's game is one of the best indicators we have of how far we still have yet to go. And how far we've come.'

Interestingly, these revelations come just weeks after Apple released a study arguing that most AI reasoning models don't truly reason at all. Instead, they rely heavily on pattern recognition and tend to fall apart when the task is tweaked or made more complex.

- Sakshi

Google's AI model beats 29-year-old video game, CEO Sundar Pichai says: What ...

Time of India

04-05-2025

Google's advanced AI, Gemini 2.5 Pro, has conquered the classic 1996 Game Boy game Pokémon Blue, marking a significant achievement. In a celebratory post on X, Google CEO Sundar Pichai announced, "What a finish! Gemini 2.5 Pro just completed Pokémon Blue! Special thanks to @TheCodeOfJoel for creating and running the livestream, and to everyone who cheered Gem on along the way."

The feat was showcased through the Gemini Plays Pokémon livestream, run by Joel Z, a 30-year-old software engineer not affiliated with Google. Google executives have been vocal supporters of the project, with Logan Kilpatrick, Google AI Studio's product lead, noting last month that Gemini had secured its fifth gym badge, outpacing competing models.

The focus on Pokémon stems from a broader AI challenge. In February, Anthropic showcased its Claude AI's progress in Pokémon Red, a sibling game to Pokémon Blue, emphasizing Claude's ability to handle complex tasks through enhanced training. This inspired Joel Z's Gemini project, though he cautions against direct comparisons, as Gemini and Claude rely on different tools and inputs.

To navigate the game, Gemini uses an 'agent harness' that processes game screenshots with overlaid data, enabling the AI to make decisions and issue commands. Joel Z admitted to providing minor interventions to refine Gemini's reasoning, such as clarifying a game mechanic involving a Rocket Grunt, but stressed these were not explicit hints or cheating. 'Gemini Plays Pokémon is a work in progress,' Joel Z explained, noting ongoing improvements to the system. Meanwhile, Anthropic's Claude has yet to complete Pokémon Red, leaving Gemini's success as a notable milestone in AI gaming prowess.

Google's Gemini has beaten Pokémon Blue (with a little help)

Yahoo

03-05-2025

Google's most expensive AI model seems to have crossed a major milestone: Beating a 29-year-old video game. Last night, Google CEO Sundar Pichai posted triumphantly on X, 'What a finish! Gemini 2.5 Pro just completed Pokémon Blue!' To be clear, the Gemini Plays Pokemon livestream was created by (in his own words) 'a 30 year old software engineer unaffiliated with Google' who goes by Joel Z. But Google executives have been cheering the effort on. For example, Logan Kilpatrick, the product lead for Google AI Studio, posted last month that Gemini was 'making great progress at completing Pokémon' and had 'earned its 5th badge (next best model only has 3 so far, though with a different agent harness),' leading Pichai to joke, 'We are working on API, Artificial Pokémon Intelligence:)' Why Pokémon? Back in February, Anthropic highlighted progress that its Claude AI models were making in 'Pokémon Red,' writing that Claude's 'extended thinking and agent training' gives it 'a major boost' on 'more unexpected' tasks, like playing a classic game. ('Pokémon Red' and 'Blue' are different versions of a GameBoy title first released in 1996 and tied to the long-running Pokémon franchise). There's even a Claude Plays Pokemon Twitch channel that Joel Z cited as an inspiration. Despite its progress, Claude does not appear to have beaten 'Pokémon Red' yet. Does that mean Gemini is objectively better at the game? On his Twitch page, Joel Z urged viewers, 'Please don't consider this a benchmark for how well an LLM can play Pokemon. You can't really make direct comparisons — Gemini and Claude have different tools and receive different information.' 
And both AI models need help to play the game. That's where the aforementioned agent harnesses come in, providing the models with game screenshots overlaid with additional information, allowing the model to decide how to respond (which may involve calling specialized agents), and then pressing the button that corresponds with the AI's instruction. Joel Z acknowledged that there were other 'dev interventions' to help Gemini complete the game, but insisted that it's not cheating. 'My interventions improve Gemini's overall decision-making and reasoning abilities,' he says. 'I don't give specific hints — there are no walkthroughs or direct instructions for particular challenges like Mt. Moon. The only thing that comes even close is letting Gemini know that it needs to talk to a Rocket Grunt twice to obtain the Lift Key, which was a bug that was later fixed in Pokemon Yellow.' Plus, he said, 'Gemini Plays Pokémon is still actively being developed, and the framework continues to evolve.'
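For readers curious what such a harness loop might look like, here is a minimal Python sketch of the observe, decide, press cycle described above. It is not Joel Z's code: `FakeEmulator`, `Observation`, `run_harness`, and `scripted_model` are hypothetical names invented for illustration, the "model" is a hard-coded script rather than Gemini, and the specialized-agent step is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Standard Game Boy inputs; the harness rejects anything else.
VALID_BUTTONS = {"A", "B", "UP", "DOWN", "LEFT", "RIGHT", "START", "SELECT"}


@dataclass
class Observation:
    """What the model sees each step (hypothetical structure)."""
    screenshot: bytes  # raw frame; empty in this stand-in
    overlay: dict      # extra information layered on top of the frame


@dataclass
class FakeEmulator:
    """Stand-in for a real emulator; it only records button presses."""
    ticks: int = 0
    pressed: List[str] = field(default_factory=list)

    def observe(self) -> Observation:
        obs = Observation(screenshot=b"", overlay={"tick": self.ticks})
        self.ticks += 1
        return obs

    def press(self, button: str) -> None:
        self.pressed.append(button)


def run_harness(emulator: FakeEmulator,
                model: Callable[[Observation], str],
                max_steps: int) -> List[str]:
    """Observe, ask the model for a button, press it if it is valid."""
    for _ in range(max_steps):
        obs = emulator.observe()
        button = model(obs).strip().upper()
        if button not in VALID_BUTTONS:
            continue  # skip unrecognized instructions
        emulator.press(button)
    return emulator.pressed


def scripted_model(obs: Observation) -> str:
    """Placeholder for an LLM call: replays a fixed script by tick."""
    script = ["up", "a", "dance", "start"]
    return script[obs.overlay["tick"] % len(script)]


if __name__ == "__main__":
    print(run_harness(FakeEmulator(), scripted_model, max_steps=4))
```

In this toy run the invalid instruction "dance" is skipped, so only three presses are recorded. A real harness would replace `scripted_model` with a call to the language model and `FakeEmulator` with an actual emulator interface.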
