06-08-2025
Google DeepMind unveils Genie 3 AI world model: What is it, how it works
Google DeepMind has unveiled Genie 3, the latest version of its AI world model that can generate explorable 3D environments in real time from a simple text prompt. Unlike earlier versions, Genie 3 supports continuous interaction for a few minutes, remembers where objects were placed, and allows dynamic changes like adding characters or altering weather conditions.
According to Google DeepMind, Genie 3 is available as a limited research preview to select academics and creators, with plans to expand access gradually.
What is Genie 3?
Genie 3 is part of a class of AI systems called world models, which simulate dynamic environments instead of generating static content. These models can be used in everything from education and training simulations to robotics and video games.
The idea is to give the model a prompt — say, 'a forest during a thunderstorm' — and have it generate a playable 3D space that you can explore using basic movement controls.
What can Genie 3 do?
Explore in real time: According to Google DeepMind, Genie 3 allows users to move through these environments at 24 frames per second in 720p resolution, with consistency retained for a few minutes. That's a notable leap from Genie 2, where interaction was limited to ten to twenty seconds, as reported by The Verge.
Remembers what you saw: One of Genie 3's biggest upgrades is visual memory. If you leave an object behind and come back later, it will still be there — a capability missing from most previous world models. Google says this visual memory can persist for about a minute.
Trigger world events: As per the DeepMind blog, Genie 3 supports 'promptable world events,' which means users can make changes like adding rain, introducing characters, or altering objects just by typing new instructions. These changes happen in real time inside the generated world, expanding the possible use cases beyond navigation.
How is this different from earlier models?
Previous world models like Genie 2 were limited in both realism and duration. Genie 3 introduces two significant technical improvements:
Frame-by-frame generation with memory tracking, which allows consistency over longer periods.
Dynamic generation without needing a 3D scene or preset assets, unlike methods such as NeRFs or Gaussian Splatting, which rely on an explicit 3D representation of the scene.
This makes Genie 3 more flexible for research and development, especially for training AI agents to perform tasks across longer timelines.
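To make the idea of frame-by-frame generation with memory more concrete, here is a minimal toy sketch in Python. It is not DeepMind's implementation, which has not been published as code. The `next_frame` function, the `Frame` type, and the movement actions are hypothetical stand-ins; only the 24 frames per second and the roughly one-minute memory come from the figures quoted above. Each new frame is produced from the recent history plus the user's latest action, and the history is capped so that older frames drop out.

```python
from collections import deque
from typing import Deque, List, Tuple

FPS = 24             # frame rate quoted in the article
MEMORY_SECONDS = 60  # "about a minute" of visual memory, per the article

# Toy stand-in for an image: just the viewer's (x, y) position.
Frame = Tuple[int, int]

# Hypothetical movement controls and their effect on the position.
MOVES = {"left": (-1, 0), "right": (1, 0), "up": (0, 1), "down": (0, -1)}


def next_frame(history: Deque[Frame], action: str) -> Frame:
    """Hypothetical stand-in for a learned model.

    A real world model would condition on many past frames; this toy
    version only reads the most recent one and applies the action.
    Unknown actions leave the position unchanged.
    """
    x, y = history[-1]
    dx, dy = MOVES.get(action, (0, 0))
    return (x + dx, y + dy)


def rollout(actions: List[str], start: Frame = (0, 0)) -> List[Frame]:
    """Generate one frame per action, keeping a bounded memory of past frames."""
    # The deque discards the oldest frame once the memory window is full.
    history: Deque[Frame] = deque([start], maxlen=FPS * MEMORY_SECONDS)
    frames: List[Frame] = []
    for action in actions:
        frame = next_frame(history, action)
        history.append(frame)
        frames.append(frame)
    return frames


if __name__ == "__main__":
    print(rollout(["right", "right", "up"]))
```

With these numbers, the memory window holds 24 × 60 = 1,440 frames. The bounded history illustrates why consistency can hold for a limited time: once a frame falls out of the window, nothing in the loop can refer back to it.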
What are its limitations?
Despite the progress, Genie 3 has several limitations:
It cannot simulate real-world locations with geographic accuracy.
Legible text usually appears only when it is included in the original prompt.
The range of interactions is limited, and multi-agent interactions remain under development.
Although more stable than Genie 2, it still supports only a few minutes of continuous exploration.
Google DeepMind acknowledges that the technology raises new safety and responsibility challenges, which is why Genie 3 is being rolled out gradually.