Latest news with #WorldLabs

Business Insider
a day ago
- Business
- Business Insider
Top AI researchers say language is limiting. Here's the new kind of model they are building instead.
As OpenAI, Anthropic, and Big Tech invest billions in developing state-of-the-art large language models, a small group of AI researchers is working on the next big thing. Computer scientists like Fei-Fei Li, the Stanford professor famous for inventing ImageNet, and Yann LeCun, Meta's chief AI scientist, are building what they call "world models." Unlike large language models, which determine outputs based on statistical relationships between the words and phrases in their training data, world models predict events based on the mental constructs that humans make of the world around them. "Language doesn't exist in nature," Li said on a recent episode of Andreessen Horowitz's a16z podcast. "Humans," she said, "not only do we survive, live, and work, but we build civilization beyond language." Jay Wright Forrester, a computer scientist and MIT professor, explained in his 1971 paper "Counterintuitive Behavior of Social Systems" why mental models are crucial to human behavior: "Each of us uses models constantly. Every person in private life and in business instinctively uses models for decision making. The mental images in one's head about one's surroundings are models. One's head does not contain real families, businesses, cities, governments, or countries. One uses selected concepts and relationships to represent real systems. A mental image is a model. All decisions are taken on the basis of models. All laws are passed on the basis of models. All executive actions are taken on the basis of models. The question is not to use or ignore models. The question is only a choice among alternative models." If AI is to meet or surpass human intelligence, then the researchers behind it believe it should be able to make mental models, too. Li has been working on this through World Labs, which she cofounded in 2024 with an initial backing of $230 million from venture firms like Andreessen Horowitz, New Enterprise Associates, and Radical Ventures.
"We aim to lift AI models from the 2D plane of pixels to full 3D worlds — both virtual and real — endowing them with spatial intelligence as rich as our own," World Labs says on its website. Li said on the No Priors podcast that spatial intelligence is "the ability to understand, reason, interact, and generate 3D worlds," given that the world is fundamentally three-dimensional. Li said she sees applications for world models in creative fields, robotics, or any area that warrants infinite universes. As with Meta, Anduril, and other Silicon Valley heavyweights, that could mean advances in military applications by helping those on the battlefield better perceive their surroundings and anticipate their enemies' next moves. The challenge of building world models is the paucity of data. In contrast to language, which humans have refined and documented over centuries, spatial intelligence is less developed. "If I ask you to close your eyes right now and draw out or build a 3D model of the environment around you, it's not that easy," she said on the No Priors podcast. "We don't have that much capability to generate extremely complicated models till we get trained." To gather the data necessary for these models, "we require more and more sophisticated data engineering, data acquisition, data processing, and data synthesis," she said. That makes the challenge of building a believable world even greater. At Meta, chief AI scientist Yann LeCun has a small team dedicated to a similar project. The team uses video data to train models and runs simulations that abstract the videos at different levels. "The basic idea is that you don't predict at the pixel level. You train a system to run an abstract representation of the video so that you can make predictions in that abstract representation, and hopefully this representation will eliminate all the details that cannot be predicted," he said at the AI Action Summit in Paris earlier this year.
That creates a simpler set of building blocks for mapping out trajectories for how the world will change at a particular time. LeCun, like Li, believes these models are the only way to create truly intelligent AI. "We need AI systems that can learn new tasks really quickly," he said recently at the National University of Singapore. "They need to understand the physical world — not just text and language but the real world — have some level of common sense, and abilities to reason and plan, have persistent memory — all the stuff that we expect from intelligent entities."


Forbes
3 days ago
- Entertainment
- Forbes
World Labs Ups The Ante In World Generation
A majestic red and black checkerboard pathway, framed by overgrown shrubbery. A manorial structure with bright red windows, on a craggy cliff at the end of a long and winding road. A close pathway through a proliferation of neon mushrooms. What does all of this have in common? It's been generated by groundbreaking technology from World Labs, where the model can extrapolate an entire world based on one single image. It's important that people understand the technology at work here. Just a couple of decades ago, our state-of-the-art technique was to combine thousands of images into a frame-by-frame virtual exploration of a digital space. People used this technology to sell homes with virtual real estate tours, and for other kinds of use cases. This new thing is quite different. The technology is understanding how light and shadow affect surfaces. It's understanding how structures change when viewed from different angles. And most importantly, it's dreaming up a world of color and light, the likes of which we've never seen! World Labs, by the way, is a new player in the industry. It was created by Fei-Fei Li of Stanford fame just this year. One of the most prominent points of these public demos that I've been looking at online is that they're bright, bold and beautiful.
These larger-than-life scenes are rendered capably in 3-D, and you can move around a little bit, although billing it as a digital 'world' is a little bit misleading. It's not hard to generate an out-of-bounds error, just by taking a few steps forward. On a side note, the page also caused my browser to crash more than once. Having said all that, the demos are extremely illuminating, showing capability far beyond what we would have thought possible just a few years ago. The closest thing we have to compare it to is the rendered game environments that were until quite recently mostly hand-coded. A TechCrunch article compares the World Labs project to Minecraft, which might appeal to a ten-year-old player more than a VC. But the ramifications of this are evident, or at least they should be. This isn't just a kid's game. Recent applications like a 'dress-anybody' garment simulator are just the beginning. Further down in the same demo page, you have a presentation attributed to Brittani Natali, who apparently put together a short film using different elements of today's new cutting-edge tools. The full stack included the World Labs generator as well as ElevenLabs, presumably for speech, and other tools like Suno and Blender and CapCut. On another side note, a Google search turned up very little information about the creator (Brittani Natali), and ChatGPT was unable to locate any info on this person either. It seems the individual has a pretty thin web footprint! Anyway, as for the film, the result is striking - the viewer walks through a good number of these digital environments quickly, and then the film centers on a dilapidated house that's empty and abandoned. Our love affair with abandoned spaces has always been there in the human imagination. We love to explore – and see inside something that no one has seen for years. But what if we're seeing inside something that nobody has ever seen in the history of humanity?
These colorful liminal spaces aren't just abandoned – they're brand new. This is the frontier of our world where we set out through uncharted waters to see what lies ahead. Is it scary? Is it exciting? You decide. But the bottom line is that it's here, and it's going to start popping up in unexpected places. I've covered a lot of big headlines this month, from corporate strategies pivoting quickly, to hardware wars that are determining who will benefit from the next generation of models. But it's also very interesting to keep an eye on the leapfrogging that AI is doing in multimedia – from Stable Diffusion and DALL-E, to Sora, to this new thing - where pretty soon, metaverse environments are going to create themselves.

Yahoo
28-05-2025
- Business
- Yahoo
Odyssey's new AI model streams 3D interactive worlds
Odyssey, a startup founded by self-driving pioneers Oliver Cameron and Jeff Hawke, has developed an AI model that lets users "interact" with streaming video. Available on the web in an "early demo," the model generates and streams video frames every 40 milliseconds. Via basic controls, viewers can explore areas within a video, similar to a 3D-rendered video game. "Given the current state of the world, an incoming action, and a history of states and actions, the model attempts to predict the next state of the world," explains Odyssey in a blog post. "Powering this is a new world model, demonstrating capabilities like generating pixels that feel realistic, maintaining spatial consistency, learning actions from video, and outputting coherent video streams for 5 minutes or more." A number of startups and big tech companies are chasing after world models, including DeepMind, influential AI researcher Fei-Fei Li's World Labs, Microsoft, and Decart. They believe that world models could one day be used to create interactive media, such as games and movies, and run realistic simulations like training environments for robots. But creatives have mixed feelings about the tech. A recent Wired investigation found that game studios like Activision Blizzard, which has laid off scores of workers, are using AI to cut corners and combat attrition. And a 2024 study commissioned by the Animation Guild, a union representing Hollywood animators and cartoonists, estimated that over 100,000 U.S.-based film, television, and animation jobs will be disrupted by AI in the coming months. For its part, Odyssey is pledging to collaborate with creative professionals — not replace them. "Interactive video [...] opens the door to entirely new forms of entertainment, where stories can be generated and explored on demand, free from the constraints and costs of traditional production," writes the company in its blog post.
"Over time, we believe everything that is video today — entertainment, ads, education, training, travel, and more — will evolve into interactive video, all powered by Odyssey." Odyssey's demo is a bit rough around the edges, which the company acknowledges in its post. The environments the model generates are blurry and distorted, and unstable in the sense that their layouts don't always remain the same. Walk forward in one direction for a while or turn around, and the surroundings might suddenly look different. But the company's promising to rapidly improve upon the model, which can currently stream video at up to 30 frames per second from clusters of Nvidia H100 GPUs at the cost of $1-$2 per "user-hour." "Looking ahead, we're researching richer world representations that capture dynamics far more faithfully, while increasing temporal stability and persistent state," writes Odyssey in its post. "In parallel, we're expanding the action space from motion to world interaction, learning open actions from large-scale video." Odyssey is taking a different approach than many AI labs in the world modeling space. It designed a 360-degree, backpack-mounted camera system to capture real-world landscapes, which Odyssey thinks can serve as a basis for higher-quality models than models trained solely on publicly available data. To date, Odyssey has raised $27 million from investors including EQT Ventures, GV, and Air Street Capital. Ed Catmull, one of the co-founders of Pixar and former president of Walt Disney Animation Studios, is on the startup's board of directors. Last December, Odyssey said it was working on software that allows creators to load scenes generated by its models into tools such as Unreal Engine, Blender, and Adobe After Effects so that they can be hand-edited. This article originally appeared on TechCrunch.
Yahoo
22-05-2025
- Business
- Yahoo
Fei-Fei Li, 'godmother of AI,' points to risks of cuts to US research funds, student visas
Fei-Fei Li, the leading Stanford University researcher nicknamed the 'godmother of AI,' emphasized the risks of cutting research funding and international student visas to the US as it faces an increasingly competitive global tech race. 'The public sector, especially higher education, has been a pivotal, central part of America's innovation ecosystem [and] a critical part of our economic growth,' she said at a Semafor Tech event in San Francisco on Wednesday. 'Almost everything we know as classic knowledge of AI came from academic research, whether it's algorithms, data-driven methods, or early research in microprocessors.' Her comments come as the Trump administration has targeted billions of dollars in grants for universities because of their social policies and revoked thousands of visas for students as part of its crackdown on immigration. Li, who is co-director of the Stanford Institute for Human-Centered AI and the co-founder of the World Labs startup, is revered in the tech sector because of her early breakthroughs in artificial intelligence, and she's been a leading voice on policies to promote innovation. 'Continuing to nourish our higher education, our public sector, for this kind of innovative, blue sky, curiosity-driven research is critical for the health of our ecosystem [and] for training the next generation,' she said. 'I hope [my students] can get work visas and have a path for immigration.' She added that US visa quotas for individuals from certain countries 'has been a challenge for many talents to stay, and I hope we can continue to address that.' Speaking at the same event on Wednesday, Anthropic co-founder Jack Clark added that the Trump administration recognizes the importance of AI, but its strategy is still coming together, including its plan around export controls. 'The challenge of AI is it touches everything — electricity generation, export controls, how you do the structuring of science and research,' Clark said.
'This administration has gone through the pattern of coming and saying, "Oh wow, we've got this cool technology. It's called AI." And then they discover very quickly that this technology touches every single thing they care about. It will eventually form into — I hope — a larger, more coherent strategy.' Clark also noted the increasing tension reflected in the Trump administration's recent decision to expand access to advanced chips to the UAE and Saudi Arabia. The region had been limited under the Biden administration, which put caps on its semiconductor purchases from American companies. On one hand, Clark said, there's a push for the world to use US-led AI platforms and build on US chips. But if the technology becomes as incredibly powerful as predicted, products built on it will have dual uses that could have nefarious purposes. 'All of those computers all around the world are going to become equivalent to factories that can turn out both cars and tanks,' he said. 'And you are going to see which way it goes depending on what the alliance looks like. So we are going to need to think about how we apply really good security to these factories that we've now built all around the world.' Director of the Center for AI Safety Dan Hendrycks, who is also an advisor at Elon Musk's AI startup, xAI, and for Scale AI, also pointed to export controls as a safety mechanism. 'Chips are sort of the currency of power in AI. It's over 90% of these AI companies' budget is spent on,' Hendrycks said at the event. 'And making sure that that isn't falling in the hands of rogue actors, or pariah states, is a priority.'