
Latest news with #Ai2

Ai2 unveils MolmoAct: Open-source robotics system reasons in 3D and adjusts on the fly

GeekWire

2 days ago

  • Business
  • GeekWire


Jiafei Duan, Ai2 researcher, shows MolmoAct controlling a robotic arm. (GeekWire Photo / Todd Bishop)

The Allen Institute for AI released a new AI robotics system that uses novel approaches to help robots navigate messy real-world environments, while making all of the model's code, data, and training methods publicly available under open-source principles.

The system, called MolmoAct, converts 2D images into 3D visualizations, previews its movements before acting, and lets human operators adjust those actions in real time. It differs from existing robotics models that often work as opaque black boxes, trained on proprietary datasets.

Ai2 expects the system to be used by robotics researchers, companies, and developers as a foundation for building robots that can operate in unstructured environments such as homes, warehouses, and disaster response scenes.

In demos last week at Ai2's new headquarters north of Seattle's Lake Union, researchers showed MolmoAct interpreting natural language commands to direct a robotic arm to pick up household objects, such as cups and plush toys, and move them to specific locations.

Researchers described it as part of Ai2's broader efforts to create a comprehensive set of open-source AI tools and technologies. The Seattle-based research institute was founded in 2014 by the late Microsoft co-founder Paul Allen, and is funded in part by his estate. Ai2's flagship OLMo large language model is a fully transparent alternative to proprietary systems, with openly available training data, code, and model weights, designed to support research and public accountability in AI development.

The institute's projects are moving in 'one big direction': toward a unified AI model 'that can do reasoning and language, that can understand images, videos, that can control a robot, and that can make sense of space and actions,' said Ranjay Krishna, Ai2's research lead for computer vision and a University of Washington Allen School assistant professor.

MolmoAct builds on Ai2's Molmo multimodal AI model, which can understand and describe images, by adding the ability to reason in 3D and direct robot actions.
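MolmoAct's weights are published openly, but the articles here don't show its programmatic interface. The sketch below is a hypothetical loading-and-prompting example patterned on the published interface of Ai2's earlier Molmo models; the repository id, the `processor.process` and `generate_from_batch` calls, and the output format are assumptions to verify against the actual MolmoAct model card on Hugging Face.

```python
# Hypothetical usage sketch, NOT confirmed MolmoAct API: the repo id and
# generation calls are assumptions patterned on Ai2's earlier Molmo models.
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor, GenerationConfig

MODEL_ID = "allenai/MolmoAct-7B"  # assumed name -- verify on Hugging Face

processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True,
                                          torch_dtype="auto", device_map="auto")
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True,
                                             torch_dtype="auto", device_map="auto")

# A camera frame plus a natural-language command, as in the demo described above.
inputs = processor.process(images=[Image.open("robot_camera.jpg")],
                           text="Pick up the plush toy and move it to the bin.")
inputs = {k: v.to(model.device).unsqueeze(0) for k, v in inputs.items()}

# The model emits reasoning tokens (spatial plan, waypoints) before actions.
output = model.generate_from_batch(
    inputs,
    GenerationConfig(max_new_tokens=256, stop_strings="<|endoftext|>"),
    tokenizer=processor.tokenizer,
)
new_tokens = output[0, inputs["input_ids"].size(1):]
print(processor.tokenizer.decode(new_tokens, skip_special_tokens=True))
```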

Ai2 Unveils MolmoAct, a New Class of AI Model That Reasons in 3D Space

Business Wire

2 days ago

  • Business
  • Business Wire


SEATTLE--(BUSINESS WIRE)-- Ai2 (The Allen Institute for AI) today announced the release of MolmoAct 7B, a breakthrough embodied AI model that brings the intelligence of state-of-the-art AI models into the physical world. Instead of reasoning through language and converting that into movement, MolmoAct sees its surroundings, understands the relationships between space, movement, and time, and plans its movements accordingly. It does this by generating visual reasoning tokens that transform 2D image inputs into 3D spatial plans, enabling robots to navigate the physical world with greater intelligence and control.

While spatial reasoning isn't new, most modern systems rely on closed, end-to-end architectures trained on massive proprietary datasets. These models are difficult to reproduce, expensive to scale, and often operate as opaque black boxes. MolmoAct offers a fundamentally different approach: it is trained entirely on open data, designed for transparency, and built for real-world generalization. Its step-by-step visual reasoning traces make it easy to preview what a robot plans to do and to intuitively steer its behavior in real time as conditions change.

'Embodied AI needs a new foundation that prioritizes reasoning, transparency, and openness,' said Ali Farhadi, CEO of Ai2. 'With MolmoAct, we're not just releasing a model; we're laying the groundwork for a new era of AI, bringing the intelligence of powerful AI models into the physical world. It's a step toward AI that can reason and navigate the world in ways that are more aligned with how humans do, and collaborate with us safely and effectively.'

A New Class of Model: Action Reasoning

MolmoAct is the first in a new category of AI model Ai2 is calling an Action Reasoning Model (ARM): a model that interprets high-level natural language instructions and reasons through a sequence of physical actions to carry them out in the real world. Unlike traditional end-to-end robotics models that treat tasks as a single, opaque step, ARMs break instructions down into a transparent chain of spatially grounded decisions:

  • 3D-aware perception: grounding the robot's understanding of its environment using depth and spatial context
  • Visual waypoint planning: outlining a step-by-step task trajectory in image space
  • Action decoding: converting the plan into precise, robot-specific control commands

This layered reasoning enables MolmoAct to interpret commands like 'Sort this trash pile' not as a single step, but as a structured series of sub-tasks: recognize the scene, group objects by type, grasp them one by one, and repeat.
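The release stops at this high-level description, so the following is a minimal, hypothetical Python sketch of the three-stage decomposition. Every name, the toy hard-coded scene, and the straight-line planner are illustrative assumptions, not Ai2's implementation.

```python
# Hypothetical sketch of an Action Reasoning Model's three-stage pipeline.
# All names, shapes, and toy logic are illustrative assumptions, not Ai2's code.
from dataclasses import dataclass

@dataclass
class Perception3D:
    """Stage 1 output: the scene grounded with depth and spatial context."""
    objects: list[tuple[str, float, float, float]]  # (label, x, y, depth)

@dataclass
class Waypoint:
    """Stage 2 output: one point on the task trajectory, in image space."""
    x: float
    y: float

def perceive(image_path: str) -> Perception3D:
    # Stage 1: 3D-aware perception. A real ARM infers this from the image;
    # here we return a hard-coded toy scene.
    return Perception3D(objects=[("cup", 120.0, 300.0, 0.6),
                                 ("bin", 420.0, 255.0, 0.9)])

def plan_waypoints(scene: Perception3D, instruction: str) -> list[Waypoint]:
    # Stage 2: visual waypoint planning -- a straight-line toy trajectory from
    # the first object to the last. This is the step a human can preview.
    (_, x0, y0, _), (_, x1, y1, _) = scene.objects[0], scene.objects[-1]
    steps = 4
    return [Waypoint(x0 + (x1 - x0) * t / steps, y0 + (y1 - y0) * t / steps)
            for t in range(steps + 1)]

def decode_actions(waypoints: list[Waypoint]) -> list[dict]:
    # Stage 3: action decoding -- map image-space waypoints to robot commands.
    return [{"move_to": (wp.x, wp.y), "gripper": "closed"} for wp in waypoints]

if __name__ == "__main__":
    scene = perceive("camera_frame.png")
    plan = plan_waypoints(scene, "Put the cup in the bin")
    print("previewable plan:", [(round(w.x), round(w.y)) for w in plan])
    for cmd in decode_actions(plan):
        print("execute:", cmd)
```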
Built to Generalize and Trained to Scale

MolmoAct 7B, the first in its model family, was trained on a curated dataset of about 12,000 'robot episodes' from real-world environments, such as kitchens and bedrooms. These demonstrations were transformed into robot-reasoning sequences that expose how complex instructions map to grounded, goal-directed actions. Along with the model, Ai2 is releasing this post-training dataset of ~12,000 distinct episodes; Ai2 researchers spent months curating videos of robots performing actions in diverse household settings, from arranging pillows on a living room couch to putting away laundry in a bedroom.

Despite its strong performance, MolmoAct was trained with striking efficiency: just 18 million samples, pretrained on 256 NVIDIA H100 GPUs for about 24 hours and fine-tuned on 64 GPUs for roughly two more. In contrast, many commercial models require hundreds of millions of samples and far more compute. Yet MolmoAct outperforms many of these systems on key benchmarks, including a 71.9% success rate on SimPLER, demonstrating that high-quality data and thoughtful design can beat models trained with far more data and compute.

Understandable AI You Can Build On

Unlike most robotics models, which operate as opaque systems, MolmoAct was built for transparency. Users can preview the model's planned movements before execution, with motion trajectories overlaid on camera images. These plans can be adjusted using natural language or quick sketching corrections on a touchscreen, providing fine-grained control and enhancing safety in real-world environments like homes, hospitals, and warehouses.
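As a toy illustration of that preview step, the snippet below overlays an image-space trajectory on a camera frame with matplotlib. The waypoint coordinates are fabricated for the example; in a real preview they would come from the model's visual reasoning trace.

```python
# Minimal sketch of a trajectory preview: waypoints predicted in image space
# are drawn over the camera frame before the robot moves. The waypoints here
# are made up for illustration, not model output.
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

frame = mpimg.imread("camera_frame.png")      # current camera view
# (x, y) pixel waypoints, e.g. decoded from the model's planning tokens
waypoints = [(120, 300), (200, 260), (310, 240), (420, 255)]

fig, ax = plt.subplots()
ax.imshow(frame)
xs, ys = zip(*waypoints)
ax.plot(xs, ys, "o-", linewidth=2)            # planned end-effector path
ax.annotate("start", waypoints[0])
ax.annotate("goal", waypoints[-1])
ax.set_title("Planned trajectory (preview before execution)")
plt.show()
```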
True to Ai2's mission, MolmoAct is fully open-source and reproducible. Ai2 is releasing everything needed to build, run, and extend the model: training pipelines, pre- and post-training datasets, model checkpoints, and evaluation benchmarks. MolmoAct sets a new standard for what embodied AI should look like: safe, interpretable, adaptable, and truly open. Ai2 will continue expanding its testing across both simulated and real-world environments, with the goal of enabling more capable and collaborative AI systems. Download the model and model artifacts, including training checkpoints and evals, from Ai2's Hugging Face repository.

About Ai2

Ai2 is a Seattle-based non-profit AI research institute with the mission of building breakthrough AI to solve the world's biggest problems. Founded in 2014 by the late Paul G. Allen, Ai2 develops foundational AI research and innovative new applications that deliver real-world impact through large-scale open models, open data, robotics, conservation platforms, and more. Ai2 champions true openness through initiatives like OLMo, the world's first truly open language model framework; Molmo, a family of open state-of-the-art multimodal AI models; and Tulu, the first application of fully open post-training recipes to the largest open-weight models. These solutions empower researchers, engineers, and tech leaders to participate in the creation of state-of-the-art AI and to directly benefit from the many ways it can advance critical fields like medicine, scientific research, climate science, and conservation efforts. For more information, visit

A New Kind of AI Model Lets Data Owners Take Control

WIRED

09-07-2025

  • Business
  • WIRED


Jul 9, 2025 1:59 PM

A novel approach from the Allen Institute for AI enables data to be removed from an artificial intelligence model even after it has already been used for training.

A new kind of large language model, developed by researchers at the Allen Institute for AI (Ai2), makes it possible to control how training data is used even after a model has been built. The new model, called FlexOlmo, could challenge the current industry paradigm of big artificial intelligence companies slurping up data from the web, books, and other sources, often with little regard for ownership, and then owning the resulting models entirely. Once data is baked into an AI model today, extracting it from that model is a bit like trying to recover the eggs from a finished cake.

'Conventionally, your data is either in or out,' says Ali Farhadi, CEO of Ai2, based in Seattle, Washington. 'Once I train on that data, you lose control. And you have no way out, unless you force me to go through another multi-million-dollar round of training.'

Ai2's avant-garde approach divides up training so that data owners can exert control. Those who want to contribute data to a FlexOlmo model can do so by first copying a publicly shared model known as the 'anchor.' They then train a second model using their own data, combine the result with the anchor model, and contribute the result back to whoever is building the third and final model. Contributing in this way means that the data itself never has to be handed over, and because of how the data owner's model is merged with the final one, it is possible to extract the data later on. A magazine publisher might, for instance, contribute text from its archive of articles to a model but later remove the sub-model trained on that data if there is a legal dispute or if the company objects to how a model is being used.

'The training is completely asynchronous,' says Sewon Min, a research scientist at Ai2 who led the technical work. 'Data owners do not have to coordinate, and the training can be done completely independently.'

The FlexOlmo model architecture is what's known as a 'mixture of experts,' a popular design that is normally used to simultaneously combine several sub-models into a bigger, more capable one. A key innovation from Ai2 is a way of merging sub-models that were trained independently. This is achieved using a new scheme for representing the values in a model so that its abilities can be merged with others when the final combined model is run.

To test the approach, the FlexOlmo researchers created a dataset they call Flexmix from proprietary sources including books and websites. They used the FlexOlmo design to build a model with 37 billion parameters, about a tenth of the size of the largest open-source model from Meta. They then compared their model to several others, finding that it outperformed any individual model on all tasks and also scored 10 percent better on common benchmarks than two other approaches for merging independently trained models.

The result is a way to have your cake, and get your eggs back, too. 'You could just opt out of the system without any major damage and inference time,' Farhadi says. 'It's a whole new way of thinking about how to train these models.'

Percy Liang, an AI researcher at Stanford, says the Ai2 approach seems like a promising idea.
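WIRED describes the mechanism only at a high level: independently trained modules merged mixture-of-experts style, with the option to pull one out later. The toy PyTorch sketch below illustrates that contribute-and-withdraw idea; the class names, dimensions, and uniform mixing are simplifications of mine, not Ai2's FlexOlmo architecture (a real mixture of experts would use a learned router rather than plain averaging).

```python
# Toy illustration of the FlexOlmo idea: independently trained "expert"
# modules are merged around a shared public anchor, and a data owner can
# later withdraw its expert without retraining the rest. A hand-rolled
# sketch of the concept, not Ai2's code.
import torch
import torch.nn as nn

DIM = 64

class Expert(nn.Module):
    """One data owner's module, trained privately on that owner's data."""
    def __init__(self):
        super().__init__()
        self.ff = nn.Sequential(nn.Linear(DIM, 4 * DIM), nn.GELU(),
                                nn.Linear(4 * DIM, DIM))
    def forward(self, x):
        return self.ff(x)

class FlexMoELayer(nn.Module):
    """A public anchor expert plus removable contributed experts."""
    def __init__(self):
        super().__init__()
        self.experts = nn.ModuleDict({"anchor": Expert()})
    def add_expert(self, owner: str, expert: Expert):
        self.experts[owner] = expert   # merge an independently trained module
    def remove_expert(self, owner: str):
        del self.experts[owner]        # opt out: no retraining required
    def forward(self, x):
        outs = torch.stack([e(x) for e in self.experts.values()])
        return outs.mean(dim=0)        # uniform mixing for simplicity

layer = FlexMoELayer()
layer.add_expert("publisher_A", Expert())  # trained on the publisher's archive
x = torch.randn(2, DIM)
y_with = layer(x)
layer.remove_expert("publisher_A")         # e.g. after a legal dispute
y_without = layer(x)
print(y_with.shape, y_without.shape)       # the model still runs either way
```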
'Providing more modular control over data—especially without retraining—is a refreshing direction that challenges the status quo of thinking of language models as monolithic black boxes,' he says. 'Openness of the development process—how the model was built, what experiments were run, how decisions were made—is something that's missing.'

Farhadi and Min say that the FlexOlmo approach might also make it possible for AI firms to access sensitive private data in a more controlled way, because that data does not need to be disclosed in order to build the final model. However, they warn that it may be possible to reconstruct data from the final model, so a technique like differential privacy, which allows data to be contributed with mathematically guaranteed privacy, might be required to ensure data is kept safe.

Ownership of the data used to train large AI models has become a big legal issue in recent years. Some publishers are suing large AI companies while others are cutting deals to grant access to their content. (WIRED parent company Condé Nast has a deal in place with OpenAI.) In June, Meta won a major copyright infringement case when a federal judge ruled that the company did not violate the law by training its open-source model on text from books by 13 authors.

Min says it may well be possible to build new kinds of open models using the FlexOlmo approach. 'I really think the data is the bottleneck in building the state of the art models,' she says. 'This could be a way to have better shared models where different data owners can codevelop, and they don't have to sacrifice their data privacy or control.'

Vercept Raises $16M With Google Legend And Dropbox Co-Founder Backing To Build Future Of Workflows With Vision-Based Mac App Called Vy

Yahoo

12-06-2025

  • Business
  • Yahoo


Vercept, a Seattle-based AI startup, has secured $16 million in seed funding to develop Vy, a computer vision-powered Mac application designed to automate digital workflows with a single natural language command, GeekWire reports.

The company was founded by a group of former leaders from the Allen Institute for AI, and according to GeekWire, its backers include some of the most prominent names in tech, such as former Google CEO Eric Schmidt, Google DeepMind chief scientist Jeff Dean, Dropbox (NASDAQ:DBX) co-founder Arash Ferdowsi, and Cruise founder Kyle Vogt. The round was led by San Francisco-based venture firm Fifty Years, with participation from Point Nine and the AI2 Incubator, which was Vercept's first institutional investor, GeekWire says.

Vy Uses Vision AI To Mimic Human Computer Interaction

Vercept's flagship product, Vy, uses artificial intelligence to 'see' and interpret screens the way a human does, allowing it to replicate complex workflows after observing them once, GeekWire reports. According to the company's website, Vercept was founded with the goal of radically rethinking how people interact with technology, aiming to replace the maze of menus and code-heavy workflows with a seamless, intuitive interface that feels like an extension of the user's mind. The company describes its mission as enabling users to do more with less effort, tackling tasks that were once considered too technical or time-consuming to attempt.

Users can perform any digital task, such as filling out forms, organizing invoices, or creating content, while Vy records the actions; it then automates those same tasks using natural language commands. Unlike traditional robotic process automation tools, GeekWire says that Vy does not require pre-built application programming interfaces, connectors, or hardcoded steps to engage with software. Vercept CEO Kiana Ehsani told GeekWire that the product is a 'unified paradigm for interacting with the computer.'

Ehsani previously led robotics and embodied AI projects at Ai2, while other Vercept co-founders include Oren Etzioni, the founding CEO of Ai2, and Matt Deitke, who worked on prominent AI projects like Molmo, ProcTHOR, and Objaverse, GeekWire reports. Luca Weihs, another co-founder, was a research manager and infrastructure lead at Ai2, focusing on AI agents and reinforcement learning. According to GeekWire, Ross Girshick, a pioneer in combining deep learning and computer vision, also joined the founding team after stints at Meta (NASDAQ:META) AI.

Vy has already found traction with a wide range of early adopters, from students using it to manage assignments to businesses automating administrative workflows. In one example, GeekWire reports that individuals with disabilities have integrated Vy with speech-to-text systems, allowing them to remotely operate computers and complete digital tasks independently. While user growth and revenue figures were not disclosed, Ehsani told GeekWire that the reception to Vy has exceeded expectations. Vercept currently employs eight full-time staff members.
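Neither article describes Vercept's internals, but the 'see the screen, then act' loop they describe can be sketched generically. In the sketch below, `plan_next_action` is a hypothetical stand-in for a proprietary vision model; the input synthesis uses the real `pyautogui` library. This is a generic pattern, not Vercept's code.

```python
# Generic sketch of vision-first desktop automation: capture the screen, let
# a vision model choose the next UI action, execute it with synthetic input.
# `plan_next_action` is a hypothetical placeholder, not a real API.
import pyautogui

def plan_next_action(screenshot, goal: str) -> dict:
    """Hypothetical vision-model call: map (screen image, goal) to one action,
    e.g. {"kind": "click", "x": 512, "y": 384} or {"kind": "done"}."""
    raise NotImplementedError("stand-in for a proprietary vision model")

def run_task(goal: str, max_steps: int = 20) -> None:
    for _ in range(max_steps):
        screen = pyautogui.screenshot()        # "see" the screen like a human
        action = plan_next_action(screen, goal)
        if action["kind"] == "done":
            return
        if action["kind"] == "click":          # no per-app API or connector
            pyautogui.click(action["x"], action["y"])
        elif action["kind"] == "type":
            pyautogui.write(action["text"], interval=0.02)

# run_task("Fill the invoice form with last month's totals")
```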
As major tech players like OpenAI, Google, and Amazon (NASDAQ:AMZN) explore generative AI tools for task automation, GeekWire says that Vercept is differentiating itself with a visual-first solution that requires no technical setup or custom coding.

Google Cloud and Ai2 Commit $20M to Advance AI-Powered Research for the Cancer AI Alliance

Associated Press

04-04-2025

  • Health
  • Associated Press


Google Cloud to provide advanced and secure technology, while Ai2 to lead AI training and development for AI cancer models

SEATTLE and SUNNYVALE, Calif., April 4, 2025 /PRNewswire/ -- Today, Google Cloud and Ai2 announced they have partnered with the Cancer AI Alliance (CAIA), a pioneering consortium uniting leading cancer research institutions and technology companies to harness artificial intelligence (AI) in the fight against cancer. Google Cloud and Ai2 are each giving $10 million to the initiative and providing access to technology solutions that will help speed scientific discovery. Google Cloud will power planet-scale AI infrastructure and data analytics tools, while Ai2 will provide critical expertise in training large-scale models focusing on cancer research.

'The Cancer AI Alliance represents a major advancement in harnessing AI to transform cancer discovery and research,' said Reymund Dumlao, Director, State and Local Government and Education, Google Public Sector. 'Google Cloud's planet-scale AI infrastructure and analytics combined with Ai2's mission to make open models more accessible will help accelerate research breakthroughs and drive improvements in patient outcomes.'

AI has the potential to radically advance cancer research, helping accelerate the discovery of cures or more effective treatments. However, this is only possible with access to critical data aligned across cancers, treatments, institutions, and medical professionals. Today, cancer AI models are often limited in both the breadth and depth of their data and are not easily transferable between institutions, limiting the potential for a single model to make a broader impact in the field of cancer research.

'The potential for AI to advance healthcare is immense, and CAIA represents a significant step toward applying AI to some of the most challenging problems in cancer and AI research,' said Ali Farhadi, CEO of Ai2. 'At Ai2, we are not only committed to building state-of-the-art AI models, but also to creating open, scalable systems that allow cancer centers to collaborate in a distributed and private way. For the first time, cancer centers are bringing their data together, and it's imperative we architect data preparation and model training in a way that protects patient privacy while demonstrating the advancements we make by sharing data effectively and securely. Our open, yet privacy-protected distributed approach allows researchers and clinicians to build on and learn from AI models without the need to directly share data.'
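The announcement does not name the technique behind this 'distributed and private' training. Federated averaging is one standard pattern for learning across institutions without pooling raw records; the NumPy sketch below illustrates that general idea and is not CAIA's actual system.

```python
# Generic federated-averaging sketch: each institution refines a shared model
# on its private data, and only weight updates (never patient records) are
# sent to the coordinator. Illustrative only, not CAIA's implementation.
import numpy as np

rng = np.random.default_rng(0)

def local_update(weights: np.ndarray, X: np.ndarray, y: np.ndarray,
                 lr: float = 0.1, epochs: int = 5) -> np.ndarray:
    """One institution's local training pass on data that never leaves it."""
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)   # least-squares gradient
        w -= lr * grad
    return w

# Three hospitals with private datasets that stay inside their walls.
hospitals = [(rng.normal(size=(100, 8)), rng.normal(size=100)) for _ in range(3)]

global_w = np.zeros(8)
for round_ in range(10):
    # Each site trains locally; the coordinator averages the returned weights.
    local_ws = [local_update(global_w, X, y) for X, y in hospitals]
    global_w = np.mean(local_ws, axis=0)

print("shared model weights:", np.round(global_w, 3))
```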
CAIA, spearheaded by Seattle's Fred Hutch Cancer Center in collaboration with top cancer research institutes and tech leaders, aims to create a novel AI infrastructure that is open, which is critical for the research, while keeping data private and secure. By bringing together leading cancer centers, CAIA will develop generalizable AI models that can in the future be shared across institutions, from large centers to smaller regional hospitals.

'The addition of Google Cloud and Ai2 to CAIA builds on our incredible momentum toward safely and swiftly unlocking the next generation of critical insights in cancer treatment and care,' said Jeff Leek, chief data officer at Fred Hutch Cancer Center and holder of the J. Orin Edson Foundation Endowed Chair. 'Their generous contribution of AI and computing expertise and resources, when combined with the technical and scientific prowess of our collective partners, will play a key role in creating the world's most advanced cancer AI laboratory and dramatically accelerate cancer research and improve patient outcomes.'

Leveraging Google Cloud infrastructure, Ai2 will play a leading role in AI training efforts across the Alliance, collaborating with individual cancer centers to develop and refine AI models tailored to their specific needs. These models will be designed to scale across institutions, ensuring robust data privacy and security by implementing advanced techniques to protect sensitive, institute-specific data while fostering collaboration and innovation. The Alliance is committed to developing AI models that can analyze vast amounts of anonymized patient data without compromising privacy, ensuring that AI-powered insights are both impactful and secure. With Ai2 leading AI model development, powered by Google Cloud's infrastructure and computing technology that enables collaboration on large data sets, CAIA is primed to drive rapid progress in AI-powered cancer research.

About Google Cloud

Google Cloud is the new way to the cloud, providing AI, infrastructure, developer, data, security, and collaboration tools built for today and tomorrow. Google Cloud offers a powerful, fully integrated and optimized AI stack with its own planet-scale infrastructure, custom-built chips, generative AI models and development platform, as well as AI-powered applications, to help organizations transform. Customers in more than 200 countries and territories turn to Google Cloud as their trusted technology partner.

About Ai2

Ai2 is a Seattle-based non-profit AI research institute with the mission of building breakthrough AI to solve the world's biggest problems. Founded in 2014 by the late Paul G. Allen, Ai2 develops foundational AI research and innovative new applications that deliver real-world impact through large-scale open models, open data, robotics, conservation platforms, and more. Ai2 champions true openness with ambitious projects like OLMo, the world's first truly open language model framework, empowering others to participate in the creation of state-of-the-art AI and to directly benefit from the many ways it can advance critical fields like medicine, scientific research, climate science, and conservation efforts. For more information, visit
