Apple previews AI model that builds 3D scenes using images: How it works

Business Standard

14-05-2025


Apple has published a research paper describing an artificial intelligence (AI) model called Matrix3D. Developed in collaboration with researchers from Nanjing University and The Hong Kong University of Science and Technology, Matrix3D reconstructs detailed 3D scenes and objects from only a few 2D images. This marks a significant shift in the approach to photogrammetry, an established technique for reconstructing 3D structures from photos.

What is photogrammetry

In its research paper, Apple describes photogrammetry as the process of using 2D photographs to measure and recreate 3D structures or environments. Traditionally, this has required hundreds of images taken from various angles and a multi-step pipeline that uses different algorithms for tasks such as camera pose estimation (working out where each camera was when the photo was taken), depth prediction, and 3D model construction.

How Matrix3D streamlines the photogrammetry process

Apple's Matrix3D addresses two major challenges in traditional photogrammetry: the need for a large number of images from multiple angles, and the use of separate models for each stage of reconstruction. It tackles both by unifying the entire process in a single model that can estimate camera positions, generate depth maps, and synthesize novel views, all from just a few input images.

How Matrix3D works

At the heart of Matrix3D is a generative AI system based on diffusion transformers, which combine the diffusion approach behind image generators such as OpenAI's DALL-E with the transformer architecture that underlies ChatGPT. During training, the model uses a technique called masked learning, in which parts of the input are deliberately hidden so that the model learns to predict the missing data. This helps Matrix3D handle sparse or incomplete input, and it significantly expands the range of usable training samples, because data that lacks some annotations can still be used. As a result, Matrix3D can reconstruct detailed 3D objects or entire scenes from just two or three images.
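To make the idea of masked learning concrete, here is a minimal sketch in Python of how one training example could be split into visible inputs and hidden targets. It is an illustration only, not Apple's code: the function name make_masked_example, the mask_ratio parameter, and the (view, modality) keys are all hypothetical, and the real model operates on image, pose, and depth tensors rather than placeholder strings.

```python
import random


def make_masked_example(sample, mask_ratio=0.5, rng=None):
    """Split one multi-view sample into visible inputs and hidden targets.

    `sample` maps (view_index, modality) -> data. This layout is an
    assumption made for illustration. Entries that are absent from the
    sample (for example, a view with no depth map) are never selected,
    which is why partially annotated data remains usable.
    """
    if len(sample) < 2:
        raise ValueError("need at least two entries: one visible, one hidden")
    rng = rng or random.Random()
    keys = sorted(sample)          # fixed order first, so a seeded rng is repeatable
    rng.shuffle(keys)
    # Hide roughly mask_ratio of the entries, but always keep at least
    # one entry hidden and at least one entry visible.
    n_hidden = max(1, min(len(keys) - 1, round(len(keys) * mask_ratio)))
    targets = {k: sample[k] for k in keys[:n_hidden]}   # the model must predict these
    visible = {k: sample[k] for k in keys[n_hidden:]}   # the model is given these
    return visible, targets


# Two views; the second has no depth map, yet the sample is still usable.
sample = {
    (0, "image"): "img0", (0, "pose"): "pose0", (0, "depth"): "depth0",
    (1, "image"): "img1", (1, "pose"): "pose1",
}
visible, targets = make_masked_example(sample, mask_ratio=0.4, rng=random.Random(0))
```

In a sketch like this, the same masking step describes different tasks at inference time: hiding the camera poses corresponds to pose estimation, hiding the depth maps to depth prediction, and hiding the image of a new viewpoint to novel view synthesis.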
Availability and use case

The researchers have published their work on arXiv and released the source code on GitHub. A companion website also features demo videos and interactive 3D reconstructions. While there's no official word yet, Matrix3D could eventually be integrated into Apple's Vision Pro headset, allowing users to transform regular 2D photos into immersive 3D experiences.
