Intel Labs Generative AI Model LDM3D Generates 360-Degree Images from Text Prompts

LDM3D uses the diffusion process to generate a depth map, producing vivid, immersive 3D images with 360-degree views.

By Sahil Pawar

June 22, 2023

Latent Diffusion Model for 3D (LDM3D) is a diffusion model that uses generative AI to produce realistic 3D visual content. It was developed by Intel Labs in partnership with Blockade Labs. Described as the first model of its kind on the market, LDM3D uses the diffusion process to generate a depth map, which yields vivid, immersive 3D images with 360-degree views.

This research could change how people interact with digital content by letting them experience their text prompts in ways that were not possible before. Using the images and depth maps that LDM3D generates, users can turn a text description of a calm tropical beach, a contemporary skyscraper, or a sci-fi universe into a detailed 360-degree panorama.

LDM3D was trained on a dataset built from a subset of 10,000 samples from the LAION-400M database, which comprises more than 400 million image-caption pairs. The researchers annotated the training corpus using the Dense Prediction Transformer (DPT) large depth estimation model, which was previously developed at Intel Labs.
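The annotation step described above can be pictured as pairing each image-caption sample with a depth map. The sketch below is illustrative only and is not the researchers' code: the names `Sample`, `AnnotatedSample`, `annotate`, and `toy_depth_estimator` are hypothetical, the estimator is a stand-in for a real depth model, and the min-max normalization is an assumption about how a depth map might be stored.

```python
from typing import Callable, List, NamedTuple

# A tiny stand-in for an image: rows of scalar pixel values.
Grid = List[List[float]]


class Sample(NamedTuple):
    image: Grid
    caption: str


class AnnotatedSample(NamedTuple):
    image: Grid
    caption: str
    depth: Grid  # depth values rescaled to [0, 1]


def normalize_depth(depth: Grid) -> Grid:
    """Min-max rescale a depth map to [0, 1] (an assumed storage format)."""
    flat = [v for row in depth for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A constant map carries no depth ordering; return all zeros.
        return [[0.0 for _ in row] for row in depth]
    return [[(v - lo) / span for v in row] for row in depth]


def annotate(samples: List[Sample],
             estimate_depth: Callable[[Grid], Grid]) -> List[AnnotatedSample]:
    """Attach a normalized depth map to every image-caption pair."""
    return [
        AnnotatedSample(s.image, s.caption, normalize_depth(estimate_depth(s.image)))
        for s in samples
    ]


def toy_depth_estimator(image: Grid) -> Grid:
    """Hypothetical placeholder: a real pipeline would call a depth model here."""
    return [[float(v) for v in row] for row in image]
```

In practice, `toy_depth_estimator` would be replaced by a call to an actual depth estimation model, while the surrounding bookkeeping would stay much the same.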

For every pixel in an image, the DPT-large model provides highly accurate relative depth. The LAION-400M dataset was created for research purposes, to enable model training at a larger scale for the benefit of the wider research community. The LDM3D model was trained on an Intel AI supercomputer powered by Intel Xeon processors and Intel Habana Gaudi AI accelerators. The final model and pipeline combine the generated RGB image and depth map to create 360-degree views for immersive experiences.
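One way to see how an RGB image and a depth map can together yield a 360-degree view is to treat the image as an equirectangular panorama and push each pixel out along its viewing direction by its depth value. The following is a minimal sketch under those assumptions, not the actual LDM3D or DepthFusion pipeline, which the article does not detail; the function names, the axis convention, and the treatment of depth as distance along the ray are all hypothetical choices.

```python
import math
from typing import List, Tuple

Color = Tuple[int, int, int]
Point = Tuple[float, float, float]


def pixel_to_direction(col: int, row: int, width: int, height: int) -> Point:
    """Map an equirectangular pixel to a unit viewing direction.

    Assumed convention: +z is straight ahead, +y is up, +x is to the right.
    """
    # Longitude spans [-pi, pi) across the width; latitude spans
    # [pi/2, -pi/2] from the top row to the bottom row.
    lon = ((col + 0.5) / width) * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - ((row + 0.5) / height) * math.pi
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)


def rgbd_panorama_to_points(rgb: List[List[Color]],
                            depth: List[List[float]]) -> List[Tuple[Point, Color]]:
    """Turn a panorama and its depth map into a colored 3D point cloud.

    Depth is treated as distance along the viewing ray, in arbitrary units.
    """
    height = len(depth)
    width = len(depth[0])
    points = []
    for row in range(height):
        for col in range(width):
            dx, dy, dz = pixel_to_direction(col, row, width, height)
            d = depth[row][col]
            points.append(((dx * d, dy * d, dz * d), rgb[row][col]))
    return points
```

A renderer could then draw these points, or a mesh built from them, around the viewer so that looking in any direction shows the scene with some sense of parallax.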

To show the potential of LDM3D, Intel and Blockade researchers created DepthFusion, an application that uses standard 2D RGB photos and depth maps to create realistic, interactive 360-degree view experiences. DepthFusion uses TouchDesigner, a node-based visual programming language for real-time interactive multimedia content, to turn text prompts into engaging digital experiences.
