Monday, November 18, 2024

Intel Labs Generative AI Model LDM3D Generates 360-Degree Images from Text Prompts

LDM3D uses the diffusion process to generate a depth map alongside each image, producing vivid, immersive 3D images with 360-degree views.

Latent Diffusion Model for 3D (LDM3D) is a diffusion model that uses generative AI to produce realistic 3D visual content. It was developed by Intel Labs in partnership with Blockade Labs. LDM3D is billed as the first model in the industry to generate a depth map with the diffusion process, so a single text prompt yields both an image and its matching depth map.

This research could change how users interact with digital content by letting them experience their text prompts in ways that were not possible before. Using the images and depth maps produced by LDM3D, users can turn a text description of a calm tropical beach, a contemporary skyscraper, or a sci-fi universe into a detailed 360-degree panorama.

LDM3D was trained on a dataset built from a subset of 10,000 samples from the LAION-400M database, which comprises more than 400 million image-caption pairs. The researchers annotated the training corpus with depth maps using the Dense Prediction Transformer (DPT) large depth estimation model, which was previously developed at Intel Labs.


The DPT-large model provides highly accurate relative depth for every pixel in an image. The LAION-400M dataset was created for research purposes, to enable model training at larger scale for the broader research community. The LDM3D model was trained on an Intel AI supercomputer powered by Intel Xeon processors and Intel Habana Gaudi AI accelerators. The final model and pipeline combine the generated RGB image and depth map to create 360-degree views for immersive experiences.
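The article does not describe how the RGB image and depth map are combined, so the following is only a minimal sketch of one common approach, not the LDM3D pipeline itself. It assumes the panorama is stored in an equirectangular layout and places each pixel in 3D along its viewing direction at the distance given by the depth map. The function names `equirect_pixel_to_point` and `panorama_to_point_cloud` are hypothetical, and because the depth is relative, the resulting scale is arbitrary.

```python
import math


def equirect_pixel_to_point(u, v, width, height, depth):
    """Map pixel (u, v) of an equirectangular panorama to a 3D point.

    Assumption: columns span longitude [-pi, pi) and rows span latitude
    [pi/2, -pi/2]; the viewer sits at the origin looking along +z.
    """
    # Use pixel centres so the poles and the seam are not sampled exactly.
    lon = ((u + 0.5) / width) * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - ((v + 0.5) / height) * math.pi

    # Unit viewing direction scaled by the (relative) depth value.
    x = depth * math.cos(lat) * math.sin(lon)
    y = depth * math.sin(lat)
    z = depth * math.cos(lat) * math.cos(lon)
    return (x, y, z)


def panorama_to_point_cloud(rgb, depth):
    """Pair every pixel's 3D position with its colour.

    `rgb` and `depth` are nested lists with the same height and width.
    """
    height = len(depth)
    width = len(depth[0])
    points = []
    for v in range(height):
        for u in range(width):
            xyz = equirect_pixel_to_point(u, v, width, height, depth[v][u])
            points.append((xyz, rgb[v][u]))
    return points
```

A renderer could then draw these coloured points, or a mesh built from them, around the viewer. Each point lies at a distance from the origin equal to its depth value, which is what gives the panorama a sense of parallax when the viewpoint moves slightly.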

To demonstrate the potential of LDM3D, Intel and Blockade researchers created DepthFusion, an application that uses standard 2D RGB images and depth maps to produce immersive, interactive 360-degree view experiences. DepthFusion uses TouchDesigner, a node-based visual programming language for real-time interactive multimedia content, to turn text prompts into engaging digital experiences.


Sahil Pawar