Apple’s AI team has introduced a new AI system, GAUDI. The system generates 3D indoor scenes from text prompts, making it a specialist for 3D interiors. Apple demonstrated that GAUDI can generate new, random camera movements through 3D scenes. Generation can also start from an image or from a text prompt such as “go down the stair.”
Neural rendering combines artificial intelligence with computer graphics. Neural Radiance Fields (NeRFs) serve as a neural storage medium for 3D scenes and models, which can then be rendered from different camera angles.
Besides Apple, companies like Nvidia and Google are also experimenting with NeRFs to provide virtual reality experiences. With projects such as Nvidia’s 3D object creation from photos and Google’s immersive indoor views built on NeRFs, developers are exploring how NeRFs’ photorealism can be used for generative AI.
Generative AI systems have shown the potential of controllable generation, but so far mostly for two-dimensional images. In 3D, the limitation stems from the restricted set of possible camera positions: within a 3D scene, the camera is constrained by obstacles such as walls and objects.
Apple’s GAUDI model addresses this problem with a specialized network made up of three parts: a camera pose decoder (which predicts possible camera positions), a scene decoder (which predicts a tri-plane representation of the scene), and a radiance field decoder (which renders the resulting image using the volumetric rendering equation).
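To make the last step more concrete, here is a minimal sketch of how a radiance field is typically turned into a pixel color. It uses the standard NeRF quadrature of the volumetric rendering equation, not GAUDI's actual code, which this article does not describe. The function name `render_ray` and its parameters are hypothetical, and the densities and colors are assumed to have already been predicted at sample points along one camera ray.

```python
import math


def render_ray(densities, colors, deltas):
    """Composite samples along one camera ray (illustrative sketch only).

    densities: volume density at each sample point (assumed model output)
    colors:    (r, g, b) color at each sample point (assumed model output)
    deltas:    distance between consecutive sample points
    """
    transmittance = 1.0           # fraction of light not yet absorbed
    pixel = [0.0, 0.0, 0.0]
    for sigma, rgb, delta in zip(densities, colors, deltas):
        # Opacity contributed by this segment of the ray.
        alpha = 1.0 - math.exp(-sigma * delta)
        # Weight = light that reaches the sample times its opacity.
        weight = transmittance * alpha
        for k in range(3):
            pixel[k] += weight * rgb[k]
        # Attenuate the light that continues past this sample.
        transmittance *= 1.0 - alpha
    return pixel, transmittance


# Usage: an empty sample in front of a dense green sample.
pixel, remaining = render_ray(
    densities=[0.0, 5.0],
    colors=[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    deltas=[0.5, 0.5],
)
```

In this sketch, samples in empty space contribute nothing, while a dense sample close to the camera hides everything behind it. That is how a radiance field can represent occluding walls and objects.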
GAUDI’s video generation quality still has room to improve, as its output contains many artifacts. Apple continues to work on the system, laying further groundwork for generative AI that can render 3D objects and scenes.