NVIDIA says it will host its upcoming GTC conference virtually from September 19 to September 22, featuring a keynote address from CEO Jensen Huang and more than 200 tech sessions. Ahead of that much-anticipated event, however, the company is already making headlines with its latest announcements, especially at SIGGRAPH 2022, the largest gathering of computer graphics professionals in the world.
At SIGGRAPH 2022, NVIDIA described Universal Scene Description (USD), created by Pixar, a Disney company, as the metaverse’s equivalent of HTML in their opening presentation. NVIDIA also discussed the metaverse’s future, in which avatars would resemble browsers and provide a new level of human-machine interaction. Additionally, the company introduced the idea of neural graphics, which mainly relies on AI to produce more realistic metaverse graphics with much less effort.
NVIDIA believes that neural graphics will play an integral role in the metaverse by using AI to learn from data to create powerful graphics. Integrating AI improves outcomes, automates design decisions, and offers previously unimagined potential for makers and artists. Neural graphics will revolutionize virtual environment creation, simulation, and user experience.
NVIDIA acknowledges that a professional artist must balance photorealism and detail against time and budget constraints when creating 3D objects for video games, virtual worlds (including the metaverse), product design, or visual effects.
Based on these concerns, the company published new research and a comprehensive set of tools that use the power of neural graphics to create and animate 3D objects and environments in order to streamline and accelerate this process. In this article, we will discuss some of the groundbreaking product announcements from NVIDIA.
A recent research paper by Stanford University and NVIDIA researchers addresses one of the major problems with virtual reality (VR) experiences: bulky headsets. The researchers demonstrated how these headsets might be thinned down to the thickness of a pair of ordinary-looking spectacles. The design is based on the idea of pancake lenses, which NVIDIA’s work helped adapt for use with three-dimensional (3D) images. In a joint effort with the Stanford team, NVIDIA was also successful in reducing the distance between the lens and the display. The latter was accomplished using a phase-only spatial light modulator (SLM) that creates a small image behind the device and is illuminated by a coherent light source.
The resulting device, known as Holographic Glasses, is a holographic near-eye display that uses a pupil-replicating waveguide, a spatial light modulator, and a geometric phase lens to produce holographic images in a thin and light design. The proposed design uses a 2.5 mm thick optical stack to deliver full-color 3D holographic images. The researchers introduced a new pupil-high-order gradient descent algorithm to calculate the correct phase when the user’s pupil size varies. The wearable binocular prototype supports 3D focus cues and offers a diagonal field of view of 22.8° with a 2.3 mm static eye box and the option of a dynamic eye box with beam steering, all while weighing only 60 g without the driving board.
At SIGGRAPH 2022, NVIDIA launched NeuralVDB, which uses compact neural representations to significantly reduce memory footprint and enable higher-resolution 3D data.
NeuralVDB uses recent developments in machine learning to improve on VDB, an established industry standard for efficient storage of sparse volumetric data. This hybrid data structure drastically reduces the memory footprint of VDB volumes while preserving flexibility and incurring only small, user-controlled compression errors.
According to NVIDIA, NeuralVDB will introduce AI capability to OpenVDB, the industry-standard framework for modeling and displaying sparse volumetric data such as water, fire, smoke, and clouds. Building on this foundation, NVIDIA released GPU-accelerated processing with NanoVDB last year for much-increased performance. With the addition of machine learning, NeuralVDB expands on NanoVDB’s GPU acceleration by introducing compact neural representations that significantly lower its memory footprint. This makes it possible to express 3D data at a considerably greater resolution and scale than OpenVDB. Users are now able to manage large volumetric information with ease on devices like laptops and individual workstations.
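To make the idea of sparse volumetric storage concrete, here is a minimal sketch in Python. It is not how OpenVDB, NanoVDB, or NeuralVDB are implemented (those use hierarchical tree structures and, in NeuralVDB's case, neural networks); it is a toy hash-map grid, and the names `SparseGrid` and `build_sphere_shell` are hypothetical. It only illustrates why storing active voxels alone can use far less memory than a dense array when most of the volume is empty.

```python
import math
from typing import Dict, Tuple

Coord = Tuple[int, int, int]


class SparseGrid:
    """Toy sparse voxel grid (hypothetical, not the VDB tree):
    only voxels that differ from the background value are stored."""

    def __init__(self, background: float = 0.0) -> None:
        self.background = background
        self.voxels: Dict[Coord, float] = {}

    def set(self, ijk: Coord, value: float) -> None:
        # Writing the background value deactivates the voxel.
        if value == self.background:
            self.voxels.pop(ijk, None)
        else:
            self.voxels[ijk] = value

    def get(self, ijk: Coord) -> float:
        # Unstored voxels implicitly hold the background value.
        return self.voxels.get(ijk, self.background)

    def active_count(self) -> int:
        return len(self.voxels)


def dense_voxel_count(resolution: int) -> int:
    """Number of values a dense cubic grid would have to store."""
    return resolution ** 3


def build_sphere_shell(resolution: int, radius: float,
                       thickness: float = 1.0) -> SparseGrid:
    """Fill a thin spherical shell with density 1.0; everything else stays empty."""
    grid = SparseGrid()
    center = (resolution - 1) / 2.0
    for i in range(resolution):
        for j in range(resolution):
            for k in range(resolution):
                dist = math.sqrt((i - center) ** 2 +
                                 (j - center) ** 2 +
                                 (k - center) ** 2)
                if abs(dist - radius) <= thickness:
                    grid.set((i, j, k), 1.0)
    return grid


if __name__ == "__main__":
    shell = build_sphere_shell(resolution=32, radius=10.0)
    print("active voxels:", shell.active_count())
    print("dense voxels: ", dense_voxel_count(32))
```

In this toy, a thin surface such as a smoke boundary occupies only a small fraction of the voxels a dense grid would allocate.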
In a nutshell, NeuralVDB inherits all of its predecessors’ features and optimizations while introducing a structure of compact neural representations that decreases memory footprints by up to 100x.
In addition, NeuralVDB lets the trained weights from one frame initialize training for the next, accelerating training by up to 2x. By reusing the network results from the preceding frame, NeuralVDB also allows users to achieve temporal coherency, or smooth encoding.
For experts working in fields like scientific computing and visualization, medical imaging, rocket science, and visual effects, the launch of NeuralVDB at SIGGRAPH is a game-changer. By significantly lowering memory requirements, speeding up training, and enabling temporal coherency, NeuralVDB can open up new possibilities for scientific and industrial use cases, including large-scale digital twin simulations and massive, complex volume datasets for AI-enabled medical imaging.
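The benefit of reusing one frame's weights for the next can be illustrated with a small warm-start experiment. The sketch below is an assumption-laden toy, not NeuralVDB's training code: it fits a two-parameter linear model (standing in for a neural network) with plain gradient descent, and the function `fit`, the sample data, and the learning rate are all hypothetical. Because consecutive frames differ only slightly, starting from the previous frame's solution needs fewer steps than starting from scratch.

```python
from typing import List, Tuple

Sample = Tuple[float, float]


def fit(samples: List[Sample], w0: float, b0: float,
        lr: float = 0.1, tol: float = 1e-6,
        max_steps: int = 10000) -> Tuple[float, float, int]:
    """Gradient descent on mean squared error for y = w * x + b.

    Returns the fitted (w, b) and the number of update steps taken
    before the squared gradient norm dropped below `tol`.
    """
    w, b = w0, b0
    n = len(samples)
    for step in range(max_steps):
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in samples) / n
        if grad_w * grad_w + grad_b * grad_b < tol:
            return w, b, step
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, max_steps


# Two consecutive "frames" whose underlying signal changes only slightly.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
frame0 = [(x, 2.0 * x + 1.0) for x in xs]
frame1 = [(x, 2.1 * x + 1.05) for x in xs]

# Train frame 0 from scratch, then train frame 1 both ways.
w_prev, b_prev, _ = fit(frame0, 0.0, 0.0)
_, _, cold_steps = fit(frame1, 0.0, 0.0)               # cold start
w_warm, b_warm, warm_steps = fit(frame1, w_prev, b_prev)  # warm start

if __name__ == "__main__":
    print("cold-start steps:", cold_steps)
    print("warm-start steps:", warm_steps)
```

The warm-started fit also begins close to the previous frame's parameters, which is one intuition for why reusing weights tends to give smoother results over time.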
NVIDIA Kaolin Wisp is a PyTorch library that uses NVIDIA Kaolin Core to interact with neural fields (including NeRFs, NGLOD, instant-ngp and VQAD). By cutting the time needed to test and deploy new approaches from weeks to days, it enables faster 3D deep learning research.
According to NVIDIA, Kaolin Wisp aims to provide a set of common utility functions for neural field research. These include utility functions for rays, mesh processing, datasets, and image I/O. Wisp also includes building blocks for creating sophisticated neural fields, such as differentiable renderers and differentiable data structures (such as octrees, hash grids, and triplanar features). Additionally, it offers interactive rendering and training, logging, trainer modules, and debugging visualization tools.
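As an illustration of one of these data structures, the sketch below shows the kind of spatial hash that a hash grid can use to map an integer voxel coordinate to a slot in a fixed-size feature table. It does not use the Kaolin Wisp API; the function name `hash_grid_index` is hypothetical, and the XOR-of-primes scheme and the constants follow the spatial hash described in the instant-ngp paper, emulating 32-bit arithmetic in plain Python.

```python
from typing import Tuple

# Per-axis multipliers from the instant-ngp spatial hash
# (the first axis is left unscaled).
PRIMES = (1, 2654435761, 805459861)


def hash_grid_index(ijk: Tuple[int, int, int], table_size: int) -> int:
    """Map an integer grid coordinate to a slot in a feature table.

    Each coordinate is multiplied by its prime, truncated to 32 bits,
    XOR-ed together, and reduced modulo the table size. Distinct
    coordinates may collide; a trained model is expected to tolerate that.
    """
    if table_size <= 0:
        raise ValueError("table_size must be positive")
    h = 0
    for coord, prime in zip(ijk, PRIMES):
        h ^= (coord * prime) & 0xFFFFFFFF  # emulate uint32 wrap-around
    return h % table_size


if __name__ == "__main__":
    table_size = 2 ** 19  # assumed table size for illustration
    for corner in [(0, 0, 0), (1, 0, 0), (12, 7, 3)]:
        print(corner, "->", hash_grid_index(corner, table_size))
```

A fixed-size table keeps memory bounded regardless of grid resolution, at the cost of occasional collisions between distant voxels.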
Omniverse Avatar Cloud Engine
NVIDIA founder and CEO Jensen Huang stated at SIGGRAPH 2022 that the metaverse, the next chapter of the internet, will be propelled by the fusion of artificial intelligence and computer graphics. The metaverse will also be populated by one of the most popular forms of robots: digital human avatars. These avatars will work in virtual workplaces, participate in online games, and offer customer support to online retailers.
Such digital avatars need to be developed with millisecond precision in natural language processing, computer vision, complicated face and body motions, and other areas. With the Omniverse Avatar Cloud Engine, NVIDIA hopes to streamline and expedite this process.
The Omniverse Avatar Cloud Engine (ACE) is a brand-new set of cloud APIs, microservices, and tools for building, personalizing, and delivering apps for digital human avatars. Because ACE is based on NVIDIA’s Unified Compute Framework, developers can easily integrate key NVIDIA AI capabilities into their avatar applications. Additionally, Omniverse ACE enables you to create and flexibly deploy your avatar to meet your demands, regardless of whether you have real-time or offline requirements.
Developers can use Avatar Cloud Engine to bring their avatars to life by leveraging powerful software tools and APIs such as NVIDIA Riva for building speech AI applications, NVIDIA Merlin for high-performance recommender systems, NVIDIA Metropolis for computer vision and advanced video analytics, and NVIDIA NeMo Megatron for natural language understanding. They can also use the Omniverse ACE platform to create intelligent AI-service agents with Project Tokkio, an application created with Omniverse ACE that brings AI-powered customer service to retail establishments, quick-service restaurants, and the web.
To create 8K, 360-degree environments that can be imported into Omniverse scenes, NVIDIA has introduced GauGAN360, a new experimental online art tool. The software, which uses the same technology as NVIDIA’s original GauGAN AI painting software, enables users to paint an overall landscape, from which GauGAN360 produces a cube map or an equirectangular image that corresponds to the sketch.
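For readers unfamiliar with the two output formats, the following sketch shows how a 3D viewing direction maps to a pixel in an equirectangular image and to a face of a cube map. It is not GauGAN360 code: the axis convention (+y up, longitude measured from +z), the function names, and the 8192 x 4096 resolution used as an example of an "8K" panorama are all assumptions for illustration.

```python
import math
from typing import Tuple


def direction_to_equirect(x: float, y: float, z: float,
                          width: int, height: int) -> Tuple[float, float]:
    """Map a 3D direction to (u, v) pixel coordinates in an equirectangular image.

    Assumed convention: +y is up, longitude 0 points along +z,
    u grows with longitude and v grows downward from the north pole.
    """
    length = math.sqrt(x * x + y * y + z * z)
    if length == 0.0:
        raise ValueError("direction must be non-zero")
    x, y, z = x / length, y / length, z / length
    lon = math.atan2(x, z)                    # range [-pi, pi]
    lat = math.asin(max(-1.0, min(1.0, y)))   # range [-pi/2, pi/2]
    u = (lon / (2.0 * math.pi) + 0.5) * width
    v = (0.5 - lat / math.pi) * height
    return u, v


def cube_face(x: float, y: float, z: float) -> str:
    """Return the cube-map face a direction falls on (largest absolute axis)."""
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:
        return "+x" if x > 0 else "-x"
    if ay >= az:
        return "+y" if y > 0 else "-y"
    return "+z" if z > 0 else "-z"


if __name__ == "__main__":
    width, height = 8192, 4096  # assumed 8K equirectangular resolution
    for d in [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]:
        print(d, direction_to_equirect(*d, width, height), cube_face(*d))
```

Both formats describe the same sphere of directions; an equirectangular image stores it as one longitude-latitude rectangle, while a cube map splits it into six square faces.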
Audio2Face is an AI application that creates expressive facial animation from a single audio source. It can be used to create standard facial animations as well as interactive real-time applications, and it can retarget to any 3D human or human-esque face, whether realistic or stylized.
According to NVIDIA, getting started is easy since Audio2Face comes prepackaged with “Digital Mark,” a 3D character model that can be animated with your audio file. All you have to do is choose your audio and upload it. The output of the pre-trained deep neural network then drives the 3D vertices of your character’s mesh to control the facial movement in real time. Additionally, you can adjust several post-processing parameters to alter how your character behaves.
NVIDIA also released TAO Toolkit at SIGGRAPH, a framework that enables developers to build an accurate, high-performance pose estimation model. This model can assess what a person could be doing in a picture using computer vision considerably more quickly than existing approaches. By abstracting away the complexity of the AI/deep learning framework, this low-code variant of the NVIDIA TAO framework speeds up the model training process with Jupyter notebooks. You can use the TAO Toolkit to optimize inference and fine-tune NVIDIA pre-trained models with your own data without having any prior AI knowledge or access to a large training dataset.
Developers can use TAO Toolkit to deploy optimized models using NVIDIA DeepStream for vision AI, Riva for speech AI, and Triton Inference Server. They can also deploy it in a modern cloud-native infrastructure on Kubernetes and integrate it in their application of choice with REST APIs.
Instant Neural Graphics Primitives: NVIDIA Instant Neural Graphics Primitives is a revolutionary technique for capturing the shape of real-world objects. It serves as the basis for NVIDIA Instant NeRF, an inverse rendering model that converts a collection of still photos into a digital 3D scene. For its significance to the future of computer graphics research, the research behind Instant NeRF was honored at SIGGRAPH with a best paper award.
Other Key Announcements
NVIDIA Jetson AGX Orin
On August 3rd, 2022, NVIDIA announced the NVIDIA Jetson AGX Orin 32GB production modules. The Jetson AGX Orin 32GB combines an 8-core Arm-based CPU with an Ampere-based GPU to allow AI acceleration in embedded systems, edge AI deployments, and robotics. The device, which has 64GB of flash storage and 32GB of RAM, is just marginally bigger than a Raspberry Pi.
The Jetson AGX Orin 32GB unit, which was unveiled earlier this year, can perform 200 trillion operations per second (TOPS). Its production module will have a 1792-core GPU with 56 Tensor Cores and will include an array of connectivity options such as 10Gb Ethernet, 8K display, and PCIe 4.0 lanes. According to NVIDIA, the 64GB version will be available in November, and a pair of comparatively powerful Jetson Orin NX production modules are coming later this year.
The Jetson AGX Orin developer kit enables several concurrent AI application pipelines with its NVIDIA Ampere architecture GPU, next-generation deep learning and vision accelerators, high-speed I/O, and quick memory bandwidth. Customers can create solutions employing the biggest and most intricate AI models with Jetson AGX Orin to address issues like natural language comprehension, 3D vision, and multi-sensor fusion.
In addition to JetPack SDK, which incorporates the NVIDIA CUDA-X accelerated stack, Jetson Orin supports a variety of NVIDIA platforms and frameworks, including Riva for natural language understanding, Isaac for robotics, TAO Toolkit to speed up model development with pre-trained models, DeepStream for computer vision, and Metropolis, an application framework, collection of developer tools, and partner ecosystem that combines visual data and AI to increase operational efficiency and safe operations.
By emulating any Jetson Orin-based production module on the Jetson AGX Orin developer kit first, customers can bring their next-generation AI and robotics solutions to market considerably more quickly.
Antonio Serrano-Muñoz, a Ph.D. student in applied robotics at Mondragon University in northern Spain, developed an Omniverse Extension to use the Robot Operating System application with NVIDIA Isaac Sim. Omniverse Extensions are fundamental building blocks that anybody may use to develop and expand the functionality of Omniverse Apps to suit their unique workflow requirements with just a few lines of Python code.
One of the six open-source, GitHub-accessible Omniverse Extensions developed by Serrano-Muñoz expands the capabilities of NVIDIA Isaac Sim, an application framework powered by Omniverse that allows users to build photorealistic, physically correct virtual worlds to design, test, and train AI robots.
Antonio also built a digital twin of the robotics prototype lab at Mondragon University and robot simulations for industrial use cases using NVIDIA Omniverse.