NVIDIA has released Maxine, a suite of GPU-accelerated software development kits (SDKs) for improving audio and video quality. Maxine also offers cloud-native microservices for augmented-reality (AR) effects and audio-video enhancement, aimed at clearer real-time communications.
With the early-access release of Maxine’s audio effects, the company said that Maxine is being re-architected as cloud-native microservices. NVIDIA also announced new SDK features, including Speaker Focus and Face Expression Estimation, and made Eye Contact available to all users. Existing SDK features have been updated as well.
Maxine provides three updated GPU-accelerated SDKs, for audio, video, and AR effects, that apply AI to real-time communications. A new feature called Speaker Focus separates the audio tracks of foreground and background speakers to make each voice more audible. The Audio Super Resolution feature has also been upgraded for better quality.
The video effects SDK produces AI-based video effects from a standard webcam feed. Its Virtual Background feature, which segments a person’s profile from the rest of the frame and then applies AI-powered background removal, replacement, or blur, now has improved temporal stability.
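To make the background replacement idea concrete, here is a minimal sketch of the compositing step that typically follows segmentation. It is not Maxine's API: the function name `composite`, the nested-list image layout, and the per-pixel mask values are all assumptions made for illustration, and a real pipeline would obtain the mask from an AI segmentation model rather than supply it by hand.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]      # one RGB pixel, channel values 0-255
Image = List[List[Pixel]]         # rows of pixels
Mask = List[List[float]]          # 1.0 = person, 0.0 = background


def composite(frame: Image, background: Image, mask: Mask) -> Image:
    """Blend the webcam frame over a new background using a person mask.

    Hypothetical helper for illustration only. Each output channel is
    mask * frame + (1 - mask) * background, so pixels the mask marks as
    "person" keep the webcam value and the rest take the new background.
    """
    result: Image = []
    for frame_row, bg_row, mask_row in zip(frame, background, mask):
        out_row = []
        for fg, bg, alpha in zip(frame_row, bg_row, mask_row):
            blended = tuple(
                round(alpha * f + (1.0 - alpha) * b) for f, b in zip(fg, bg)
            )
            out_row.append(blended)
        result.append(out_row)
    return result
```

Fractional mask values give soft edges around hair and shoulders. Background blur follows the same pattern, with a blurred copy of the frame used as the background.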
Additionally, the AR SDK provides AI-powered, real-time 3D face tracking and body pose estimation from a standard webcam feed.
Maxine’s cloud-native microservices let developers build real-time AI applications. Each service can be managed and deployed independently in the cloud, which shortens implementation time. Some of these microservices are:
- Background Noise Removal
- Room Echo Removal
- Audio Super Resolution
- Acoustic Echo Cancellation
Maxine is part of the NVIDIA Omniverse Avatar Cloud Engine, a set of cloud-based AI models and services that developers can use to build, customize, and deploy interactive avatars. Refer to the GTC keynote for more information.