NVIDIA has unveiled the NVIDIA DGX GH200 supercomputer, powered by NVIDIA GH200 Grace Hopper Superchips and the NVIDIA NVLink Switch System. It is designed to support the creation of enormous, next-generation models for generative AI language applications, recommender systems, and data analytics workloads.
The NVIDIA DGX GH200 uses NVLink interconnect technology and the NVLink Switch System to combine 256 GH200 Superchips so they operate as a single GPU with one enormous shared memory space. The system delivers 1 exaflop of performance and 144 terabytes of shared memory, roughly 500 times more memory than the previous-generation NVIDIA DGX released in 2020.
By integrating an Arm-based NVIDIA Grace CPU and an NVIDIA H100 Tensor Core GPU in the same package and connecting them with the NVIDIA NVLink-C2C chip interconnect, GH200 Superchips eliminate the need for a conventional CPU-to-GPU PCIe connection. Each Superchip provides a 600GB Hopper-architecture GPU building block for DGX GH200 supercomputers, reduces interconnect power consumption by more than 5x, and increases bandwidth between GPU and CPU by 7x compared with the latest PCIe technology.
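The headline memory figures above can be sanity-checked with simple arithmetic. The sketch below assumes a per-Superchip memory of 576 GB (the detailed breakdown behind the "600GB building block" figure) and a 320 GB GPU-memory baseline for the 2020-generation DGX; neither assumption is stated in the article.

```python
# Back-of-envelope check of the DGX GH200 memory figures quoted above.
# Assumptions (not stated in the article):
#   - each GH200 Superchip contributes ~576 GB to the shared pool
#     (the "600GB building block" rounded figure)
#   - the 2020-generation DGX baseline is 320 GB of GPU memory

GH200_NODES = 256          # Superchips unified into one GPU by NVLink
MEM_PER_NODE_GB = 576      # assumed per-Superchip memory contribution
BASELINE_DGX_GB = 320      # assumed 2020 DGX GPU-memory baseline

total_tb = GH200_NODES * MEM_PER_NODE_GB / 1024        # GB -> TB (binary)
ratio = GH200_NODES * MEM_PER_NODE_GB / BASELINE_DGX_GB

print(f"{total_tb:.0f} TB shared memory")   # -> 144 TB
print(f"~{ratio:.0f}x the 2020 baseline")   # -> ~461x, i.e. "roughly 500x"
```

Under these assumptions the totals line up: 256 × 576 GB is exactly 144 TB, and the ratio to the assumed baseline lands in the "roughly 500x" range the announcement cites.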
By offering 48 times more NVLink bandwidth than the previous generation, the DGX GH200 architecture combines the simplicity of programming a single GPU with the performance of a large AI supercomputer. Google Cloud, Meta, and Microsoft are expected to be among the first to gain access to the DGX GH200 and explore its capabilities for generative AI workloads.
NVIDIA also plans to provide the DGX GH200 design as a blueprint to cloud service providers and other hyperscalers, allowing them to further customize the architecture for their own infrastructure. Supercomputers powered by the NVIDIA DGX GH200 are expected to become available by the end of the year.