NVIDIA has announced the launch of TensorRT 8, the newest version of its software for faster artificial intelligence inference.
The new platform will allow enterprises to build artificial intelligence-powered interactive language applications, such as search engines and chatbots, in the cloud. The eighth generation of NVIDIA's artificial intelligence inference software, TensorRT 8, will be available to all NVIDIA developers at no extra cost. The updated software can cut inference latency for language models such as BERT-Large to 1.2 milliseconds and delivers twice the accuracy for INT8 precision with Quantization Aware Training.
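To make the INT8 and Quantization Aware Training claims concrete, here is a minimal sketch of how a quantization-aware-trained model might be built into an INT8 engine with the TensorRT Python API. It is illustrative only and not code from NVIDIA's announcement; the ONNX file name is a placeholder for a model exported after QAT (i.e., one that already carries quantize/dequantize nodes).

```python
import tensorrt as trt

# Illustrative sketch: "qat_model.onnx" is a placeholder for a model
# exported after Quantization Aware Training (it contains Q/DQ nodes).
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)

parser = trt.OnnxParser(network, logger)
with open("qat_model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.INT8)  # run quantized layers in INT8 precision

# Build and save the serialized engine for deployment.
engine_bytes = builder.build_serialized_network(network, config)
with open("qat_model.engine", "wb") as f:
    f.write(engine_bytes)
```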
NVIDIA’s vice president of developer programs, Greg Estes, said, “Artificial intelligence models are growing exponentially more complex, and worldwide demand is surging for real-time applications that use AI. That makes it imperative for enterprises to deploy state-of-the-art inferencing solutions.”
He further added that the latest version of the software would enable enterprises to develop highly responsive conversational artificial intelligence programs for their customers. NVIDIA's software is used by enterprises across industries such as manufacturing, automotive, healthcare, finance, and communications.
NVIDIA has optimized transformer inference to enable TensorRT 8 to attain best-in-class speed. The company has also introduced Sparsity, a feature of NVIDIA Ampere architecture graphics processing units that reduces computational operations to increase the speed of neural networks.
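As a rough sketch of how this feature is exposed, the snippet below requests sparse kernels through the TensorRT 8 builder configuration. Whether it speeds up a given network depends on the model's weights and the GPU; this is an assumption-laden illustration, not code from NVIDIA's announcement.

```python
import tensorrt as trt

# Minimal sketch: ask the builder to use sparse kernels where weights
# follow the 2:4 structured-sparsity pattern (Ampere architecture GPUs).
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.SPARSE_WEIGHTS)  # enable Sparsity
```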
Jeff Boudier, product director of Hugging Face, said, “We’re closely collaborating with NVIDIA to deliver the best possible performance for state-of-the-art models on NVIDIA GPUs.” He also mentioned that the company’s accelerated API delivers a 100x speedup for transformer models running on NVIDIA GPUs.
WeChat, one of China’s largest messaging platforms, used NVIDIA’s TensorRT platform to power its search engine, which serves more than 500 million users. WeChat officials said they were able to reduce computational resources by 70% using NVIDIA’s artificial intelligence-powered platform.