NVIDIA Launched TensorRT8 For Faster Artificial Intelligence Inference

July 21, 2021

NVIDIA has announced the launch of TensorRT 8, the latest version of its software for faster artificial intelligence inference.

The new platform will allow enterprises to build artificial intelligence-powered interactive language applications, such as search engines and chatbots, in the cloud. TensorRT 8, the eighth generation of NVIDIA’s artificial intelligence inference software, will be available to all NVIDIA developers at no extra cost. According to NVIDIA, the updated software can reduce inference latency to 1.2 ms and deliver twice the accuracy for INT8 precision with Quantization Aware Training.

NVIDIA’s vice president of developer programs, Greg Estes, said, “Artificial intelligence models are growing exponentially more complex, and worldwide demand is surging for real-time applications that use AI. That makes it imperative for enterprises to deploy state-of-the-art inferencing solutions.”

He added that the latest version of the software would enable enterprises to develop highly responsive conversational artificial intelligence programs for their customers. NVIDIA’s software is used by enterprises across industries such as manufacturing, automotive, healthcare, finance, and communications.

NVIDIA has added optimizations for transformer-based models that enable TensorRT 8 to attain best-in-class speed. The release also supports Sparsity, a performance technique in NVIDIA Ampere architecture GPUs that reduces the number of computational operations to increase the speed of neural networks.
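To make the idea of sparsity concrete, here is a minimal sketch in Python with NumPy. It assumes the 2:4 structured pattern used by Ampere GPUs, in which two of every four consecutive weights are set to zero so the hardware can skip them. The function name `prune_2_of_4` and the sample weights are hypothetical and chosen only for illustration; this is not TensorRT's API, and a real workflow would typically also fine-tune the model after pruning.

```python
import numpy as np

def prune_2_of_4(weights):
    """Zero the two smallest-magnitude values in every group of four weights.

    Illustrative helper only (hypothetical name); not part of TensorRT.
    """
    w = np.asarray(weights, dtype=np.float32)
    if w.size % 4 != 0:
        raise ValueError("number of weights must be a multiple of 4")
    # View the weights as consecutive groups of four (copy so the input is untouched).
    groups = w.reshape(-1, 4).copy()
    # For each group, find the indices of the two smallest absolute values.
    drop = np.argsort(np.abs(groups), axis=1)[:, :2]
    # Set those two entries to zero, leaving the two largest in place.
    np.put_along_axis(groups, drop, 0.0, axis=1)
    return groups.reshape(w.shape)

# Hypothetical sample weights: two groups of four values.
weights = np.array([[0.9, -0.1, 0.05, -0.7],
                    [0.2, 0.3, -0.8, 0.01]], dtype=np.float32)
pruned = prune_2_of_4(weights)
print(pruned)
```

After pruning, half of the weights are zero, which is the source of the reduced computation: multiplications by those zeros do not need to be performed.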

Jeff Boudier, product director of Hugging Face, said, “We’re closely collaborating with NVIDIA to deliver the best possible performance for state-of-the-art models on NVIDIA GPUs.” He also mentioned that the company’s accelerated inference API delivers up to a 100x speedup for transformer models that run on NVIDIA GPUs.

WeChat, one of China’s largest messaging platforms, used NVIDIA’s TensorRT platform to develop its search engine, which serves more than 500 million users. WeChat officials said that they were able to reduce computational resources by 70% using NVIDIA’s artificial intelligence-powered platform.
