NVIDIA Launched TensorRT 8 For Faster Artificial Intelligence Inference

July 21, 2021

NVIDIA has announced the launch of TensorRT 8, the latest version of its inference software, which enables faster artificial intelligence inference.

The new platform will allow enterprises to build artificial intelligence-powered interactive language applications, such as search engines and chatbots, in the cloud. TensorRT 8, the eighth generation of NVIDIA's artificial intelligence inference software, will be available to all NVIDIA developers at no extra cost. According to NVIDIA, the updated software can cut inference latency for the BERT-Large language model to 1.2 ms and delivers twice the accuracy at INT8 precision when used with Quantization Aware Training.
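To give a rough sense of what INT8 precision involves, the sketch below rounds floating-point weights onto an 8-bit integer grid and maps them back. Quantization aware training typically inserts this kind of quantize-then-dequantize step during training so the model learns to tolerate the rounding error. This is not TensorRT's API: the function name `fake_quantize` and the symmetric, per-tensor scaling scheme are assumptions made purely for illustration.

```python
def fake_quantize(values, num_bits=8):
    """Round floats onto a symmetric signed integer grid, then map them back.

    Hypothetical helper for illustration only; not part of TensorRT.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for INT8
    max_abs = max(abs(v) for v in values)
    # One scale for the whole list (per-tensor scaling is an assumption here).
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Quantize: divide by the scale, round, and clamp to the integer range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    # Dequantize: the float values the network would effectively compute with.
    dequantized = [q * scale for q in quantized]
    return quantized, dequantized, scale


weights = [0.5, -1.27, 0.013, 1.0]
q, dq, scale = fake_quantize(weights)
print(q)   # integer codes stored in INT8
print(dq)  # approximations of the original weights
```

Each dequantized value differs from the original by at most half a quantization step, which is the error that training with quantization in the loop tries to make the model robust to.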

NVIDIA’s vice president of developer programs, Greg Estes, said, “Artificial intelligence models are growing exponentially more complex, and worldwide demand is surging for real-time applications that use AI. That makes it imperative for enterprises to deploy state-of-the-art inferencing solutions.”

He added that the latest version of the software would enable enterprises to develop highly responsive conversational artificial intelligence applications for their customers. NVIDIA's software is used by enterprises across industries such as manufacturing, automotive, healthcare, finance, and communications.

TensorRT 8 includes optimizations for transformer-based models that NVIDIA says allow it to attain best-in-class speed. It also supports sparsity, a performance technique available on NVIDIA Ampere architecture GPUs that reduces the number of computational operations to speed up neural networks.
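The idea behind sparsity can be sketched in a few lines of plain Python. The example assumes a "2 out of 4" structured pattern, in which the two smallest-magnitude weights in every group of four are set to zero; that pattern is not stated in this article, and the helper names `prune_2_of_4` and `sparse_dot` are hypothetical. On real hardware the skipping happens inside the GPU, so this only illustrates why fewer operations are needed.

```python
def prune_2_of_4(weights):
    """Zero the two smallest-magnitude values in each group of four weights."""
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest-magnitude weights in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned


def sparse_dot(weights, inputs):
    """Dot product that skips zero weights and counts multiplications done."""
    total, multiplies = 0.0, 0
    for w, x in zip(weights, inputs):
        if w != 0.0:
            total += w * x
            multiplies += 1
    return total, multiplies


weights = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.8, 0.01]
inputs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

pruned = prune_2_of_4(weights)
dense_result, dense_ops = sparse_dot(weights, inputs)
sparse_result, sparse_ops = sparse_dot(pruned, inputs)
print(pruned)
print(dense_ops, sparse_ops)
```

With half the weights zeroed, the pruned dot product needs half as many multiplications. In practice a pruned network is usually fine-tuned afterwards to recover any lost accuracy.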

Jeff Boudier, product director of Hugging Face, said, “We’re closely collaborating with NVIDIA to deliver the best possible performance for state-of-the-art models on NVIDIA GPUs.” He also mentioned that the company’s accelerated inference API delivers up to a 100x speedup for transformer models running on NVIDIA GPUs.

WeChat, one of China’s largest messaging platforms, used NVIDIA’s TensorRT platform to build its search engine, which serves more than 500 million users. WeChat officials said that they were able to reduce computational resources by 70% using NVIDIA’s artificial intelligence-powered platform.
