The PyTorch team announced several new updates with the release of PyTorch 1.12, which comprises 3,124 commits from 433 contributors. The updates include accelerated PyTorch Vision models on CPU, beta versions of AWS S3 integration, PyTorch on Intel Xeon Scalable processors with bfloat16, and the FSDP API.
The update also adds functional APIs to apply module computation with a given set of parameters. PyTorch additionally introduced a new beta release of TorchArrow, a library for ML preprocessing over batch data; it currently provides a Python DataFrame interface with a high-performance CPU backend.
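A minimal sketch of the functional-call idea follows, assuming the stateless utility introduced in this release (torch.nn.utils.stateless.functional_call); the module and parameter values are illustrative placeholders.

```python
import torch
import torch.nn as nn
from torch.nn.utils.stateless import functional_call

# A small module whose parameters we want to swap at call time.
model = nn.Linear(4, 2)

# A replacement set of parameters (e.g. from another checkpoint).
new_params = {
    "weight": torch.zeros(2, 4),
    "bias": torch.ones(2),
}

x = torch.randn(1, 4)
# Run the module's forward pass with the supplied parameters,
# without mutating the module's own state.
out = functional_call(model, new_params, (x,))
print(out)  # all ones: zero weight plus a bias of one
```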
PyTorch also introduced (Beta) complex32 and complex convolutions. PyTorch already supports complex numbers, complex autograd, complex modules, and numerous complex operations, including the Fast Fourier Transform (FFT) operators, and several libraries built on PyTorch, such as torchaudio and ESPnet, already make use of them. PyTorch 1.12 adds complex convolutions and the experimental complex32 data type, which enables half-precision FFT computations.
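The sketch below contrasts a standard complex64 FFT with the experimental complex32 dtype; it assumes half-precision complex FFTs need a CUDA device and that random complex32 tensors are most safely obtained by converting from complex64.

```python
import torch

# Standard single-precision complex FFT (complex64), supported everywhere.
x = torch.randn(8, dtype=torch.complex64)
spectrum = torch.fft.fft(x)

# Experimental half-precision complex dtype introduced in 1.12.
# Assumed here to require a CUDA device for FFT support.
if torch.cuda.is_available():
    x_half = torch.randn(8, dtype=torch.complex64, device="cuda").to(torch.complex32)
    spectrum_half = torch.fft.fft(x_half)
    print(spectrum_half.dtype)  # torch.complex32
```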
The update also introduced a beta version of forward-mode automatic differentiation, which computes directional derivatives (Jacobian-vector products) eagerly in the forward pass. PyTorch 1.12 improves operator coverage for forward-mode AD.
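A minimal sketch of forward-mode AD using the torch.autograd.forward_ad API, computing the directional derivative of sin along a chosen tangent:

```python
import torch
import torch.autograd.forward_ad as fwAD

primal = torch.randn(3)
tangent = torch.randn(3)  # direction in which to differentiate

with fwAD.dual_level():
    # Pack the input and its tangent into a "dual" tensor.
    dual_input = fwAD.make_dual(primal, tangent)
    dual_output = torch.sin(dual_input)
    # The tangent of the output is the Jacobian-vector product,
    # produced in the same forward pass as the primal value.
    jvp = fwAD.unpack_dual(dual_output).tangent

print(jvp)  # equals cos(primal) * tangent
```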
Besides the API updates, there are several performance enhancements through nvFuser, a deep learning compiler. For Volta and later CUDA accelerators, TorchScript switches its default fuser in PyTorch 1.12 to nvFuser, which supports a wider range of operations and is faster than NNC, the previous fuser for CUDA devices.
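No code changes are needed to benefit from nvFuser; a sketch of the usual pattern follows, assuming a CUDA device is available and using an illustrative pointwise-heavy function that is a good fusion candidate.

```python
import torch

# A small pointwise-heavy function: a good candidate for kernel fusion.
def gelu_bias(x, bias):
    return torch.nn.functional.gelu(x + bias)

# Scripting lets TorchScript hand eligible subgraphs to its fuser;
# on Volta-or-newer GPUs in 1.12 that fuser is nvFuser by default.
scripted = torch.jit.script(gelu_bias)

if torch.cuda.is_available():
    x = torch.randn(1024, 1024, device="cuda")
    bias = torch.randn(1024, device="cuda")
    # The first few calls profile and compile the fused kernel;
    # subsequent calls reuse it.
    for _ in range(3):
        out = scripted(x, bias)
```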
Memory formats have a significant impact on performance when running vision models. The 1.12 release explains the fundamentals of memory formats and shows how popular PyTorch vision models run faster with the channels-last format on Intel® Xeon® Scalable processors. Combined with the bfloat16 numeric format, performance improves several-fold.
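A sketch of the recipe on CPU, assuming torchvision is installed and using ResNet-50 purely as an illustrative model: convert both the model and the input to channels-last, then run inference under CPU autocast with bfloat16.

```python
import torch
import torchvision.models as models

model = models.resnet50(pretrained=True).eval()
x = torch.randn(1, 3, 224, 224)

# Convert the model and input to the channels-last (NHWC) memory format.
model = model.to(memory_format=torch.channels_last)
x = x.to(memory_format=torch.channels_last)

# Run inference in bfloat16 via CPU autocast; on Xeon Scalable processors
# with native bfloat16 support this is where the largest gains are expected.
with torch.no_grad(), torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    out = model(x)
```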
The new version also comes with a beta of the Fully Sharded Data Parallel (FSDP) API. PyTorch 1.11 shipped a prototype with minimal features; the PyTorch 1.12 beta adds:
A universal sharding strategy API, mixed-precision policies, a transformer auto-wrapping policy, and faster model initialization, as shown in the sketch below.
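A minimal sketch of wrapping a model with these new options, assuming a distributed process group has already been initialized and that MyTransformerBlock is a placeholder for the model's transformer layer class:

```python
import functools
import torch
from torch.distributed.fsdp import (
    FullyShardedDataParallel as FSDP,
    ShardingStrategy,
    MixedPrecision,
)
from torch.distributed.fsdp.wrap import transformer_auto_wrap_policy

def wrap_model(model, MyTransformerBlock):
    # Wrap each transformer block as its own FSDP unit.
    auto_wrap_policy = functools.partial(
        transformer_auto_wrap_policy,
        transformer_layer_cls={MyTransformerBlock},
    )
    # Keep parameters, gradients, and buffers in fp16 during compute.
    mp_policy = MixedPrecision(
        param_dtype=torch.float16,
        reduce_dtype=torch.float16,
        buffer_dtype=torch.float16,
    )
    return FSDP(
        model,
        sharding_strategy=ShardingStrategy.FULL_SHARD,  # sharding strategy API
        auto_wrap_policy=auto_wrap_policy,              # transformer auto wrapping
        mixed_precision=mp_policy,                      # mixed-precision policy
        device_id=torch.cuda.current_device(),          # faster on-device init
    )
```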
For more details, you can check out the official release.