Researchers from the University of Montreal, Renmin University of China, and Xidian University have developed TextBox 2.0, an extended version of the existing text generation package TextBox 1.0. Compared with its predecessor, TextBox 2.0 significantly improves support for pre-trained text generation models. It is an up-to-date Python library, based on PyTorch, that builds a unified and standardized pipeline for applying pre-trained language models to text generation.
The TextBox 2.0 Python library covers 13 text generation tasks and 83 corresponding datasets and incorporates more than 45 pre-trained language models, providing a unified framework for text generation research. Existing libraries, by contrast, are typically designed for only a handful of generation tasks, so they do not offer a comprehensive pipeline spanning data loading, training, and evaluation, and therefore cannot support the development of text generation models in a unified manner.
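As a brief illustration, the unified pipeline is driven from a single entry script. The command below follows the quick-start pattern shown in the project's README; the model, dataset, and flag names are examples and should be verified against the repository before use.

```bash
# Example quick-start invocation (check the TextBox 2.0 README for exact
# names and flags): fine-tune a BART checkpoint on the SAMSum summarization
# dataset, with data loading, training, and evaluation handled by the
# unified pipeline.
python run_textbox.py --model=BART --dataset=samsum --model_path=facebook/bart-base
```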
The TextBox 2.0 library offers a standard way to compare different models and evaluate the generated text. It also provides four training methods and four pre-training objectives to help users optimize pre-trained language models for text generation. For research purposes, users can either pre-train a new model from scratch or further train an existing one, which improves the efficiency and reliability of optimizing text generation models.
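A pre-training run might be launched in much the same way as fine-tuning. The sketch below is illustrative only: the --pretrain_task flag and the "denoising" objective name are assumptions rather than confirmed options, so the TextBox 2.0 documentation should be consulted for the supported settings.

```bash
# Hypothetical sketch: continue pre-training a model with a denoising
# objective before fine-tuning. The --pretrain_task flag and the
# "denoising" identifier are assumptions; check the TextBox 2.0 docs
# for the actual option names.
python run_textbox.py --model=BART --model_path=facebook/bart-base \
    --dataset=samsum --pretrain_task=denoising
```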
The researchers have released the TextBox 2.0 library on GitHub, where users can find instructions for installing and using it.
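Getting started usually amounts to cloning the repository and running its install script. The steps below assume the RUCAIBox/TextBox repository layout; the current README should be treated as authoritative.

```bash
# Assumed installation steps (see the repository README for the
# up-to-date instructions).
git clone https://github.com/RUCAIBox/TextBox.git && cd TextBox
bash install.sh
```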