Google introduces LIMoE, one of the first large-scale architectures to process both text and images, as a step toward Pathways.

Google AI has launched a breakthrough technology called LIMoE as a step toward its far-reaching goal of an AI architecture known as Pathways. Pathways is a single-model AI architecture that can accomplish multiple tasks. It reflects Google’s research goal of using sparse models that handle text and images simultaneously.

LIMoE stands for Language-Image Mixture of Experts; Google announced it under the title “Learning Multiple Modalities with One Sparse Mixture-of-Experts Model.” It is not the only architecture that can multitask. What sets it apart is its use of sparse models, which stand out as one of the most promising future approaches to deep learning. Unlike “dense” models, which apply all of their parameters to every input, sparse models use conditional computation: each input is routed to a few specialized “experts” in the network rather than through the whole model.
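
The difference between dense and conditional computation can be illustrated with a minimal sketch. This is not Google's implementation: the toy "experts," the random router weights, and the function names `route_dense` and `route_sparse` are all assumptions made up for illustration, and real systems use learned neural networks for both the router and the experts.

```python
import math
import random

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def make_expert(scale):
    # Hypothetical "expert": a stand-in for a small feed-forward network.
    return lambda token: [scale * x for x in token]

def route_dense(token, experts):
    # Dense-style computation: every expert processes every token.
    outputs = [expert(token) for expert in experts]
    averaged = [sum(vals) / len(experts) for vals in zip(*outputs)]
    return averaged, len(experts)

def route_sparse(token, experts, router_weights, k=1):
    # Conditional computation: the router scores each expert,
    # and only the top-k experts are actually run on this token.
    scores = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    gates = softmax(scores)
    top = sorted(range(len(experts)), key=lambda i: gates[i], reverse=True)[:k]
    out = [0.0] * len(token)
    for i in top:
        expert_out = experts[i](token)
        out = [o + gates[i] * e for o, e in zip(out, expert_out)]
    return out, top

random.seed(0)
dim, num_experts = 4, 8
experts = [make_expert(s + 1) for s in range(num_experts)]
# Assumed random router weights; in a trained model these are learned.
router_weights = [[random.uniform(-1, 1) for _ in range(dim)]
                  for _ in range(num_experts)]
token = [0.5, -1.0, 0.25, 2.0]

_, dense_calls = route_dense(token, experts)
sparse_out, chosen = route_sparse(token, experts, router_weights, k=1)
print("experts run (dense):", dense_calls)
print("experts run (sparse):", len(chosen), "-> expert index", chosen)
```

In this toy setup the dense path runs all eight experts for a single token, while the sparse path runs only the one the router selects. That is why a sparse model can add experts without a matching increase in the compute spent on each input.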

The Google AI team presented its sparse mixture-of-experts model in the paper “Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts.” The technology analyzes words and images simultaneously using sparsely activated experts.

According to Google, it outperforms comparable dense multimodal models and techniques in zero-shot image classification. Because of its sparsity, LIMoE can scale up while learning to handle a range of inputs.
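
For readers unfamiliar with the term, zero-shot image classification with a contrastively trained image-text model generally works by comparing an image embedding against text embeddings of candidate labels and picking the closest match. The sketch below shows that general idea only. The three-dimensional embeddings and the label prompts are hand-made assumptions, not outputs of LIMoE, and `zero_shot_classify` is a hypothetical helper name.

```python
import math

def normalize(v):
    # Scale a vector to unit length.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    # Cosine similarity between two vectors.
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def zero_shot_classify(image_embedding, text_embeddings):
    # Score every candidate label by similarity to the image,
    # then return the best label along with all scores.
    scores = {label: cosine(image_embedding, emb)
              for label, emb in text_embeddings.items()}
    return max(scores, key=scores.get), scores

# Toy, hand-made embeddings (assumptions for illustration only).
# A real model would produce these with its text and image encoders.
text_embeddings = {
    "a photo of a cat": [0.9, 0.1, 0.0],
    "a photo of a dog": [0.1, 0.9, 0.0],
    "a photo of a car": [0.0, 0.1, 0.9],
}
image_embedding = [0.8, 0.3, 0.1]

label, scores = zero_shot_classify(image_embedding, text_embeddings)
print(label)  # prints "a photo of a cat"
```

No classifier is trained on these labels. The label set can be changed at inference time simply by embedding different text prompts, which is what makes the approach "zero-shot."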

In the announcement, Google experts explained how LIMoE works. “There are also some clear qualitative patterns among the image experts — e.g., in most LIMoE models, there is an expert that processes all image patches that contain text. …one expert processes fauna and greenery, and another processes human hands.”

