Facebook has open-sourced AugLy, a new Python library that helps artificial intelligence researchers build more robust machine learning models through data augmentation. AugLy provides advanced data augmentation tools that can be used to train and test a variety of models.
Because many of the data sets in use today are multimodal, AugLy was built to cover four modalities: audio, text, video, and images. It offers over 100 data augmentations modeled on things that real people do on social media platforms like Facebook and Instagram (such as overlaying text or emoji on images, or taking screenshots). Facebook says that many of the augmentations were informed by the ways people transform infringing content to evade automatic detection systems.
AugLy has four sub-libraries with the same interface, one for each modality. Each provides both function-based and class-based transforms, along with intensity functions that help users gauge how intense a transformation is. The augmentations were sourced from multiple existing libraries, and some were developed by Facebook itself.
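To make the interface pattern concrete, here is a minimal sketch in plain Python of a single text augmentation exposed three ways: as a function, as a class-based transform, and with a matching intensity function. This is not AugLy's actual API. The names `swap_adjacent_chars`, `SwapAdjacentChars`, and `swap_adjacent_chars_intensity`, the parameters `p` and `seed`, and the 0 to 100 intensity scale are all assumptions made for illustration.

```python
import random


def swap_adjacent_chars(text: str, p: float = 0.1, seed: int = 0) -> str:
    """Function-based transform (hypothetical): swap neighbouring letters
    with probability p to imitate typing mistakes."""
    rng = random.Random(seed)  # local RNG so results are reproducible
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        both_letters = chars[i].isalpha() and chars[i + 1].isalpha()
        if both_letters and rng.random() < p:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return "".join(chars)


def swap_adjacent_chars_intensity(p: float = 0.1) -> float:
    """Intensity function (hypothetical): maps the parameters to a score.
    The 0-100 scale is an assumption of this sketch."""
    return 100.0 * min(max(p, 0.0), 1.0)


class SwapAdjacentChars:
    """Class-based transform (hypothetical): stores the parameters once and
    can then be called like a function, which makes it easy to reuse."""

    def __init__(self, p: float = 0.1, seed: int = 0) -> None:
        self.p = p
        self.seed = seed

    def __call__(self, text: str) -> str:
        return swap_adjacent_chars(text, p=self.p, seed=self.seed)


if __name__ == "__main__":
    transform = SwapAdjacentChars(p=0.3, seed=42)
    print(transform("data augmentation makes models sturdier"))
    print(swap_adjacent_chars_intensity(p=0.3))
```

The class form is convenient when the same settings are applied to a whole data set, while the function form suits one-off calls. The intensity function lets a user compare how strongly different settings distort the input without inspecting the output by hand.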
If models can be made robust to changes in the unimportant aspects of the data, they will learn to focus on the attributes that matter for a specific use case, says the Facebook AI blog. The blog also mentions that a model developed using AugLy can detect exact or nearly identical copies of a particular piece of infringing content even when the copy has been altered by a pixel, by a filter, or by added text or audio. This proactively prevents users from uploading disturbing content.
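The idea of testing a matcher against augmented copies can be sketched with a toy example. The code below is an illustration only and does not reflect how Facebook's systems work: the matcher simply normalizes text before comparing it, and the four augmentations (`add_emoji`, `shout`, `space_out`, `leetspeak`) are hypothetical stand-ins for the kinds of edits people make.

```python
import re
from typing import Callable, List


def normalize(text: str) -> str:
    """Discard attributes this toy matcher treats as unimportant:
    letter case, spacing, punctuation, and emoji."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def is_near_duplicate(candidate: str, reference: str) -> bool:
    """Toy matcher (assumption): two texts match if they normalize equally."""
    return normalize(candidate) == normalize(reference)


# Hypothetical augmentations imitating edits seen on social media.
def add_emoji(text: str) -> str:
    return text + " \U0001F600"


def shout(text: str) -> str:
    return text.upper()


def space_out(text: str) -> str:
    return " ".join(text)


def leetspeak(text: str) -> str:
    return text.replace("e", "3")


def match_rate(reference: str, augmentations: List[Callable[[str], str]]) -> float:
    """Fraction of augmented copies the matcher still recognizes."""
    hits = sum(is_near_duplicate(aug(reference), reference) for aug in augmentations)
    return hits / len(augmentations)


if __name__ == "__main__":
    augs = [add_emoji, shout, space_out, leetspeak]
    print(match_rate("Buy cheap watches here", augs))  # prints 0.75
```

The matcher survives three of the four edits but misses the character substitution, which is the kind of blind spot that evaluating on augmented data is meant to expose.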
AugLy can assist with object detection models, hate speech identification, and voice recognition. It was used in the Deepfake Detection Challenge to evaluate the robustness of deepfake detection models. According to Facebook's AI blog, AugLy is part of Facebook AI's broader efforts to advance multimodal machine learning, ranging from the Hateful Memes Challenge to the SIMMC data set for training next-generation shopping assistants.
Check out Facebook's AugLy library on GitHub.