Microsoft has published a study, ‘Fake it till you make it: face analysis with synthetic data alone.’ The research demonstrates that facial analysis algorithms can be trained on synthetic data alone and then used in real-life scenarios. According to the software giant, the face biometrics community has long synthesized training data with computer graphics.
However, the domain gap between real and synthetic data has remained a problem for human faces, and the paper argues for a new method to bridge it. Microsoft’s new research synthesizes data by combining a procedurally generated parametric 3D face model with a comprehensive library of hand-crafted assets. Together, these render training images that are both diverse and highly realistic.
Traditionally, researchers have tried to close this gap with a combination of data mixing, domain adaptation, and domain-adversarial training. The new process instead combines the synthetic face model with hand-crafted assets, producing rich labels that would be impossible to annotate by hand. Researchers also have complete control over variation and diversity in the data set.
The procedurally constructed synthetic 3D faces start from an initial face template, which is then randomized with different expressions and textures to produce realistic, expressive faces. The researchers generated a training dataset of 100,000 synthetic face images, then evaluated networks trained on it on face analysis tasks such as face parsing and landmark localization.
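To make the randomization step concrete, here is a minimal Python sketch of how a template face might be varied by sampling random parameters. It is not the researchers' pipeline: the names `FaceSample`, `sample_face`, and `build_dataset`, the texture list, and the parameter counts and ranges are all hypothetical, and no rendering is performed.

```python
import random
from dataclasses import dataclass
from typing import List

# Hypothetical asset names and parameter counts, chosen only for illustration.
TEXTURES = ["texture_a", "texture_b", "texture_c"]
NUM_IDENTITY_PARAMS = 8
NUM_EXPRESSION_PARAMS = 4


@dataclass
class FaceSample:
    """Parameters describing one randomized face; they double as labels."""
    identity: List[float]    # offsets applied to the template face shape
    expression: List[float]  # expression strengths in [0, 1]
    texture: str             # name of a hand-crafted texture asset


def sample_face(rng: random.Random) -> FaceSample:
    """Randomize the template by drawing identity, expression, and texture."""
    identity = [rng.gauss(0.0, 1.0) for _ in range(NUM_IDENTITY_PARAMS)]
    expression = [rng.uniform(0.0, 1.0) for _ in range(NUM_EXPRESSION_PARAMS)]
    texture = rng.choice(TEXTURES)
    return FaceSample(identity, expression, texture)


def build_dataset(size: int, seed: int = 0) -> List[FaceSample]:
    """Draw `size` face descriptions from a seeded generator."""
    rng = random.Random(seed)
    return [sample_face(rng) for _ in range(size)]


if __name__ == "__main__":
    dataset = build_dataset(1000)  # the study used 100,000 rendered images
    print(len(dataset), dataset[0].texture)
```

Because every sample is produced from known parameters, its labels are available without manual annotation, and fixing the seed makes the data set reproducible.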
The trained networks never saw a single real image, and the researchers used label adaptation to reconcile the synthetic labels with human-annotated ones. According to the Microsoft team, the major difficulty was converting the jawline of the 3D face model into a 2D face outline.
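As a rough illustration of the idea behind label adaptation, the Python sketch below fits a simple correction that maps landmark positions in one labeling convention onto another. It is a deliberately simplified stand-in, not the model used in the study: it assumes an independent least-squares affine correction per coordinate, and the function names `fit_line`, `fit_adapter`, and `adapt` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Line = Tuple[float, float]  # (slope, intercept)


def fit_line(xs: List[float], ys: List[float]) -> Line:
    """Least-squares fit of y = a * x + b for one coordinate."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("inputs must vary to fit a line")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    return a, mean_y - a * mean_x


def fit_adapter(predicted: List[Point], annotated: List[Point]) -> Tuple[Line, Line]:
    """Fit separate corrections for x and y from paired landmark positions."""
    line_x = fit_line([p[0] for p in predicted], [q[0] for q in annotated])
    line_y = fit_line([p[1] for p in predicted], [q[1] for q in annotated])
    return line_x, line_y


def adapt(point: Point, adapter: Tuple[Line, Line]) -> Point:
    """Map a predicted landmark into the annotated convention."""
    (ax, bx), (ay, by) = adapter
    return ax * point[0] + bx, ay * point[1] + by


if __name__ == "__main__":
    # Toy pairs: predicted jaw points vs. positions in a hand-annotated style.
    predicted = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 1.0)]
    annotated = [(2 * x + 1, 0.5 * y - 3) for x, y in predicted]
    adapter = fit_adapter(predicted, annotated)
    print(adapt((10.0, 10.0), adapter))
```

A real system would need a more expressive mapping than one affine correction per coordinate, but the principle is the same: the correction is learned from pairs of predicted and human-annotated labels.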