Meta has announced Ego-Exo4D, a foundational dataset for video learning and multimodal perception research. The dataset encompasses video captured in six countries and seven U.S. states, providing a diverse resource for AI development.
Ego-Exo4D is a collaborative venture between Meta's FAIR (Fundamental AI Research) team, Meta's Project Aria, and 15 university partners. The dataset pairs first-person ("egocentric") views from wearable cameras worn by participants with multiple third-person ("exocentric") views from cameras positioned around them.
Ego-Exo4D is the largest public dataset of time-synchronized first- and third-person video. It features real-world experts demonstrating specific skills, ranging from professional athletes and dancers to chefs and bike technicians. The dataset isn't available for download yet; Meta plans to release it by the end of December 2023.
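To make the time-synchronized pairing concrete, here is a minimal sketch of how egocentric and exocentric frames could be aligned by timestamp. The timestamp values, frame rates, and nearest-neighbor matching strategy are all illustrative assumptions, not details of the dataset's actual tooling.

```python
from bisect import bisect_left

def nearest_frame(timestamps: list[float], t: float) -> int:
    """Return the index of the frame whose timestamp is closest to t."""
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    # Pick whichever neighbor is closer to t.
    return i if timestamps[i] - t < t - timestamps[i - 1] else i - 1

# Hypothetical per-camera timestamp tracks (in seconds); the real
# dataset's file layout and field names may differ.
ego_ts = [0.000, 0.033, 0.066, 0.100]
exo_ts = [0.005, 0.038, 0.071, 0.105]

# For each egocentric frame, find the closest exocentric frame.
pairs = [(i, nearest_frame(exo_ts, t)) for i, t in enumerate(ego_ts)]
print(pairs)  # [(0, 0), (1, 1), (2, 2), (3, 3)]
```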
One promising application of Ego-Exo4D is augmented reality (AR): a person wearing smart glasses could learn new skills through instructional videos with virtual AI coaching. The dataset also holds promise for robotics, where robots could acquire dexterous manipulation skills by observing humans perform actions.
Ego-Exo4D isn't just multiview; it's also multimodal. Captured with Meta's Aria glasses, the egocentric video is paired with seven-channel audio, inertial measurement unit (IMU) readings, and additional grayscale camera streams. Sequences also include eye gaze, head poses, and 3D point cloud data derived through Project Aria's Machine Perception Services.
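As a rough illustration of what one time slice of such multimodal data might contain, the sketch below defines a hypothetical Python container. All field names, shapes, and types are assumptions for illustration only; they do not reflect Ego-Exo4D's actual schema or API.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class EgoExoSample:
    """Hypothetical container for one time slice of Ego-Exo4D-style data.

    Field names and array shapes are illustrative assumptions,
    not the dataset's actual schema.
    """
    ego_rgb: np.ndarray                # egocentric frame, e.g. (H, W, 3)
    exo_rgb: list[np.ndarray]          # one frame per exocentric camera
    audio: np.ndarray                  # seven-channel audio chunk, (7, N)
    imu: np.ndarray                    # accelerometer/gyro readings, (M, 6)
    eye_gaze: tuple[float, float]      # gaze point in image coordinates
    head_pose: np.ndarray              # 4x4 camera-to-world transform
    point_cloud: np.ndarray            # 3D scene points, (P, 3)
    timestamp: float                   # capture time in seconds

# Construct a dummy sample to show the intended structure.
sample = EgoExoSample(
    ego_rgb=np.zeros((480, 640, 3), dtype=np.uint8),
    exo_rgb=[np.zeros((720, 1280, 3), dtype=np.uint8) for _ in range(4)],
    audio=np.zeros((7, 1600)),
    imu=np.zeros((100, 6)),
    eye_gaze=(0.5, 0.5),
    head_pose=np.eye(4),
    point_cloud=np.zeros((10_000, 3)),
    timestamp=0.0,
)
```

Grouping all streams for a given moment into one record like this is one natural way to consume time-synchronized multimodal data, though the released dataset may organize its files differently.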