Reading in the Wild Dataset
100 hours of egocentric RGB video, eye gaze, and head pose data of reading and non-reading activities collected in diverse and realistic scenarios.
100 hours of reading in-the-wild
Reading in the Wild is a first-of-its-kind, large-scale multimodal dataset designed to support reading recognition on wearable devices in diverse environments. It comprises 100 hours of reading and non-reading video captured in realistic scenarios using Project Aria, together with eye gaze and head pose sensor outputs.
Real-world multimodal egocentric data for reading recognition in the wild
Diverse scenes and scenarios
Various Reading Modes
Our dataset includes multiple reading modes including:
Comprehensive Reading Materials
150+ reading materials in different mediums and text lengths.
Annotations
The recordings were processed by the Aria Machine Perception Service to obtain accurate 6-DoF device trajectories, semi-dense point clouds, and 3D eye gaze ray estimates with depth.
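To illustrate what a 3D eye gaze ray with depth provides, here is a minimal sketch that turns a gaze direction and a depth into a 3D gaze point. It assumes the ray is parameterized by yaw and pitch angles (in radians) relative to a forward-facing z axis, and that depth is the distance along the ray in meters. The function name and this convention are illustrative assumptions, not the dataset's documented format, so check the Project Aria documentation before relying on them.

```python
import math


def gaze_point_at_depth(yaw_rad: float, pitch_rad: float, depth_m: float):
    """Return a 3D point along a gaze ray (hypothetical helper).

    Assumed convention: the unnormalized ray direction is
    (tan(yaw), tan(pitch), 1), with z pointing forward from the wearer.
    """
    x = math.tan(yaw_rad)
    y = math.tan(pitch_rad)
    z = 1.0
    # Normalize the direction so that depth_m is the distance along the ray.
    norm = math.sqrt(x * x + y * y + z * z)
    return (depth_m * x / norm, depth_m * y / norm, depth_m * z / norm)


# Looking straight ahead at a page 0.4 m away.
point = gaze_point_at_depth(0.0, 0.0, 0.4)
```

Under this convention, the distance from the origin to the returned point always equals the supplied depth, which makes it easy to sanity-check loaded gaze samples.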
In addition, the annotation contains
The Reading in the Wild dataset contains two distinct subsets, one captured in Columbus and the other captured in Seattle. The Seattle subset focuses on diversity, while the Columbus subset aims to test our model’s ability to generalize in unseen settings, as well as identify edge cases where the model fails.
Seattle subset
The Seattle subset emphasizes diversity, comprising 80 hours of data from 80 participants engaged in various reading and non-reading activities across multiple indoor and outdoor settings.
Columbus subset
The Columbus subset is designed to evaluate edge cases, comprising approximately 20 hours of data from 31 participants engaged in reading and non-reading activities in indoor settings. It also features four different languages.
Aria Dataset Explorer makes it easy to discover, visualize, filter, and download Project Aria public datasets from the Project Aria team, all in one place.
Read the Reading Recognition in the Wild Research Paper
For more information about our motivations, methods, and dataset details, read our research paper on arXiv.
Access Reading in the Wild Dataset
If you are a researcher interested in recreating our method of scene reconstruction, access the Aria Scene Reconstruction Dataset to get started.
By submitting your email and accessing the Reading Recognition dataset, you agree to abide by the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0) and Apache 2.0 License, and to receive emails in relation to the dataset.