The ORBIT India Dataset: Understanding the Challenges of Collecting a Disability-First AI Dataset in Low-Resource Environments
- Gesu India
- Martin Grayson
- Cecily Morrison
- Daniela Massiceti
- Simon Robinson
- Jennifer Pearson
- Matt Jones
CHI Conference on Human Factors in Computing Systems
Computer vision systems are increasingly used by blind individuals to navigate their lives, helping them, for example, to locate objects such as doors or chairs. Yet these recognition systems do not work for many personal objects a blind user might want to find, such as keys or a special notebook. In response, recent efforts have produced personalized recognition systems, in which individuals train their phones to identify and locate things, such as a coffee mug or white cane, using example images or videos. However, these tools are trained on data from high-resource contexts, which does not necessarily reflect India’s material culture. This paper discusses the contribution of the ORBIT-India dataset, which extends these tools to the Indian context, home of the world’s largest blind population. The ORBIT-India dataset comprises 105,243 images from 587 videos, representing 76 unique objects. We use this experience to examine dataset collection practices translated from high- to low-resource settings, providing recommendations to support cross-geography dataset collection.