Abstract
Advances in computer vision driven by deep learning have greatly improved the performance of image algorithms on benchmark datasets and in domains such as autonomous driving. However, the lack of geodiversity in canonical datasets like ImageNet significantly limits the usefulness of pretrained models in certain geographical regions, especially in the Global South. Initial approaches to remedying geographic bias have not charted an explicit path toward building more representative datasets. The Massively Multilingual Image Dataset (MMID), created to study word translation via image similarity, provides a rich source of images retrieved with queries in nearly 100 of the most widely spoken languages, along with generally reliable English translations of those queries. We curate a dataset of images with people-related labels from languages spoken in India and validate our hypothesis that images retrieved with English queries are insufficient to train models to classify people in images retrieved with non-English queries. We also discuss scaling our methodology to the rest of MMID.
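The validation described above amounts to comparing a classifier's accuracy across query languages. The sketch below shows one minimal way such a per-language comparison could be computed; the record schema, language names, and labels are illustrative assumptions, not the paper's actual data or code.

```python
from collections import defaultdict


def accuracy_by_language(records):
    """Compute classification accuracy grouped by query language.

    Each record is a dict with keys 'language', 'true_label', and
    'predicted_label' (an assumed schema for illustration only).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for r in records:
        total[r["language"]] += 1
        if r["predicted_label"] == r["true_label"]:
            correct[r["language"]] += 1
    return {lang: correct[lang] / total[lang] for lang in total}


# Toy predictions from a hypothetical classifier trained only on
# English-query images, evaluated on English and Hindi queries:
records = [
    {"language": "English", "true_label": "person", "predicted_label": "person"},
    {"language": "English", "true_label": "person", "predicted_label": "person"},
    {"language": "Hindi", "true_label": "person", "predicted_label": "person"},
    {"language": "Hindi", "true_label": "person", "predicted_label": "other"},
]
print(accuracy_by_language(records))  # → {'English': 1.0, 'Hindi': 0.5}
```

A gap between the per-language accuracies, as in this toy output, is the kind of evidence the hypothesis test would look for.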
“Mitigating Geographic Bias of Image Classifiers with Multilingual Image Data”
Presented at Bloomberg Data for Good Exchange 2019