Methods for Measuring Geodiversity in Large Overhead Imagery Datasets

This research introduces some of the first geo-computational methods to address a key gap in the artificial intelligence (AI) and big data literature as it relates to the geosciences and remote sensing: the lack of understanding of the global feature representativeness of labels in large remotely sensed imagery (RSI) datasets. Issues of data fairness, heterogeneity and equitability, often related directly to geographic and demographic under-sampling, have recently come to the fore in multidisciplinary discussions of the ethics of AI. The risks of perpetuating data and models with unknown biases are particularly heightened in the air- and space-borne RSI domain, given the explosive growth of RSI training datasets and their use by academia, industry and government for overhead object detection and classification tasks. To mitigate these risks, we have developed spatial dataset analysis tools, inspired by advances in deep generative modeling (i.e., GANs), to measure several dimensions of RSI object class geodiversity and detect issues of image feature bias/homogeneity in the growing corpus of RSI datasets.

Please contact Robert Sanders for Zoom information.