Robustness to Subpopulation Shift with Domain Label Noise via Regularized Annotation of Domains

Authors

Stromberg, Nathan
Ayyagari, Rohan
Welfert, Monica
Koyejo, Sanmi
Nock, Richard
Sankar, Lalitha

Abstract

Existing methods for last layer retraining that aim to optimize worst-group accuracy (WGA) rely heavily on well-annotated groups in the training data. We show, both in theory and practice, that annotation-based data augmentations using either downsampling or upweighting for WGA are susceptible to domain annotation noise. The WGA gap is exacerbated in high-noise regimes for models trained with vanilla empirical risk minimization (ERM). To this end, we introduce Regularized Annotation of Domains (RAD) to train robust last layer classifiers without needing explicit domain annotations. Our results show that RAD is competitive with other recently proposed domain annotation-free techniques. Most importantly, RAD outperforms state-of-the-art annotation-reliant methods even with only 5% noise in the training data for several publicly available datasets.
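
For readers unfamiliar with the metric, the following is a minimal sketch of how worst-group accuracy is commonly computed, assuming groups are defined as (class label, domain annotation) pairs and domains are binary. It is not the authors' code and does not implement RAD; the function names `worst_group_accuracy` and `flip_domain_labels` and the toy arrays are hypothetical, chosen only for illustration. The second helper shows one simple way to simulate symmetric domain annotation noise of the kind the abstract refers to.

```python
import numpy as np


def worst_group_accuracy(y_true, y_pred, domain):
    """Return (WGA, per-group accuracies) with groups = (class, domain) pairs."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    domain = np.asarray(domain)
    accs = {}
    for y in np.unique(y_true):
        for d in np.unique(domain):
            mask = (y_true == y) & (domain == d)
            if mask.any():  # skip empty groups
                correct = y_pred[mask] == y_true[mask]
                accs[(int(y), int(d))] = float(correct.mean())
    return min(accs.values()), accs


def flip_domain_labels(domain, noise_rate, rng):
    """Flip each binary domain annotation independently with probability noise_rate.

    Assumption: symmetric noise on a binary domain label, for illustration only.
    """
    domain = np.asarray(domain)
    flips = rng.random(domain.shape[0]) < noise_rate
    return np.where(flips, 1 - domain, domain)


# Toy data (hypothetical): the two minority groups are misclassified.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
domain = np.array([0, 0, 0, 1, 0, 1, 1, 1])
y_pred = np.array([0, 0, 0, 1, 0, 1, 1, 1])

wga, per_group = worst_group_accuracy(y_true, y_pred, domain)
avg_acc = float((y_pred == y_true).mean())

# Noisy annotations change which samples are assigned to which group,
# which is what annotation-based downsampling or upweighting would rely on.
noisy_domain = flip_domain_labels(domain, 0.05, np.random.default_rng(0))
```

In this toy case the average accuracy is 0.75 while the worst-group accuracy is 0.0, which illustrates why WGA rather than average accuracy is the quantity of interest under subpopulation shift.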

Source

Transactions on Machine Learning Research

Entity type

Publication
