Abstract
Deep neural networks capture spurious correlations between attributes and class labels, causing poor performance on certain subgroups—a problem that is especially harmful when those attributes represent protected classes. Prior solutions either require expensive per-example attribute labels or impose restrictive assumptions about which correlations are present.
We propose a framework that uses curated image concept sets to measure and control unwanted correlations without requiring pre-existing attribute labels. Our method, Concept DRO, uses these concept sets to estimate group labels and then applies a distributionally robust optimization (DRO) objective to train group-robust classifiers. We demonstrate that manually curated concept sets—including AI-generated images—effectively mitigate bias, and that even small concept sets are sufficient to improve worst-group accuracy by up to 33.1% over baseline methods on image classification benchmarks.
Method: Concept DRO
Step 1 — Concept Curation: Assemble a small set of reference images depicting the suspected spurious attribute; these can be manually collected or AI-generated.
Step 2 — Group Label Estimation: Use the concept set to compute similarity scores between training examples and the spurious concept, producing soft group membership estimates.
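A minimal sketch of the group-estimation step, assuming image embeddings (e.g., from a pretrained vision encoder) have already been computed for both the training examples and the concept set. The function name and the sigmoid squashing are illustrative choices, not the paper's exact formulation:

```python
import numpy as np

def estimate_group_scores(example_embeds, concept_embeds):
    """Soft group membership: mean cosine similarity of each training
    example to the curated concept set (illustrative helper)."""
    # L2-normalize rows so dot products are cosine similarities
    ex = example_embeds / np.linalg.norm(example_embeds, axis=1, keepdims=True)
    co = concept_embeds / np.linalg.norm(concept_embeds, axis=1, keepdims=True)
    sims = ex @ co.T                # (n_examples, n_concept_images)
    scores = sims.mean(axis=1)      # average similarity to the concept set
    # squash to (0, 1) as a soft group-membership estimate
    return 1.0 / (1.0 + np.exp(-scores))
```

Examples whose embeddings align closely with the concept set receive scores near 1, marking them as likely members of the spurious group.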
Step 3 — Robust Training: Apply a DRO objective (e.g., JTT or GroupDRO) using the estimated group labels to upweight underrepresented subgroups during training.
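The robust-training step can be sketched as JTT-style upweighting driven by the estimated group scores: examples assigned to the estimated minority group get a larger loss weight. The `threshold` and `upweight` hyperparameters below are illustrative, not values from the paper:

```python
import numpy as np

def reweighted_losses(group_scores, losses, threshold=0.5, upweight=5.0):
    """Sketch of JTT-style reweighting with estimated group labels:
    examples whose concept-similarity score exceeds `threshold` are
    treated as the estimated minority group and upweighted."""
    minority = group_scores > threshold          # estimated group membership
    weights = np.where(minority, upweight, 1.0)  # upweight minority examples
    # weighted per-example losses, to be averaged into the training objective
    return weights * losses
```

In practice the same estimated group labels could instead feed a GroupDRO objective, which dynamically upweights the worst-performing group during training.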
Results at a Glance
On standard image classification benchmarks, Concept DRO improves worst-group accuracy by up to 33.1% over baseline methods, and small concept sets—whether manually curated or AI-generated—are sufficient to realize these gains.
Key Contributions
- Label-free group robustness: A novel framework for controlling spurious correlations without requiring any pre-existing per-example attribute annotations.
- Concept DRO: A practical method that estimates group labels from curated concept sets and trains with a DRO objective, compatible with off-the-shelf robust training methods.
- AI-generated concepts work: Demonstrated that AI-generated images can serve as effective concept sets, enabling robustness improvement even when real reference images are unavailable.
- Label efficiency: Showed that small concept sets remain effective, significantly lowering the annotation and data collection cost for practitioners building equitable AI systems.
Citation
@InProceedings{Yang_2024_CVPR,
  author    = {Yang, Yiwei and Liu, Anthony Z. and Wolfe, Robert and Caliskan, Aylin and Howe, Bill},
  title     = {Label-Efficient Group Robustness via Out-of-Distribution Concept Curation},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2024},
  pages     = {12426--12434}
}