CVPR 2024  ·  pp. 12426–12434

Label-Efficient Group Robustness via Out-of-Distribution Concept Curation

Yiwei Yang, Anthony Z. Liu, Robert Wolfe, Aylin Caliskan, Bill Howe

University of Washington


TL;DR

We introduce Concept DRO, a framework that curates out-of-distribution concept images—including AI-generated ones—to estimate group labels and train distributionally robust models, achieving up to 33.1% improvement over baselines without requiring any explicit spurious attribute annotations.

Abstract

Deep neural networks capture spurious correlations between attributes and class labels, causing poor performance on certain subgroups—a problem that is especially harmful when those attributes represent protected classes. Prior solutions either require expensive per-example attribute labels or impose restrictive assumptions about which correlations are present.

We propose a framework that uses curated image concept sets to measure and control unwanted correlations without requiring pre-existing attribute labels. Our method, Concept DRO, uses these concept sets to estimate group labels and then applies a distributionally robust optimization (DRO) objective to train group-robust classifiers. We demonstrate that manually curated concept sets—including AI-generated images—effectively mitigate bias, and that even small concept sets are sufficient to improve worst-group accuracy by up to 33.1% over baseline methods on image classification benchmarks.

Method: Concept DRO

Step 1 — Concept Curation: Collect or generate a small set of images that capture the spurious attribute of interest (e.g., images depicting only the background cue, without the target class).

Step 2 — Group Label Estimation: Use the concept set to compute similarity scores between training examples and the spurious concept, producing soft group membership estimates.
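Step 2 can be illustrated with plain cosine similarities. The sketch below is a minimal NumPy illustration, not the paper's exact procedure: the embedding model, the sigmoid squashing, and the `temperature` parameter are all assumptions made for the example.

```python
import numpy as np

def estimate_soft_groups(train_emb, concept_emb, temperature=0.1):
    """Hypothetical sketch: soft group membership from cosine similarity
    between each training example and the mean embedding of the curated
    concept set."""
    # Unit-normalize so dot products are cosine similarities.
    train_emb = train_emb / np.linalg.norm(train_emb, axis=1, keepdims=True)
    concept_mean = concept_emb.mean(axis=0)
    concept_mean = concept_mean / np.linalg.norm(concept_mean)
    sims = train_emb @ concept_mean  # one similarity per training example
    # Squash to (0, 1): examples close to the concept get membership near 1.
    return 1.0 / (1.0 + np.exp(-sims / temperature))
```

Examples whose embeddings align with the concept set (e.g., a strong background cue) receive membership near 1; examples pointing away receive membership near 0.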

Step 3 — Robust Training: Apply a DRO objective (e.g., JTT or GroupDRO) using the estimated group labels to upweight underrepresented subgroups during training.
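The upweighting in Step 3 can take either form named above. A JTT-style scheme thresholds the soft memberships and replays the estimated minority group with a constant factor; a GroupDRO-style scheme maintains per-group weights and raises them for high-loss groups via exponentiated-gradient updates. The sketch below shows both; the constants (`upweight`, step size `eta`) are illustrative assumptions, not values from the paper.

```python
import numpy as np

def jtt_style_weights(soft_groups, threshold=0.5, upweight=5.0):
    """JTT-flavored reweighting: hard-threshold the estimated group
    memberships and upweight the estimated minority group."""
    return np.where(soft_groups >= threshold, upweight, 1.0)

def groupdro_step(q, group_losses, eta=0.1):
    """One GroupDRO-flavored update: exponentially upweight groups with
    higher average loss, then renormalize to a distribution."""
    q = q * np.exp(eta * group_losses)
    return q / q.sum()
```

In training, the JTT-style weights would scale per-example losses, while the GroupDRO-style weights `q` would be updated each step from the current per-group losses before computing the weighted objective.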

Results at a Glance

+33.1% worst-group accuracy gain over baseline
Zero spurious attribute labels required
Small concept sets remain effective

Key Contributions

Concept DRO, a framework that trains group-robust classifiers without any per-example spurious attribute labels.

Group label estimation from small, manually curated concept sets, including AI-generated images.

Up to 33.1% worst-group accuracy improvement over baseline methods on image classification benchmarks.

Citation

@InProceedings{Yang_2024_CVPR,
  author    = {Yang, Yiwei and Liu, Anthony Z. and Wolfe, Robert and
               Caliskan, Aylin and Howe, Bill},
  title     = {Label-Efficient Group Robustness via Out-of-Distribution
               Concept Curation},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer
               Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2024},
  pages     = {12426--12434}
}