Core Concepts
The paper proposes IRSS, a framework that uses adversarial neural networks and multi-environment optimization to gradually separate the style distribution and spurious features from images, achieving out-of-distribution generalization without additional supervision such as domain labels.
Summary
This paper addresses out-of-distribution (OOD) generalization in the setting where style distribution shift and spurious features are both present and domain labels are unavailable.
The key insights are:
- The authors analyze public OOD benchmark datasets and identify two essential factors that lead to distribution shift: diverse styles and spurious features (i.e., objects outside the target class of interest).
- The authors propose a new Structural Causal Model (SCM) for image generation that explicitly captures style and spurious features, with no direct causal relationship between the image and the label, nor between the spurious feature and the style.
- Based on the proposed SCM, the authors develop a new framework called IRSS (Invariant Representation Learning via Decoupling Style and Spurious Features). IRSS aligns the style distribution and eliminates the influence of spurious features using two independent components, without requiring domain labels.
- IRSS outperforms traditional OOD methods and solves the problem of Invariant Risk Minimization (IRM) degradation, enabling the extraction of invariant features under distribution shift.
The authors conduct experiments on the benchmark datasets PACS, OfficeHome, and NICO, demonstrating the effectiveness of IRSS in achieving out-of-distribution generalization.
Statistics
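For readers unfamiliar with the IRM baseline mentioned above, the following is a minimal sketch of the standard IRMv1 penalty for a squared-error loss, written in plain Python. It is background on IRM only and is not the IRSS method, whose implementation details are not given in this summary. The function names (`irmv1_penalty`, `irm_objective`), the weight `lam`, and the toy environments are illustrative assumptions.

```python
def _mean(values):
    """Arithmetic mean of a non-empty list of floats."""
    return sum(values) / len(values)


def irmv1_penalty(preds, targets):
    """IRMv1 penalty for one environment under squared-error loss.

    The risk with a dummy scale w is R_e(w) = mean((w * f - y)^2).
    Its derivative at w = 1 is mean(2 * (f - y) * f); the penalty is
    the square of that derivative.
    """
    grad = _mean([2.0 * (f - y) * f for f, y in zip(preds, targets)])
    return grad ** 2


def irm_objective(envs, lam=1.0):
    """Sum over environments of (risk + lam * penalty).

    `envs` is a list of (preds, targets) pairs, one per environment.
    How the environments are obtained is assumed here, not specified.
    """
    total = 0.0
    for preds, targets in envs:
        risk = _mean([(f - y) ** 2 for f, y in zip(preds, targets)])
        total += risk + lam * irmv1_penalty(preds, targets)
    return total


# Hypothetical toy data: two environments with scalar predictions.
toy_envs = [
    ([1.0, 2.0], [1.0, 2.0]),  # perfect fit: zero risk, zero penalty
    ([1.0, 1.0], [0.0, 0.0]),  # risk 1.0, gradient 2.0, penalty 4.0
]
objective = irm_objective(toy_envs, lam=1.0)
```

Note that a predictor outputting all zeros has zero penalty regardless of its risk, since the gradient term is multiplied by the prediction itself. This is one simple way the penalty alone can fail to be informative.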
The PACS dataset consists of four domains (Art Painting, Cartoon, Photo, and Sketch) with seven common categories.
The OfficeHome dataset consists of four domains (Art, Clipart, Product, and RealWorld) with 65 different categories.
The NICO dataset contains 19 classes with 9 or 10 different contexts (object poses, positions, backgrounds, movement patterns, etc.).
Quotes
"This form of distribution discrepancy in OOD is called style distribution shift."
"This form of distribution discrepancy in OOD is called spurious feature shift."