Core Concepts
Algorithms for repairing unfairness in training data using optimal transport.
Abstract
The paper addresses the need for algorithms that repair unfairness in training data, defining fairness as conditional independence between protected attributes (S) and features (X), given unprotected attributes (U). It introduces an optimal transport (OT) method in which repair plans are learned from a small, labeled subset of the data (the research data) and then applied to the much larger archival data. Experimental results demonstrate effective repair of off-sample, labeled data.
Index Terms:
AI fairness
Optimal transport
Data repair
Conditional independence
Mixture modeling
Kernel density estimation
Sections:
Introduction
Importance of fairness in decision-making.
Fairness as Conditional Independence
Defining fairness and metrics for subgroup fairness.
Optimal Transport for Data Repair
Using OT to establish conditional independence between features and protected attributes.
Off-Sample Data Repair
Framework for repairing archival data using OT repair plans trained on research data.
Simulation and Real-Data Studies
Validation of the method on simulated and real-world data sets.
Discussion
Considerations, assumptions, and future directions.
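The repair idea outlined in the sections above can be illustrated with a minimal one-dimensional sketch. It is not the paper's algorithm: it assumes a single scalar feature, binary S and U, and it replaces the paper's OT plans with the closed-form 1-D solution (quantile matching to a Wasserstein barycenter whose weights are the group sizes). The function names, the synthetic data and the choice of weights are all hypothetical. The maps are fitted on a small research set and then applied to a larger archival set drawn from the same distribution, which mirrors the stationarity assumption behind off-sample repair.

```python
import random
from bisect import bisect_right


def empirical_cdf(sorted_vals, x):
    """Fraction of the sample that is <= x."""
    return bisect_right(sorted_vals, x) / len(sorted_vals)


def quantile(sorted_vals, q):
    """Linearly interpolated empirical quantile, q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def fit_repair(research):
    """Group research features by (u, s); research is a list of (x, s, u)."""
    groups = {}
    for x, s, u in research:
        groups.setdefault((u, s), []).append(x)
    return {key: sorted(vals) for key, vals in groups.items()}


def repair(x, s, u, groups):
    """Send x to the 1-D barycenter of the s-conditional laws within stratum u."""
    q = empirical_cdf(groups[(u, s)], x)
    stratum = [vals for (uu, _), vals in groups.items() if uu == u]
    total = sum(len(vals) for vals in stratum)
    # In 1-D, the barycenter's quantile function is the weighted average
    # of the group quantile functions (weights: group sizes, an assumption).
    return sum(len(vals) / total * quantile(vals, q) for vals in stratum)


def sample(n, rng):
    """Synthetic data in which S shifts X by 1 within every U stratum."""
    rows = []
    for _ in range(n):
        s, u = rng.randint(0, 1), rng.randint(0, 1)
        rows.append((rng.gauss(s + 2 * u, 1.0), s, u))
    return rows


def mean_gap(rows, u):
    """Absolute difference of the s=1 and s=0 feature means within stratum u."""
    m = [[x for x, s, uu in rows if uu == u and s == g] for g in (0, 1)]
    return abs(sum(m[1]) / len(m[1]) - sum(m[0]) / len(m[0]))


rng = random.Random(0)
research = sample(500, rng)    # small labeled research set
archival = sample(5000, rng)   # larger off-sample archival set
groups = fit_repair(research)
repaired = [(repair(x, s, u, groups), s, u) for x, s, u in archival]
gap_before = max(mean_gap(archival, u) for u in (0, 1))
gap_after = max(mean_gap(repaired, u) for u in (0, 1))
print(f"max mean gap before: {gap_before:.3f}, after: {gap_after:.3f}")
```

Equal group means within each U stratum are only a necessary condition for (X ⊥⊥ S) | U, so the printed gap is a quick sanity check rather than a test of conditional independence.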
Stats
"nR ≡500 research (on-sample) points"
"nA ≡5000 archival (off-sample) points"
"nQ = 250 to ensure high resolution in interpolated supports"
Quotes
"We define U-conditional fairness as (X ⊥⊥S)|U."
"Disparate impact is often adopted as the proxy for quantifying the extent to which Definition 2.2 is met."