Key Idea
A bias mitigation method based on multi-task learning that uses Monte-Carlo dropout and Pareto optimality to optimize accuracy and fairness while improving model explainability, without using sensitive information.
Abstract
The paper addresses the need for generalizable bias mitigation techniques in machine learning, driven by growing concerns about fairness and discrimination in data-driven decision-making. While existing methods have succeeded in specific cases, they often lack generalizability and cannot be easily applied to different data types or models. Additionally, the trade-off between accuracy and fairness remains a fundamental tension.
To address these issues, the authors propose a bias mitigation method based on multi-task learning, utilizing Monte-Carlo dropout and Pareto optimality. This method optimizes accuracy and fairness while improving the model's explainability without using sensitive information.
The authors test the method on three datasets from different domains (in-hospital mortality, finance, and stress prediction) and show that it can deliver the desired trade-off between model fairness and performance. This allows the trade-off to be tuned for specific domains where one metric matters more than another.
The key highlights of the proposed method are:
- Utilizes multi-task learning to predict the target label and a protected label (see the first sketch after this list)
- Employs Monte-Carlo dropout to estimate model uncertainty, which is hypothesized to correlate with reduced bias
- Implements non-dominated sorting to obtain the Pareto optimal set of models that balance performance and fairness (see the second sketch after this list)
- Demonstrates improved fairness metrics (disparate impact ratio, difference in false negatives/positives) compared to baseline and reweighing techniques
- Maintains performance while enhancing fairness, allowing for tuning based on domain-specific priorities
- Provides a generalizable framework to address bias mitigation and the fairness-performance trade-off in machine learning
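
To make the multi-task and Monte-Carlo dropout points concrete, here is a minimal PyTorch sketch of a shared encoder with two heads (one for the target label, one for the protected label) where dropout stays active at prediction time so repeated forward passes yield an uncertainty estimate. The architecture, layer sizes, and head names are illustrative assumptions, not the paper's exact setup.

```python
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    """Shared encoder with two heads: the target task and the protected attribute.
    Layer sizes are illustrative; the paper's exact architecture may differ."""
    def __init__(self, n_features: int, dropout_p: float = 0.5):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 64),
            nn.ReLU(),
            nn.Dropout(dropout_p),   # kept active at inference for MC dropout
            nn.Linear(64, 32),
            nn.ReLU(),
            nn.Dropout(dropout_p),
        )
        self.target_head = nn.Linear(32, 1)     # e.g. mortality / income label
        self.protected_head = nn.Linear(32, 1)  # e.g. sex, race, or age group

    def forward(self, x):
        z = self.encoder(x)
        return self.target_head(z), self.protected_head(z)

def mc_dropout_predict(model, x, n_samples: int = 50):
    """Monte-Carlo dropout: keep dropout enabled, run several stochastic forward
    passes, and use the spread of predictions as an uncertainty estimate."""
    model.train()  # enables dropout; freeze any batch-norm layers in practice
    with torch.no_grad():
        preds = torch.stack(
            [torch.sigmoid(model(x)[0]) for _ in range(n_samples)]
        )
    return preds.mean(dim=0), preds.std(dim=0)  # predictive mean and uncertainty
```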
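
The Pareto selection step can likewise be sketched as a plain non-dominated sort over candidate models scored on one performance metric and one fairness metric. This is a generic implementation under assumed metric conventions (higher is better for both), not the authors' code.

```python
def pareto_front(models):
    """Return the non-dominated (Pareto optimal) subset of candidate models.

    `models` is a list of dicts with a 'performance' score and a 'fairness'
    score, both assumed to be higher-is-better (e.g. fairness as closeness
    of the disparate impact ratio to 1)."""
    front = []
    for i, a in enumerate(models):
        dominated = any(
            b["performance"] >= a["performance"]
            and b["fairness"] >= a["fairness"]
            and (b["performance"] > a["performance"] or b["fairness"] > a["fairness"])
            for j, b in enumerate(models) if j != i
        )
        if not dominated:
            front.append(a)
    return front

# Pick the trade-off that suits the domain (favour fairness or performance).
candidates = [
    {"name": "m1", "performance": 0.86, "fairness": 0.70},
    {"name": "m2", "performance": 0.84, "fairness": 0.92},
    {"name": "m3", "performance": 0.83, "fairness": 0.88},  # dominated by m2
]
print([m["name"] for m in pareto_front(candidates)])  # ['m1', 'm2']
```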
Statistics
On the ADULT dataset, the baseline model had a disparate impact ratio (DIR) above 2 for age, below 0.1 for sex, and around 0.5 for race.
On the MIMIC-III dataset, the baseline model was biased only by marital status, with a DIR of 1.308.
On the SNAPSHOT dataset, the baseline model was biased only by race for the evening-sad-happy label, with a DIR of 0.78.
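
For context on these numbers, the disparate impact ratio compares positive-prediction rates between the unprivileged and privileged groups, with 1 indicating parity and the common "80% rule" flagging values below 0.8. The helper below is a minimal sketch of that calculation under an assumed grouping convention, not the paper's evaluation code.

```python
import numpy as np

def disparate_impact_ratio(y_pred, unprivileged):
    """DIR = P(y_pred = 1 | unprivileged) / P(y_pred = 1 | privileged).

    `unprivileged` is a boolean array marking the unprivileged group. A value
    of 1 means parity; values far below or above 1 indicate bias. The grouping
    convention is an assumption for illustration."""
    y_pred = np.asarray(y_pred)
    unprivileged = np.asarray(unprivileged, dtype=bool)
    rate_unpriv = y_pred[unprivileged].mean()
    rate_priv = y_pred[~unprivileged].mean()
    return rate_unpriv / rate_priv

# Toy example: 40% positive rate for the unprivileged group vs 60% for the
# privileged group gives a DIR of about 0.67.
y_hat = np.array([1, 0, 0, 1, 1, 1, 0, 0, 1, 0])
unpriv = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 1], dtype=bool)
print(round(disparate_impact_ratio(y_hat, unpriv), 2))
```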
Quotes
"Negative bias can be introduced into the machine pipeline in two main ways, through the data or the algorithm itself."
"Despite their seeming success in specific cases, there is a recurring trend of bias mitigation methods lacking generalizability."
"With the framework we introduce in this paper, we aim to enhance the fairness-performance trade-off and offer a solution to bias mitigation methods' generalizability issues in machine learning."