The chapter begins by defining machine learning (ML) model robustness as the capacity to maintain stable predictive performance under variations in the input data. Robustness is distinguished from generalizability: it concerns sustaining performance under dynamic environmental conditions, not merely on novel but in-distribution data.
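To make this notion concrete, here is a minimal, self-contained sketch (not from the chapter) that measures robustness as the accuracy gap between clean and perturbed versions of the same test inputs; the model, data, and noise scale are all illustrative assumptions.

```python
# Minimal robustness check: compare accuracy on clean inputs against
# accuracy on perturbed versions of the same inputs. All names here
# are illustrative, not from the chapter.
import numpy as np

rng = np.random.default_rng(0)

def predict(x):
    # Stand-in for a trained model: a fixed linear decision rule.
    return (x @ np.array([1.0, -1.0]) > 0).astype(int)

# Synthetic "clean" test set and its labels.
x_clean = rng.normal(size=(1000, 2))
y_true = (x_clean @ np.array([1.0, -1.0]) > 0).astype(int)

# "Shifted" test set: the same points under additive Gaussian noise,
# simulating changed input conditions at serving time.
x_shifted = x_clean + rng.normal(scale=0.5, size=x_clean.shape)

acc_clean = (predict(x_clean) == y_true).mean()
acc_shifted = (predict(x_shifted) == y_true).mean()

# A simple robustness indicator: how much performance degrades under shift.
print(f"clean accuracy:   {acc_clean:.3f}")
print(f"shifted accuracy: {acc_shifted:.3f}")
print(f"robustness gap:   {acc_clean - acc_shifted:.3f}")
```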
Robustness is identified as a core requirement for trustworthy AI systems, interacting with other key properties such as safety, fairness, and explainability. The chapter also discusses the complementary roles of uncertainty quantification and out-of-distribution (OOD) detection in enabling robust ML.
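As one concrete illustration of OOD detection, the sketch below implements the maximum softmax probability baseline (Hendrycks & Gimpel, 2017), flagging inputs whose top softmax confidence falls below a threshold; the logits and threshold are illustrative assumptions, and the chapter itself may cover different detectors.

```python
# Maximum softmax probability (MSP) baseline for OOD detection:
# flag inputs whose top class confidence is low.
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def is_ood(logits, threshold=0.7):
    # Low confidence on every class suggests the input may lie
    # outside the training distribution. Threshold is illustrative.
    confidence = softmax(logits).max(axis=-1)
    return confidence < threshold

# One confident (in-distribution-looking) input vs. one with
# near-uniform class scores (OOD-looking).
logits = np.array([[4.0, 0.1, -2.0],   # confident -> kept
                   [0.2, 0.1, 0.15]])  # diffuse   -> flagged
print(is_ood(logits))  # [False  True]
```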
The chapter then examines the key challenges impeding ML robustness: data bias leading to train-serving skew, the double-edged sword of model complexity, and the underspecification of ML pipelines.
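One common way to surface train-serving skew in practice is a two-sample test on individual feature distributions; the sketch below uses SciPy's Kolmogorov-Smirnov test on synthetic data, with the drift magnitude and significance level chosen purely for illustration.

```python
# Detect train-serving skew for one feature by comparing its training
# distribution against its serving-time distribution with a two-sample
# Kolmogorov-Smirnov test. Data and significance level are illustrative.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Feature values seen at training time vs. at serving time,
# where the serving distribution has drifted (mean shift).
feature_train = rng.normal(loc=0.0, scale=1.0, size=5000)
feature_serve = rng.normal(loc=0.4, scale=1.0, size=5000)

stat, p_value = ks_2samp(feature_train, feature_serve)
if p_value < 0.01:
    print(f"skew detected (KS stat={stat:.3f}, p={p_value:.2e})")
else:
    print("no significant skew detected")
```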
The robustness assessment techniques covered include adversarial attacks (white-box, black-box, and physical), non-adversarial data shifts, and deep learning (DL) software testing methodologies. Adversarial attacks craft perturbations that fool the model, while non-adversarial shifts simulate naturally occurring distribution changes; DL software testing approaches systematically generate synthetic inputs to expose model brittleness.
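As an example of a white-box attack, the sketch below implements the one-step fast gradient sign method (FGSM) of Goodfellow et al. (2015), which perturbs each input along the sign of the loss gradient; the toy PyTorch model and perturbation budget are illustrative assumptions.

```python
# One-step FGSM: perturb inputs in the direction of the loss
# gradient's sign, within an L-infinity budget epsilon.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(16, 4)            # batch of clean inputs (toy data)
y = torch.randint(0, 3, (16,))    # their labels
epsilon = 0.1                     # perturbation budget (illustrative)

x_adv = x.clone().requires_grad_(True)
loss = loss_fn(model(x_adv), y)
loss.backward()

# Move each input by epsilon along the gradient's sign, increasing
# the loss and (ideally) flipping the model's prediction.
with torch.no_grad():
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    clean_acc = (model(x).argmax(1) == y).float().mean()
    adv_acc = (model(x_adv).argmax(1) == y).float().mean()

print(f"clean accuracy: {clean_acc:.2f}, adversarial accuracy: {adv_acc:.2f}")
```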
Finally, the chapter explores amelioration strategies for bolstering robustness, such as data-centric approaches (debiasing, augmentation), model-centric methods (transfer learning, adversarial training), and post-training techniques (ensembling, pruning, model repairs). The chapter concludes by highlighting the ongoing challenges and limitations in estimating and achieving ML robustness.
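To illustrate one model-centric amelioration, the following sketch shows a bare-bones adversarial training loop: each batch is replaced by its FGSM perturbation before the gradient step, approximating the inner maximization with a single step; the model, synthetic data, and hyperparameters are illustrative assumptions.

```python
# Bare-bones adversarial training: craft FGSM perturbations of the
# current batch and take the gradient step on those instead of the
# clean inputs. Everything below is an illustrative toy setup.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
epsilon = 0.1

def fgsm(x, y):
    # Generate adversarial versions of x for the current model state.
    x_adv = x.clone().requires_grad_(True)
    loss_fn(model(x_adv), y).backward()
    return (x_adv + epsilon * x_adv.grad.sign()).detach()

for step in range(100):
    x = torch.randn(32, 4)           # synthetic training batch
    y = (x[:, 0] > 0).long()         # synthetic labels
    x_adv = fgsm(x, y)               # inner maximization (one step)
    opt.zero_grad()                  # clear grads left by fgsm's backward
    loss = loss_fn(model(x_adv), y)  # outer minimization on adversarial batch
    loss.backward()
    opt.step()

print(f"final adversarial-batch loss: {loss.item():.3f}")
```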