Basic Concepts
Semi-supervised robust learning offers a significant benefit over the supervised setting: the labeled sample complexity is sharply characterized by a different complexity measure (VCU) rather than the RSU dimension that governs supervised robust learning.
Abstract
The paper studies the problem of learning an adversarially robust predictor in the semi-supervised PAC model. The key findings are:
- In the simple case where the support of the marginal distribution is known, the labeled sample complexity is Θ(VCU(H)/ε + log(1/δ)/ε).
- In the general semi-supervised setting, the authors present a generic algorithm (GRASS) that applies to both the realizable and the agnostic setting.
For the realizable case:
- The labeled sample complexity is Õ(VCU(H)/ε + log(1/δ)/ε), which can be significantly smaller than the Ω(RSU(H)/ε + log(1/δ)/ε) lower bound for supervised robust learning.
- The unlabeled sample complexity matches the supervised robust learning bound of Õ(VC(H)VC*/ε + log(1/δ)/ε).
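The realizable-case bounds above can be collected in display form (a restatement only; Õ and Ω̃ suppress logarithmic factors, and the names m_lab, m_unlab for the labeled and unlabeled sample sizes are ours, not the paper's):

```latex
% Realizable semi-supervised robust learning (GRASS)
\begin{align*}
  m_{\text{lab}}(\varepsilon,\delta)
    &= \tilde{O}\!\left(\frac{\mathrm{VCU}(\mathcal{H})}{\varepsilon}
       + \frac{\log(1/\delta)}{\varepsilon}\right), \\
  m_{\text{unlab}}(\varepsilon,\delta)
    &= \tilde{O}\!\left(\frac{\mathrm{VC}(\mathcal{H})\,\mathrm{VC}^{*}}{\varepsilon}
       + \frac{\log(1/\delta)}{\varepsilon}\right), \\
  \text{supervised lower bound:}\quad
  m(\varepsilon,\delta)
    &= \Omega\!\left(\frac{\mathrm{RSU}(\mathcal{H})}{\varepsilon}
       + \frac{\log(1/\delta)}{\varepsilon}\right).
\end{align*}
```

The gap between the first and third lines is the source of the semi-supervised benefit: VCU(H) can be much smaller than RSU(H), while the unlabeled budget matches the supervised bound.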
For the agnostic case:
- If an error of 3η + ε is allowed (where η is the minimal agnostic error), the labeled sample complexity is Õ(VCU(H)/ε^2 + log(1/δ)/ε^2).
- Obtaining an error of η + ε requires Ω(RSU(H)/ε^2 + log(1/δ)/ε^2) labeled examples, matching the supervised lower bound.
- The authors also show that for any γ > 0, there exists a hypothesis class where using only O(VCU) labeled examples leads to an error of (3/2 - γ)η + ε.
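The agnostic trade-off can likewise be summarized as (again a restatement of the bullets above, with η the minimal agnostic robust error and m_lab our notation for the labeled sample size):

```latex
% Agnostic semi-supervised robust learning
\begin{align*}
  \text{target error } 3\eta+\varepsilon:\quad
  m_{\text{lab}} &= \tilde{O}\!\left(\frac{\mathrm{VCU}(\mathcal{H})}{\varepsilon^{2}}
      + \frac{\log(1/\delta)}{\varepsilon^{2}}\right), \\
  \text{target error } \eta+\varepsilon:\quad
  m_{\text{lab}} &= \Omega\!\left(\frac{\mathrm{RSU}(\mathcal{H})}{\varepsilon^{2}}
      + \frac{\log(1/\delta)}{\varepsilon^{2}}\right).
\end{align*}
```

In words: settling for a constant-factor blow-up in the error (3η instead of η) buys the cheaper VCU-controlled labeled complexity, whereas matching the optimal error η + ε forces the supervised RSU lower bound.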
Together, these results demonstrate that semi-supervised robust learning can be significantly cheaper in labeled data than the supervised setting, with the labeled sample complexity controlled by the VCU dimension rather than the RSU dimension.