Key Concepts
A novel probabilistic approach for enhancing the robustness of trigger set-based watermarking techniques against model stealing attacks.
Summary
The paper introduces a probabilistic approach for enhancing the robustness of trigger set-based watermarking against model stealing attacks. The key idea is to compute a parametric set of proxy models that mimics the set of possible stolen models, and then to verify that the trigger set transfers to these proxy models. This guarantees, with high probability, that the trigger set also transfers to the stolen models, even when a stolen model does not belong to the proxy set.
The authors first describe how trigger set candidates are computed as convex combinations of pairs of points from a hold-out dataset. They then introduce the parametric set of proxy models and the procedure for verifying that the trigger set transfers to them, and they derive probabilistic guarantees on the transferability of the trigger set from the source model to the stolen models.
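The two steps above can be sketched in miniature. The sketch below uses a toy linear classifier and NumPy; the choice of a fixed mixing coefficient, the parameterization of proxies as random l2-bounded perturbations of the source parameters, and the 95% agreement threshold are all illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict(x, w):
    # Toy linear classifier standing in for the source network (assumption).
    return np.argmax(x @ w, axis=-1)

# Synthetic hold-out data: 20 points, 4 features; 3 classes.
X_hold = rng.normal(size=(20, 4))
w_src = rng.normal(size=(4, 3))

def trigger_candidates(X, n_pairs=10, alpha=0.5):
    # Convex combinations of pairs of hold-out points (alpha is assumed fixed).
    idx = rng.choice(len(X), size=(n_pairs, 2))
    return alpha * X[idx[:, 0]] + (1 - alpha) * X[idx[:, 1]]

def proxy_models(w, n_models=50, eps=0.1):
    # Proxy set: source parameters perturbed within an l2 ball of radius eps
    # (one plausible parameterization; an assumption here).
    out = []
    for _ in range(n_models):
        d = rng.normal(size=w.shape)
        d *= eps * rng.uniform() / np.linalg.norm(d)
        out.append(w + d)
    return out

# Keep only candidates whose source-model label transfers to nearly all proxies.
cands = trigger_candidates(X_hold)
labels = predict(cands, w_src)
proxies = proxy_models(w_src)
agree = np.mean([predict(cands, wp) == labels for wp in proxies], axis=0)
trigger_set = cands[agree >= 0.95]
```

The filtering step mirrors the verification idea: a candidate is admitted to the trigger set only if its source-model label is preserved across the proxy set, which is what underwrites the probabilistic transferability claim.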
The experimental results show that the proposed approach outperforms state-of-the-art watermarking techniques in terms of source-model accuracy, surrogate-model accuracy on the trigger set, and robustness to various model stealing attacks, including soft-label, hard-label, and regularization-based attacks.
The authors also examine the integrity of the method, demonstrating that it can distinguish stolen models from independent (not stolen) ones. Their experiments show that the least similar independent models are those trained on different datasets, regardless of architecture.
Statistics
The l2-norm of the difference between the parameters of the source model and the surrogate models is reported in Table 1, showing that the surrogate models do not belong to the proxy set used in the proposed approach.
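The statistic behind this comparison can be reproduced in a few lines. This is a minimal sketch assuming the models' parameters are available as lists of arrays; the toy parameter shapes and the perturbation used to fake a "surrogate" are illustrative, and the radius `eps` mentioned in the comment is an assumed hyperparameter of the proxy set.

```python
import numpy as np

def l2_param_distance(params_a, params_b):
    # Flatten and concatenate all parameter tensors, then take the l2 norm
    # of the difference: one way to compute the statistic reported in Table 1.
    va = np.concatenate([p.ravel() for p in params_a])
    vb = np.concatenate([p.ravel() for p in params_b])
    return np.linalg.norm(va - vb)

rng = np.random.default_rng(1)
src = [rng.normal(size=(4, 3)), rng.normal(size=3)]      # toy source parameters
sur = [p + 0.5 * rng.normal(size=p.shape) for p in src]  # toy surrogate parameters

dist = l2_param_distance(src, sur)
# A surrogate falls outside the proxy set whenever dist exceeds the l2 radius
# eps used to construct the proxies (eps is an assumed hyperparameter).
```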
Quotes
"The key idea of our method is to compute the trigger set, which is transferable between the source model and the set of proxy models with a high probability."
"We analyze the probability that a given trigger set is transferable to the set of proxy models that mimic the stolen models."
"We experimentally show that, even if the stolen model does not belong to the set of proxy models, the trigger set is still transferable to the stolen model."