Core Concept
A computationally efficient and differentially private algorithm for high-dimensional sparse model selection using the best subset selection approach.
Abstract
The paper considers the problem of model selection in a high-dimensional sparse linear regression model under privacy constraints. The authors propose a differentially private best subset selection (BSS) method with strong utility properties by adopting the well-known exponential mechanism for selecting the best model.
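As a concrete illustration, the sketch below samples a size-s support via a generic exponential mechanism, scoring each candidate model by its negative residual sum of squares. The function name, the score, and the `sensitivity` argument are illustrative assumptions rather than the paper's exact construction, and the brute-force enumeration of supports is only feasible for very small p; avoiding that enumeration is precisely what the paper's Metropolis-Hastings algorithm is for.

```python
import numpy as np
from itertools import combinations

def dp_best_subset_exponential(X, y, s, epsilon, sensitivity, seed=0):
    """Sample a size-s support via the exponential mechanism (sketch).

    Brute-force version: enumerates all C(p, s) supports, so it is only
    feasible for tiny p. The utility of a support is the negative
    residual sum of squares of the least-squares fit restricted to it;
    `sensitivity` stands in for an upper bound on how much that utility
    can change when one sample is replaced (placeholder, not derived here).
    """
    n, p = X.shape
    supports = list(combinations(range(p), s))
    utilities = np.empty(len(supports))
    for k, S in enumerate(supports):
        cols = list(S)
        beta_S, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
        utilities[k] = -np.sum((y - X[:, cols] @ beta_S) ** 2)
    # Exponential mechanism: P(S) ∝ exp(epsilon * u(S) / (2 * sensitivity)).
    scores = epsilon * utilities / (2.0 * sensitivity)
    scores -= scores.max()                      # numerical stability
    probs = np.exp(scores) / np.exp(scores).sum()
    idx = np.random.default_rng(seed).choice(len(supports), p=probs)
    return supports[idx]
```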
Key highlights:
- They design an efficient Metropolis-Hastings algorithm and establish that it enjoys polynomial mixing time to its stationary distribution (see the sketch after this list).
- Using this mixing property, they establish approximate differential privacy for the final estimates of the Metropolis-Hastings random walk, which also enjoys utility comparable to the exponential mechanism.
- The authors provide theoretical guarantees for the utility of the proposed methods, showing that accurate model recovery is possible under certain signal strength conditions.
- Illustrative experiments demonstrate the strong utility of the proposed algorithms compared to non-private best subset selection.
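The sketch below, referenced in the list above, shows one way such a Metropolis-Hastings random walk over size-s supports can be set up: a symmetric single-swap proposal with the exponential-mechanism weights from the previous sketch as the stationary distribution. The proposal, the utility function, the iteration count, and all names are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def dp_bss_metropolis_hastings(X, y, s, epsilon, sensitivity,
                               n_iters=5000, seed=0):
    """Metropolis-Hastings random walk over size-s supports (sketch).

    The target distribution is the exponential-mechanism distribution
    from the previous sketch; the proposal swaps one selected column
    for one unselected column, chosen uniformly at random. Because the
    proposal is symmetric, the acceptance ratio only involves utilities.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape

    def utility(cols):
        # Negative RSS of the least-squares fit restricted to `cols`.
        beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
        return -np.sum((y - X[:, cols] @ beta) ** 2)

    current = list(rng.choice(p, size=s, replace=False))
    u_curr = utility(current)
    for _ in range(n_iters):
        outside = [j for j in range(p) if j not in current]
        i = rng.integers(s)                       # position to swap out
        j = outside[rng.integers(len(outside))]   # column to swap in
        proposal = current.copy()
        proposal[i] = j
        u_prop = utility(proposal)
        # Metropolis acceptance for target ∝ exp(epsilon * u / (2 * sensitivity)).
        log_accept = epsilon * (u_prop - u_curr) / (2.0 * sensitivity)
        if np.log(rng.uniform()) < log_accept:
            current, u_curr = proposal, u_prop
    return sorted(current)
```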
Statistics
There exist positive constants r and x_max such that sup_{y ∈ Y} |y| ≤ r and sup_{x ∈ X} ∥x∥_∞ ≤ x_max.
The true parameter vector β satisfies ∥β∥_1 ≤ b_max.
The design matrix X satisfies the sparse Riesz condition with positive constants κ_- and κ_+.
The true sparsity level s satisfies s ≤ n/(log p).
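Written out together, these conditions read as follows; the sparse Riesz condition is shown in its usual eigenvalue-type form with a rank parameter q, which is an assumption about notation rather than a statement taken from the paper.

```latex
% Bounded responses and covariates, and an l1 bound on the true coefficients:
\sup_{y \in \mathcal{Y}} |y| \le r, \qquad
\sup_{x \in \mathcal{X}} \|x\|_\infty \le x_{\max}, \qquad
\|\beta\|_1 \le b_{\max}.
% Sparse Riesz condition (usual form; rank parameter q assumed):
\kappa_- \le \frac{\|X_A u\|_2^2}{n \|u\|_2^2} \le \kappa_+
\quad \text{for all supports } A \text{ with } |A| \le q,\;
u \in \mathbb{R}^{|A|} \setminus \{0\}.
% Sparsity of the true model:
s \le \frac{n}{\log p}.
```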
Quotes
"We propose a differentially private best subset selection method with strong utility properties by adopting the well-known exponential mechanism for selecting the best model."
"We propose an efficient Metropolis-Hastings algorithm and establish that it enjoys polynomial mixing time to its stationary distribution."
"Furthermore, we also establish approximate differential privacy for the final estimates of the Metropolis-Hastings random walk using its mixing property."