Core Concepts
The authors propose novel algorithms for distributed estimation and learning that enable networked agents to collectively estimate unknown statistical properties of their privately observed samples while preserving the privacy of their signals and network neighborhoods.
Abstract
The paper introduces differentially private distributed estimation and learning algorithms that enable networked agents to collectively estimate the expected value of sufficient statistics from their privately observed samples, while preserving the privacy of their signals and network neighborhoods.
The key highlights and insights are:
The authors consider a network of n agents, where each agent observes i.i.d. samples from a common exponential family distribution with an unknown parameter θ. The agents aim to collectively estimate the expected value of the sufficient statistic, mθ = E[ξ(s)], by exchanging information with their neighbors.
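To make the setup concrete, here is a minimal non-private sketch of the estimation task, under assumed details not taken from the paper: a ring network of agents with lazy Metropolis weights, Gaussian signals (for which the sufficient statistic is ξ(s) = s), and repeated neighbor averaging with a doubly stochastic matrix A driving every agent toward the network-wide mean of the statistics.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: n agents each observe one i.i.d. Gaussian signal;
# for a Gaussian with known variance, the sufficient statistic is xi(s) = s.
n = 8
theta = 2.0
signals = rng.normal(theta, 1.0, size=n)
xi = signals.copy()  # xi(s) = s for this example

# Doubly stochastic mixing matrix for a ring network (lazy, weight 1/3
# on self and each of the two neighbors).
A = np.zeros((n, n))
for i in range(n):
    A[i, (i - 1) % n] = A[i, (i + 1) % n] = A[i, i] = 1.0 / 3.0

# Repeated neighbor averaging: every agent's estimate converges to the
# network-wide mean of the sufficient statistics, (1/n) * sum_i xi(s_i).
nu = xi.copy()
for _ in range(300):
    nu = A @ nu
```

The privacy question the paper addresses is exactly what this sketch ignores: each exchanged value `nu` leaks information about the agents' private signals.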
The authors propose two privacy protection mechanisms: Signal DP, which protects the privacy of the agents' signals, and Network DP, which protects the privacy of the agents' signals and their local neighborhood estimates.
For the offline minimum variance unbiased estimation (MVUE) task, the authors show that the optimal noise distribution that minimizes the convergence time under ε-DP constraints is the Laplace distribution with parameters related to the global sensitivity of the sufficient statistic ξ(·) and the network structure.
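The DP mechanism underlying this result can be sketched as the standard Laplace mechanism: before sharing, an agent perturbs its statistic with zero-mean Laplace noise of scale ∆/ε. The function name and parameter values below are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)

def privatize(xi_value, delta, epsilon, rng):
    """Release a statistic under epsilon-DP via the Laplace mechanism.

    delta is the global sensitivity of xi(.); smaller epsilon means
    stronger privacy and therefore larger noise scale delta / epsilon.
    """
    return xi_value + rng.laplace(loc=0.0, scale=delta / epsilon)

# The cost of privacy in this sketch: Laplace(0, delta/eps) noise has
# variance 2 * (delta / eps)**2, which inflates the estimator's variance
# and hence its convergence time.
delta, epsilon = 1.0, 0.5
releases = np.array([privatize(0.0, delta, epsilon, rng)
                     for _ in range(50_000)])
```

The variance formula makes the privacy-accuracy trade-off explicit: halving ε quadruples the noise variance each agent must average away.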
For the online learning of the expected value mθ, the authors derive algorithms that achieve fast convergence rates while satisfying ε-DP constraints. The optimal noise distributions are again shown to be Laplace, with parameters depending on the global sensitivity and the network structure.
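A minimal sketch of such an online private update, under assumptions not taken from the paper (ring topology, Gaussian signals with ξ(s) = s, a plain 1/t step size): each round, agents mix their neighbors' estimates through a doubly stochastic matrix and step toward a fresh Laplace-perturbed observation, so the injected DP noise averages out over time.

```python
import numpy as np

rng = np.random.default_rng(2)

n, T = 8, 2000
theta, delta, epsilon = 2.0, 1.0, 0.5

# Ring network with lazy weights (doubly stochastic).
A = np.zeros((n, n))
for i in range(n):
    A[i, (i - 1) % n] = A[i, (i + 1) % n] = A[i, i] = 1.0 / 3.0

nu = np.zeros(n)  # each agent's running estimate of m_theta = E[xi(s)]
for t in range(1, T + 1):
    s = rng.normal(theta, 1.0, size=n)              # fresh i.i.d. signals
    noise = rng.laplace(0.0, delta / epsilon, n)    # per-round DP noise
    mixed = A @ nu                                  # consensus step
    nu = mixed + (s + noise - mixed) / t            # 1/t innovation step
```

Because the noise is zero-mean, the decaying step size lets every agent converge to m_θ despite never seeing an unperturbed value from a neighbor.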
The authors provide theoretical guarantees on the total error, which is decomposed into the cost of privacy (due to the DP noise) and the cost of decentralization (due to the distributed nature of the problem).
Experiments on real-world power grid and household electricity consumption datasets demonstrate the effectiveness of the proposed algorithms in achieving (ε, δ)-DP while not significantly sacrificing convergence compared to non-private baselines.
Stats
∆ = max_{s∈S} |dξ(s)/ds| is the global sensitivity of the sufficient statistic ξ(·).
M_n = max_{i∈[n]} |ξ(s_i)| is the maximum absolute value of the sufficient statistics across agents.
β⋆ = max{λ_2(A), |λ_n(A)|} is the second-largest eigenvalue modulus of the doubly stochastic adjacency matrix A; the spectral gap is 1 − β⋆.
a = max_{i≠j} a_ij is the maximum off-diagonal entry of the adjacency matrix A.
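These network quantities are straightforward to compute for a concrete matrix. The sketch below uses a small lazy-ring mixing matrix as an assumed example and extracts β⋆ (the second-largest eigenvalue modulus) and a (the largest off-diagonal entry).

```python
import numpy as np

# Illustrative 6-agent lazy ring: doubly stochastic and symmetric.
n = 6
A = np.zeros((n, n))
for i in range(n):
    A[i, (i - 1) % n] = A[i, (i + 1) % n] = A[i, i] = 1.0 / 3.0

eigs = np.sort(np.linalg.eigvalsh(A))[::-1]   # real eigenvalues, descending
beta_star = max(eigs[1], abs(eigs[-1]))       # second-largest eigenvalue modulus
a = A[~np.eye(n, dtype=bool)].max()           # largest off-diagonal entry
```

For a connected, aperiodic network, the top eigenvalue is 1 and β⋆ < 1; the closer β⋆ is to 1 (the smaller the spectral gap 1 − β⋆), the slower information mixes and the longer convergence takes.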
Quotes
"The noise that minimizes the convergence time to the best estimates is the Laplace noise, with parameters corresponding to each agent's sensitivity to their signal and network characteristics."
"Our algorithms are amenable to dynamic topologies and balancing privacy and accuracy trade-offs."