Core Concepts
The article proposes Triple Component Matrix Factorization (TCMF), a principled method that provably separates shared low-rank features, unique low-rank features, and sparse noise in noisy multivariate data, even when the number of parameters to estimate is roughly three times the number of observations.
Abstract
The article addresses the problem of extracting common and unique features from noisy data, where N observation matrices from N different but related sources are corrupted by sparse and potentially gross noise. The authors propose an alternating minimization algorithm, Triple Component Matrix Factorization (TCMF), to recover the three components (shared low-rank features, unique low-rank features, and sparse noise) exactly.
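As a reading aid, the three-component structure described above can be written as a per-source decomposition. The symbols below are illustrative assumptions and are not taken from this summary: M_i is the observation matrix of source i, U_g is a basis of shared features, U_{l,i} is a basis of features unique to source i, V_{g,i} and V_{l,i} are the corresponding coefficient matrices, and S_i is the sparse noise.

```latex
M_i \;=\; \underbrace{U_g V_{g,i}^{\top}}_{\text{shared low-rank}}
      \;+\; \underbrace{U_{l,i} V_{l,i}^{\top}}_{\text{unique low-rank}}
      \;+\; \underbrace{S_i}_{\text{sparse noise}},
\qquad i = 1, \dots, N .
```

Under this reading, every observed matrix is explained by three unknown matrices of the same size, which is one way to see why the number of unknowns is about three times the number of observations.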
The key highlights are:
The authors identify a set of identifiability conditions (sparsity, incoherence, and misalignment) that are sufficient for the almost exact recovery of the three components.
TCMF solves a constrained nonconvex nonsmooth optimization problem and uses existing methods for separating common and unique components as subroutines. The bulk of the computation in TCMF can be distributed.
The authors provide a convergence guarantee for TCMF, showing that under the identifiability conditions, the algorithm converges linearly to the ground truth. This is achieved by representing the solution as a Taylor-like series, which makes it possible to bound the estimation error at each iteration.
Numerical experiments in video segmentation and anomaly detection showcase the superior feature extraction abilities of TCMF compared to existing methods that do not account for sparse noise.
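The alternating structure summarized in the highlights can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each observation is a shared low-rank part plus a unique low-rank part plus sparse spikes, alternates between hard-thresholding the residual and refitting the low-rank parts, and uses a hypothetical stand-in subroutine, `split_shared_unique`, based on averaging column-space projectors. The threshold schedule (`lam0`, `decay`, `lam_min`), the synthetic data generator, and the choice of unique bases orthogonal to the shared basis are all assumptions made for illustration.

```python
import numpy as np


def split_shared_unique(Xs, r_shared, r_unique):
    """Hypothetical stand-in subroutine (not from the article): split each
    denoised matrix into a shared and a unique low-rank part."""
    d = Xs[0].shape[0]
    avg_proj = np.zeros((d, d))
    for X in Xs:
        U, _, _ = np.linalg.svd(X, full_matrices=False)
        U = U[:, : r_shared + r_unique]
        avg_proj += (U @ U.T) / len(Xs)
    # Directions present in every source keep an eigenvalue close to 1.
    _, vecs = np.linalg.eigh(avg_proj)  # eigenvalues in ascending order
    U_g = vecs[:, -r_shared:]
    shared, unique = [], []
    for X in Xs:
        G = U_g @ (U_g.T @ X)  # projection onto the shared subspace
        U, s, Vt = np.linalg.svd(X - G, full_matrices=False)
        unique.append((U[:, :r_unique] * s[:r_unique]) @ Vt[:r_unique])
        shared.append(G)
    return shared, unique


def hard_threshold(R, lam):
    """Keep entries whose magnitude exceeds lam, zero out the rest."""
    return np.where(np.abs(R) > lam, R, 0.0)


def alternating_sketch(Ms, r_shared, r_unique,
                       lam0=10.0, decay=0.8, lam_min=3.0, iters=30):
    """Alternate between a sparse update and a low-rank update.
    The threshold schedule is an assumption, not the article's."""
    shared = [np.zeros_like(M) for M in Ms]
    unique = [np.zeros_like(M) for M in Ms]
    sparse = [np.zeros_like(M) for M in Ms]
    for t in range(iters):
        lam = max(lam0 * decay**t, lam_min)
        # Sparse step: large residual entries are attributed to noise.
        sparse = [hard_threshold(M - G - L, lam)
                  for M, G, L in zip(Ms, shared, unique)]
        # Low-rank step: refit shared and unique parts on denoised data.
        denoised = [M - S for M, S in zip(Ms, sparse)]
        shared, unique = split_shared_unique(denoised, r_shared, r_unique)
    return shared, unique, sparse


def make_synthetic(N=3, d=50, n=60, r_shared=2, r_unique=2,
                   frac=0.02, spike=20.0, seed=0):
    """Synthetic sources: shared + unique low-rank parts + sparse spikes."""
    rng = np.random.default_rng(seed)
    U_g, _ = np.linalg.qr(rng.standard_normal((d, r_shared)))
    Ms, Gs, Ls, Ss = [], [], [], []
    for _ in range(N):
        A = rng.standard_normal((d, r_unique))
        A -= U_g @ (U_g.T @ A)  # unique basis orthogonal to shared basis
        U_l, _ = np.linalg.qr(A)
        G = U_g @ rng.standard_normal((r_shared, n))
        L = U_l @ rng.standard_normal((r_unique, n))
        mask = rng.random((d, n)) < frac
        S = mask * spike * rng.choice([-1.0, 1.0], size=(d, n))
        Ms.append(G + L + S)
        Gs.append(G)
        Ls.append(L)
        Ss.append(S)
    return Ms, Gs, Ls, Ss


if __name__ == "__main__":
    Ms, Gs, Ls, Ss = make_synthetic()
    shared, unique, sparse = alternating_sketch(Ms, r_shared=2, r_unique=2)
    for i, (G_hat, G) in enumerate(zip(shared, Gs)):
        err = np.linalg.norm(G_hat - G) / np.linalg.norm(G)
        print(f"source {i}: relative error of shared part = {err:.2e}")
```

The low-rank step runs one SVD per source before the projectors are averaged, so in this sketch the per-source work could be done in parallel, which loosely mirrors the remark that most of the computation can be distributed. The sketch only targets an easy regime where the spikes are much larger than the low-rank entries. It carries none of the guarantees described in the article.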