Core Concepts
This paper presents new theoretical results that bound the generalization gap of neural belief propagation (NBP) decoders, defined as the difference between the decoder's empirical and expected bit-error rates. The bounds make explicit how the generalization gap depends on the decoder complexity, the code parameters, the number of decoding iterations, and the size of the training dataset.
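Read schematically, these dependencies take the following form. This is an illustrative rendering rather than the paper's exact expression: m denotes the training-set size, n the blocklength, d_v the variable-node degree, and T the number of decoding iterations.

```latex
% Schematic only -- constants and lower-order terms are omitted.
\[
  \underbrace{\bigl|\widehat{\mathrm{BER}} - \mathbb{E}[\mathrm{BER}]\bigr|}_{\text{generalization gap}}
  \;\lesssim\;
  \frac{d_v \, T \, \sqrt{n}}{\sqrt{m}}
\]
```

Each factor mirrors a dependence stated in the results below: inverse square root of the dataset size, linear in the variable-node degree and iterations, and square root of the blocklength.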
Abstract
The paper investigates the generalization capabilities of neural belief propagation (NBP) decoders, which are a class of deep learning-based decoders that unfold the belief propagation (BP) algorithm into a neural network architecture. The key contributions are:
The authors derive a general upper bound on the generalization gap of a deep learning decoder as a function of its bit-wise Rademacher complexity.
For the specific case of NBP decoders, the authors upper bound the bit-wise Rademacher complexity in terms of the covering number of the class of NBP decoders, i.e., the minimal cardinality of a set of decoders such that every decoder in the class is closely approximated by some member of the set. The resulting bound depends linearly on the spectral norms of the weight matrices and polynomially on the number of decoding iterations, and is tighter than bounds obtained via VC-dimension or PAC-Bayes approaches.
The authors derive upper bounds on the covering number of NBP decoders for both regular and irregular parity-check matrices. These bounds show that the generalization gap scales with the inverse square root of the training dataset size, linearly with the variable-node degree and the number of decoding iterations, and with the square root of the blocklength.
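As background for these contributions, an NBP decoder unfolds each BP iteration into a neural-network layer whose messages are scaled by trainable weights. The following is a minimal sketch of one such layer using min-sum check updates; the names (nbp_layer, w_edge, w_ch) and the exact weighting scheme are illustrative, not taken from the paper.

```python
import numpy as np

def nbp_layer(H, llr, v2c, w_edge, w_ch):
    """One unfolded BP iteration: min-sum check update, then a
    variable update with trainable weights (illustrative sketch)."""
    m, n = H.shape
    c2v = np.zeros_like(v2c)
    # Check-node update: each outgoing message combines the signs and
    # the minimum magnitude of the *other* incoming messages.
    for i in range(m):
        idx = np.flatnonzero(H[i])
        for j in idx:
            others = [v2c[i, k] for k in idx if k != j]
            sign = np.prod(np.sign(others))
            c2v[i, j] = sign * min(abs(x) for x in others)
    # Variable-node update: weighted channel LLR plus weighted extrinsic
    # messages -- these weights are what training adjusts relative to BP.
    new_v2c = np.zeros_like(v2c)
    for j in range(n):
        idx = np.flatnonzero(H[:, j])
        for i in idx:
            extrinsic = sum(w_edge[k, j] * c2v[k, j] for k in idx if k != i)
            new_v2c[i, j] = w_ch[j] * llr[j] + extrinsic
    return new_v2c, c2v
```

With all weights set to one, the layer reduces to plain min-sum BP; stacking T such layers yields the T-iteration decoder whose complexity the bounds above control.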
Experimental results are presented to validate the theoretical findings, demonstrating how the generalization gap depends on the number of decoding iterations and the training dataset size for Tanner codes, and on the blocklength for punctured QC-LDPC codes.
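The complexity measure driving these bounds is the spectral norm (largest singular value) of each layer's weight matrix, entering linearly per layer. A natural proxy for a T-layer unfolded decoder is therefore the product of the per-layer spectral norms; the function name below is illustrative.

```python
import numpy as np

def spectral_complexity(weight_matrices):
    """Product of spectral norms (largest singular values) over the
    unfolded layers -- a simple proxy for the decoder complexity term."""
    return float(np.prod([np.linalg.norm(W, ord=2) for W in weight_matrices]))
```

Because this product multiplies one norm per unfolded iteration, it grows with T, consistent with the polynomial dependence on decoding iterations noted above.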
Stats
The paper does not provide any specific numerical data or statistics. It focuses on deriving theoretical bounds on the generalization gap of NBP decoders.
Quotes
The paper does not contain any striking quotes that support the key arguments.