The paper proposes a novel family of Bellman mappings (B-Maps) defined in reproducing kernel Hilbert spaces (RKHSs), exploiting the rich approximation properties of RKHSs and the flexibility that an RKHS inner product brings to the design of loss functions and constraints. The proposed B-Maps offer ample degrees of freedom, and several popular B-Map designs are shown to arise as special cases under appropriate choices of their free parameters.
The key highlights and insights are:
The proposed B-Maps are nonparametric: they require no statistical priors or assumptions on the data, which reduces the bias imposed on data modeling. To address the "curse of dimensionality," a dimensionality-reduction strategy based on random Fourier features is offered.
The B-Maps allow for on-the-fly sampling, require no knowledge of the transition probabilities of the Markov decision process, and involve computationally lightweight operations that suit the online, time-adaptive learning required by the adaptive filtering problem.
For the first time in the literature, the paper offers an RL-based solution to the problem of countering outliers in adaptive filtering. The proposed solution, built on a continuous state space and a discrete action space, adopts the well-known policy-iteration strategy and defines a quadratic loss on the Q-functions via the proposed B-Maps.
Theoretical properties of the proposed B-Maps, such as Lipschitz continuity and consistency of their fixed points, are established. A performance analysis of the proposed RL algorithm is provided, and numerical tests on synthetic data demonstrate its superior performance over several RL and non-RL schemes.
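The random-Fourier-feature idea mentioned above can be illustrated with a minimal sketch. It assumes a Gaussian kernel and uses NumPy; the function names (`make_rff`, `gaussian_kernel`), the bandwidth, and the feature count are hypothetical choices for illustration and are not taken from the paper. The sketch only shows how an explicit finite-dimensional feature map can stand in for kernel evaluations, not how the paper builds its B-Maps.

```python
import numpy as np

def make_rff(input_dim, num_features, bandwidth, seed=0):
    """Return a random Fourier feature map approximating a Gaussian kernel.

    Assumed kernel: k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)).
    """
    rng = np.random.default_rng(seed)
    # Frequencies are drawn from the kernel's spectral density, N(0, I / bandwidth^2).
    W = rng.normal(scale=1.0 / bandwidth, size=(input_dim, num_features))
    # Random phases, uniform on [0, 2*pi).
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)

    def feature_map(X):
        X = np.atleast_2d(X)
        return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)

    return feature_map

def gaussian_kernel(X, Y, bandwidth):
    """Exact Gaussian kernel matrix, used here only as a reference."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

# Compare the exact kernel matrix with its feature-space approximation.
rng = np.random.default_rng(1)
X = rng.normal(size=(5, 3))          # five hypothetical 3-dimensional states
phi = make_rff(input_dim=3, num_features=20000, bandwidth=1.5)
features = phi(X)                    # shape (5, 20000)
K_exact = gaussian_kernel(X, X, 1.5)
K_approx = features @ features.T     # inner products of the random features
max_err = np.max(np.abs(K_exact - K_approx))
```

With the feature map in hand, a function in the RKHS can be approximated by a fixed-length weight vector acting on `phi(x)`, so storage and per-sample cost no longer grow with the number of observed samples. The approximation error shrinks as `num_features` grows.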
Key Insights Distilled From
by Yuki Akiyama... at arxiv.org 04-01-2024
https://arxiv.org/pdf/2403.20020.pdf