The paper examines the problem of accurately estimating the memory capacity (MC) of linear echo state networks (LESNs), a class of recurrent neural networks. The authors show that numerical evaluations of MC reported in the literature often contradict the theoretical upper bound MC ≤ N, where N is the dimension of the state space.
The authors first provide background on the definition of MC and its relation to the Kalman controllability matrix. They then demonstrate that linear models generically have maximal memory capacity, i.e., MC = N, as long as the reservoir matrix A and input mask C satisfy certain algebraic conditions.
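For reference, these objects can be written compactly. The following is a minimal restatement under the standard assumptions (i.i.d. scalar inputs z_t with unit variance); the notation is ours and follows the usual linear-ESN conventions rather than necessarily the paper's exact statement:

```latex
% Linear ESN state recursion, with reservoir A (N x N) and input mask C (N x 1):
x_t = A\, x_{t-1} + C\, z_t .
% For i.i.d. unit-variance inputs, the tau-lag capacity and total capacity are
MC_\tau = (A^\tau C)^\top \Gamma^{-1} (A^\tau C),
\qquad
\Gamma = \sum_{j=0}^{\infty} A^j C C^\top (A^j)^\top,
\qquad
MC = \sum_{\tau=0}^{\infty} MC_\tau \le N ,
% and MC = N exactly when the Kalman controllability matrix
R(A, C) = \big[\, C \mid A C \mid A^2 C \mid \cdots \mid A^{N-1} C \,\big]
% has full rank N.
```

Summing MC_τ over all lags and using the definition of Γ gives MC = tr(Γ⁻¹Γ) = N whenever Γ is invertible, which is exactly the full-rank condition on R(A, C).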
However, the paper identifies two main issues with standard numerical approaches for estimating MC:
1. Monte Carlo estimation: the sample-based estimator of MC is positively biased, especially at large lags τ. The bias stems from the ill-conditioning of the empirical covariance matrices involved in the per-lag regressions; it accumulates across lags and yields estimates that overestimate the true MC.
2. Naive algebraic estimation: direct algebraic computation of MC from the Kalman controllability matrix also suffers from numerical instabilities: the Krylov columns A^k C rapidly become numerically linearly dependent, so the computed rank, and with it the MC estimate, falls below the true value. Both failure modes are illustrated in the sketch after this list.
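Below is a minimal sketch of both naive estimators on a toy example. The dimensions, random reservoir, lag counts, and sample sizes are our own choices for illustration, not the paper's experimental setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear ESN  x_t = A x_{t-1} + C z_t : random stable reservoir A
# (rescaled to spectral radius 0.9) and random input mask C.
N = 50
A = rng.standard_normal((N, N))
A *= 0.9 / np.max(np.abs(np.linalg.eigvals(A)))
C = rng.standard_normal(N)

def mc_monte_carlo(A, C, n_lags, T=10_000, burn_in=500):
    """Naive sample-based MC: sum over lags tau of the R^2 obtained by
    regressing the delayed input z_{t-tau} on the state x_t.  Each lag
    contributes a small positive bias (roughly N/T), so the sum drifts
    upward as n_lags grows."""
    z = rng.standard_normal(T + burn_in)
    x = np.zeros((T + burn_in, A.shape[0]))
    for t in range(1, T + burn_in):
        x[t] = A @ x[t - 1] + C * z[t]
    x, z = x[burn_in:], z[burn_in:]
    mc = 0.0
    for tau in range(n_lags):
        X, y = x[tau:], z[: len(z) - tau]
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        mc += np.corrcoef(X @ beta, y)[0, 1] ** 2  # squared correlation
    return mc

def mc_naive_algebraic(A, C):
    """Naive algebraic MC: numerical rank of the explicitly formed Kalman
    controllability matrix [C, AC, ..., A^{N-1} C].  Its Krylov columns
    align with the dominant eigendirection, so the computed rank (and
    hence the MC estimate) typically falls below the true value N."""
    cols = [C]
    for _ in range(A.shape[0] - 1):
        cols.append(A @ cols[-1])
    return np.linalg.matrix_rank(np.column_stack(cols))

print("Monte Carlo estimate:", mc_monte_carlo(A, C, n_lags=3 * N))  # tends to exceed N
print("Naive algebraic estimate:", mc_naive_algebraic(A, C))        # tends to fall below N
```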
To address these challenges, the authors propose robust numerical methods that exploit the Krylov structure of the controllability matrix and the neutrality of MC to the choice of input mask. These techniques, the orthogonalized subspace method and the averaged orthogonalized subspace method, are shown to accurately recover the theoretical value MC = N for linear echo state networks.
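The paper's algorithms are more elaborate, but a minimal sketch of the underlying idea, as we read it, is to grow an orthonormal basis of the Krylov subspace one direction at a time (an Arnoldi-style recursion, so each new direction is measured against an orthonormal basis rather than against nearly collinear columns) and, for the averaged variant, to exploit mask neutrality by averaging the estimate over random input masks. All function names, tolerances, and the toy setup below are our own, not the paper's:

```python
import numpy as np

def mc_orthogonalized_subspace(A, C, tol=1e-10):
    """Sketch of an orthogonalized-subspace MC estimate: grow an orthonormal
    basis Q of span{C, AC, A^2 C, ...}, orthogonalizing each candidate
    direction before measuring its norm, and count the directions that
    survive.  The count is the Krylov-space dimension, i.e. the rank of the
    controllability matrix."""
    N = A.shape[0]
    Q = np.empty((N, 0))
    v = C / np.linalg.norm(C)
    mc = 0
    for _ in range(N):
        v = v - Q @ (Q.T @ v)   # project out the current basis
        v = v - Q @ (Q.T @ v)   # second pass for numerical safety
        nrm = np.linalg.norm(v)
        if nrm < tol:
            break               # Krylov space has saturated: rank found
        v = v / nrm
        Q = np.column_stack([Q, v])
        mc += 1
        v = A @ v               # next Krylov direction
    return mc

def mc_averaged(A, n_masks=25, seed=1):
    """Averaged variant: since MC is neutral to the input mask, average the
    orthogonalized estimate over several random masks C."""
    rng = np.random.default_rng(seed)
    N = A.shape[0]
    return float(np.mean([mc_orthogonalized_subspace(A, rng.standard_normal(N))
                          for _ in range(n_masks)]))

# Reusing a toy reservoir like the one in the previous sketch:
rng = np.random.default_rng(0)
N = 50
A = rng.standard_normal((N, N))
A *= 0.9 / np.max(np.abs(np.linalg.eigvals(A)))
print(mc_orthogonalized_subspace(A, rng.standard_normal(N)))  # recovers N
print(mc_averaged(A))                                          # recovers N
```

The key point of the recursion is that each candidate direction is A applied to a unit vector, so its component orthogonal to the current basis stays well scaled; this is what restores reliable numerical rank detection compared with forming the monomial columns A^k C explicitly.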
The paper concludes that many previous efforts to optimize the memory capacity of linear recurrent networks were afflicted by these numerical pathologies and therefore reported misleading results.