Key Idea
The quadratic prediction error method, also known as nonlinear least squares, can achieve optimal non-asymptotic rates of convergence for a wide range of time-varying parametric predictor models satisfying certain identifiability conditions.
Abstract
The paper studies the quadratic prediction error method for a class of time-varying parametric predictor models satisfying an identifiability condition. While this method is known to asymptotically achieve optimal rates for a wide range of problems, there have been no non-asymptotic results matching these optimal rates outside of a select few, typically linear, model classes.
The key highlights and insights are:
- The authors provide the first rate-optimal non-asymptotic analysis of the quadratic prediction error method in a general setting of nonlinearly parametrized model classes.
- They show that their results apply to a particular class of identifiable AutoRegressive Moving Average (ARMA) models, yielding the first optimal non-asymptotic rates for identification of ARMA models.
- The authors leverage modern tools from learning with dependent data, such as the martingale offset complexity, to derive their non-asymptotic bounds.
- The non-asymptotic rates match known asymptotics up to constant factors and higher-order terms, with the leading term decaying at the optimal rate O(d_θ σ²/T), where d_θ is the parameter dimension and σ² is the noise variance.
- The burn-in time required for the optimal rates grows polynomially in various problem parameters, including the parameter dimension, the noise bound, and the degree of dependence of the input process.
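To give a concrete feel for the quadratic prediction error method, the sketch below fits the simplest possible parametric predictor, an AR(1) model, by minimizing the average squared one-step prediction error. This is a toy illustration only: the model, parameter values, and data are invented here, and the paper's analysis covers a much broader class of nonlinearly parametrized predictors.

```python
import numpy as np

# Toy illustration (not the paper's algorithm): identify an AR(1) model
# y_t = a* y_{t-1} + w_t by quadratic prediction error minimization.
# The predictor class is ŷ_t(θ) = θ y_{t-1}, and θ̂ minimizes the
# average squared one-step prediction error over T samples.

rng = np.random.default_rng(0)
a_star, sigma, T = 0.7, 0.5, 5000  # hypothetical true parameter and noise level

y = np.zeros(T)
for t in range(1, T):
    y[t] = a_star * y[t - 1] + sigma * rng.standard_normal()

def pred_error(theta, y):
    """Average quadratic one-step prediction error L_T(θ)."""
    resid = y[1:] - theta * y[:-1]
    return np.mean(resid ** 2)

# Because this predictor is linear in θ, the minimizer has a closed
# form (ordinary least squares); a general nonlinear predictor class
# would require a numerical optimizer instead.
theta_hat = np.dot(y[:-1], y[1:]) / np.dot(y[:-1], y[:-1])

print(f"theta_hat = {theta_hat:.3f}")  # close to a_star = 0.7
```

With T = 5000 samples the estimate lands within a few hundredths of the true parameter, consistent with a leading error term shrinking like O(d_θ σ²/T).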
Statistics
The paper does not contain any explicit numerical data or statistics. The analysis is focused on deriving theoretical non-asymptotic bounds for the quadratic prediction error method.
Quotes
"While the asymptotic rates of prediction error methods are by now well understood—including optimal rates of convergence [1] as characterized by the Cramér-Rao Inequality—relatively less is known about their non-asymptotic counterparts."
"To provide some intuition, k above can be thought of as an analogue to the inverse stability margin of a linear system, and in fact, the blocking technique cannot be applied to marginally stable linear autoregressions."