Bridging the Gaps: Learning Verifiable Model-Free Quadratic Programming Controllers Inspired by Model Predictive Control
This paper proposes a new class of parameterized controllers that resemble a Quadratic Programming (QP) solver for a linear Model Predictive Control (MPC) problem, with the controller parameters trained via Deep Reinforcement Learning (DRL) rather than derived from system models. This approach addresses the limited verifiability and performance guarantees of common DRL controllers built on Multi-Layer Perceptrons (MLPs) or other general-purpose neural network architectures, while empirically matching MPC and MLP controllers in control performance and exhibiting superior robustness and computational efficiency.
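To make the idea of a "QP-solver-like" parameterized controller concrete, the following is a minimal sketch, not the paper's exact architecture: a policy whose forward pass unrolls a fixed number of projected-gradient steps on a box-constrained QP resembling the condensed QP of linear MPC, with the QP data (cost matrices, bounds, step size) treated as learnable parameters to be fit by DRL. The class name, parameterization, and dimensions are illustrative assumptions.

import torch
import torch.nn as nn

class UnrolledQPController(nn.Module):
    # Hypothetical sketch: unrolls projected-gradient steps on
    #   min_u 0.5 u^T H u + (F x)^T u   s.t.  u_lo <= u <= u_hi,
    # mirroring a condensed linear-MPC QP. H, F, bounds, and step size
    # are trainable parameters instead of model-derived quantities.
    def __init__(self, state_dim: int, horizon_inputs: int, iters: int = 10):
        super().__init__()
        self.iters = iters
        self.L = nn.Parameter(torch.eye(horizon_inputs))        # H = L L^T + eps*I stays PSD
        self.F = nn.Parameter(torch.zeros(horizon_inputs, state_dim))
        self.u_lo = nn.Parameter(-torch.ones(horizon_inputs))   # learnable box constraints
        self.u_hi = nn.Parameter(torch.ones(horizon_inputs))
        self.log_alpha = nn.Parameter(torch.tensor(-1.0))       # positive step size

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, state_dim) -> control sequence over the horizon, (batch, horizon_inputs)
        H = self.L @ self.L.T + 1e-3 * torch.eye(self.L.shape[0])
        q = x @ self.F.T                                         # state-dependent linear term F x
        alpha = self.log_alpha.exp()
        u = torch.zeros_like(q)
        for _ in range(self.iters):
            grad = u @ H.T + q                                   # gradient of the QP objective
            u = torch.clamp(u - alpha * grad, self.u_lo, self.u_hi)  # projected-gradient step
        return u

# Usage sketch: the controller serves as the DRL policy; the first input of the
# returned sequence would be applied in receding-horizon fashion.
policy = UnrolledQPController(state_dim=4, horizon_inputs=3, iters=10)
u_seq = policy(torch.randn(8, 4))                                # shape (8, 3)

Because the forward pass has a fixed, simple algebraic structure (a finite number of projected-gradient iterations on a box-constrained QP), such a controller is more amenable to verification than a general MLP policy, which is the motivation highlighted in the abstract.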