Key Concepts
This paper proposes a new class of parameterized controllers structured as the Quadratic Programming (QP) solver of a linear Model Predictive Control (MPC) problem, with the controller's parameters trained via Deep Reinforcement Learning (DRL) rather than derived from system models. This design addresses the verifiability and performance-guarantee limitations of the Multi-Layer Perceptron (MLP) and other general neural network architectures commonly used in DRL, while empirically matching MPC and MLP controllers in control performance and exhibiting superior robustness and computational efficiency.
Abstract
The paper introduces a new class of parameterized controllers that draw inspiration from Model Predictive Control (MPC). The key idea is to design the controller to resemble a Quadratic Programming (QP) solver of a linear MPC problem, but with the parameters of the controller being trained via Deep Reinforcement Learning (DRL) rather than derived from system models.
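To make the idea concrete, here is a minimal sketch of such a QP-structured control policy. The parameters `H` and `F` stand in for the learned quantities (in the paper they would be trained via DRL; here they are hypothetical placeholders), and for simplicity only box constraints on the input are handled, so the QP can be solved by projected gradient descent with plain NumPy. This is an illustration of the structure, not the paper's actual implementation.

```python
import numpy as np

def qp_controller(x, H, F, u_lo, u_hi, iters=200, lr=0.1):
    """At each time step, compute the control input by solving a small QP:

        u* = argmin_u  0.5 * u^T H u + (F x)^T u   s.t.  u_lo <= u <= u_hi

    H (positive definite) and F play the role of the learned parameters.
    With only box constraints, projected gradient descent suffices.
    """
    q = F @ x                      # linear term of the QP depends on the state
    u = np.zeros_like(q)
    for _ in range(iters):
        grad = H @ u + q           # gradient of the quadratic objective
        u = np.clip(u - lr * grad, u_lo, u_hi)  # gradient step + projection
    return u

# Hypothetical learned parameters for a 2-state, 2-input system.
H = np.eye(2)
F = np.eye(2)
x = np.array([2.0, -0.5])
u = qp_controller(x, H, F, u_lo=-1.0, u_hi=1.0)
```

With `H = F = I`, the unconstrained minimizer is `u = -x`, clipped to the box, so the call above returns approximately `[-1.0, 0.5]`. In the paper's setting, DRL training shapes these QP parameters instead of deriving them from a system model.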
The proposed approach addresses the verifiability and performance-guarantee limitations of common DRL controllers built on Multi-Layer Perceptron (MLP) or other general neural network architectures. The learned controllers possess verifiable properties, such as persistent feasibility and asymptotic stability, akin to MPC.
Numerical examples illustrate that the proposed controller empirically matches MPC and MLP controllers in terms of control performance, and has superior robustness against modeling uncertainty and noise. Furthermore, the proposed controller is significantly more computationally efficient compared to MPC and requires fewer parameters to learn than MLP controllers.
Real-world experiments on a vehicle drift maneuvering task demonstrate the potential of these controllers for robotics and other demanding control tasks, despite the high nonlinearity of the system.
The paper also provides theoretical analysis to establish performance guarantees of the learned QP controller, including sufficient conditions for persistent feasibility and asymptotic stability of the closed-loop system.
Statistics
The paper does not contain any explicit numerical data or statistics. The key results are presented through qualitative comparisons and empirical evaluations on benchmark systems and a real-world robotic task.
Quotes
"Leveraging the fact that linear MPC solves a Quadratic Programming (QP) problem at each time step, we consider a parameterized class of controllers with QP structure similar to MPC."
"In contrast to most DRL-trained controllers, which often lack rigorous theoretical guarantees, our MPC-inspired controller is proven to enjoy verifiable properties like persistent feasibility and asymptotic stability."
"Lastly, though we only provide theoretical guarantees for controlling a linear system, the generalizability of the proposed controller is empirically demonstrated via vehicle drift maneuvering, a challenging nonlinear robotics control task, indicating potential applications of our controller to real-world nonlinear robotic systems."