The paper presents a method to train neural network controllers with guaranteed stability margins, specifically the disk margin. The key points are:
The neural network controller is modeled as the interconnection of an LTI system and activation functions, a representation that captures a wide range of neural network architectures.
The disk margin is used to characterize the robustness of the closed-loop system to simultaneous gain and phase variations. This is more general than the classical gain and phase margins, which consider each variation separately.
Quadratic constraints are used to describe the uncertainty in the plant and the nonlinearity in the neural network activation functions.
A stability condition is derived in the form of a linear matrix inequality that certifies the disk margin for a given neural network controller.
An algorithm is presented that alternates between a reinforcement learning step, which improves the controller's performance, and a stability-margin-enforcing step, which projects the controller onto the set of controllers that satisfy the desired disk margin.
The method is demonstrated on a flexible-rod-on-a-cart example, where the neural network controller trained with the proposed approach achieves significantly higher reward than an LTI controller while guaranteeing the same stability margins.
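To make two of the points above concrete, here is a minimal Python sketch. It is not taken from the paper. It assumes the balanced (zero-skew) disk-margin parametrization common in the disk-margin literature, under which a disk margin alpha guarantees gain variations in [(2 - alpha)/(2 + alpha), (2 + alpha)/(2 - alpha)] and phase variations up to 2*arctan(alpha/2); the paper may use a different parametrization. It also assumes that the quadratic constraint on an activation function is the standard sector bound, checked here numerically on sample points. The function names disk_to_classical_margins and satisfies_sector are hypothetical.

```python
import numpy as np


def disk_to_classical_margins(alpha):
    """Gain and phase margins implied by a balanced disk margin alpha.

    Assumption: zero-skew parametrization, valid for 0 < alpha < 2.
    Returns (lowest gain factor, highest gain factor, phase margin in degrees).
    """
    if not 0.0 < alpha < 2.0:
        raise ValueError("alpha must lie in (0, 2) for a bounded gain interval")
    gain_low = (2.0 - alpha) / (2.0 + alpha)
    gain_high = (2.0 + alpha) / (2.0 - alpha)
    phase_deg = float(np.degrees(2.0 * np.arctan(alpha / 2.0)))
    return gain_low, gain_high, phase_deg


def satisfies_sector(phi, lo=0.0, hi=1.0, v=None):
    """Numerically check the sector constraint on a grid of inputs.

    The quadratic constraint is (phi(v) - lo*v) * (hi*v - phi(v)) >= 0.
    This is a sampled check, not a proof.
    """
    if v is None:
        v = np.linspace(-5.0, 5.0, 1001)
    w = phi(v)
    return bool(np.all((w - lo * v) * (hi * v - w) >= -1e-12))


if __name__ == "__main__":
    print(disk_to_classical_margins(1.0))
    print(satisfies_sector(np.tanh))
    print(satisfies_sector(lambda v: np.maximum(v, 0.0)))
```

Under these assumptions, tanh and ReLU both lie in the sector [0, 1], which is what allows a single quadratic constraint to stand in for the activation in a stability condition of LMI form.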
Source: arxiv.org