Offline Supervised Learning and Online Direct Policy Optimization for Efficient Neural Network-Based Optimal Feedback Control
Offline supervised learning and online direct policy optimization are two prevalent approaches for training neural network-based optimal feedback controllers. Offline supervised learning leverages pre-computed open-loop optimal control solutions as training data, while online direct policy optimization transforms the optimal control problem into a direct optimization over the policy parameters. This work conducts a comparative study of the two approaches and proposes a unified training paradigm, Pre-Train and Fine-Tune, that combines their strengths to significantly enhance the performance and robustness of the learned controller.
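The two-stage paradigm can be illustrated on a toy scalar linear-quadratic problem. The sketch below is illustrative only and not the paper's implementation: the dynamics, cost, linear policy class, and the fabricated "pre-computed" dataset are all assumptions chosen so the example stays self-contained. Stage 1 fits the policy to (state, control) pairs as an open-loop solver might supply; Stage 2 fine-tunes the same parameter by descending the closed-loop rollout cost directly.

```python
import numpy as np

# Toy problem (assumption): scalar dynamics x_{t+1} = x_t + u_t with
# quadratic cost sum_t (x_t^2 + u_t^2), and a linear policy u = -w * x.

def rollout_cost(w, x0=1.0, T=20):
    """Closed-loop cost of the linear feedback u = -w * x from x0."""
    x, cost = x0, 0.0
    for _ in range(T):
        u = -w * x
        cost += x**2 + u**2
        x = x + u
    return cost

# --- Stage 1: offline supervised pre-training ---
# Pretend an open-loop solver produced optimal (state, control) pairs;
# here they are fabricated from the known scalar LQR gain ~0.618.
w_star = 0.618
states = np.linspace(-2.0, 2.0, 50)
controls = -w_star * states
# Least-squares fit of the linear policy to the pre-computed data.
w = -np.sum(states * controls) / np.sum(states**2)

# --- Stage 2: online direct policy optimization (fine-tuning) ---
# Gradient descent on the closed-loop cost, with the gradient
# estimated by central finite differences for simplicity.
lr, eps = 1e-3, 1e-5
for _ in range(200):
    grad = (rollout_cost(w + eps) - rollout_cost(w - eps)) / (2 * eps)
    w -= lr * grad

print(round(w, 3))  # → 0.618
```

In this toy setting the supervised stage already lands near the optimum, so fine-tuning barely moves the parameter; the paradigm matters in harder problems, where pre-training gives direct policy optimization a good initialization and fine-tuning corrects the distribution mismatch between open-loop training data and closed-loop behavior.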