Core Concepts
A control-theoretic approach to efficiently adapt a trained model to new data without forgetting the previously learned samples.
Abstract
The paper introduces a novel control-theoretic approach to fine-tuning and transfer learning for supervised learning tasks. The key contributions are:
Formulation of the fine-tuning problem as a control problem: steering an ensemble of points (the training samples) to their corresponding labels. This framing makes concepts and techniques from control theory directly applicable.
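This formulation can be illustrated with a small toy sketch (the dynamics, parameterization, and training loop here are our own illustrative choices, not the paper's actual system): shared time-varying controls (A_k, b_k) push every sample through the same discretized flow, and supervised learning amounts to choosing controls whose terminal states match the labels.

```python
import numpy as np

# Toy sketch of fine-tuning as ensemble control (illustrative dynamics,
# not the paper's exact system): shared time-varying controls (A_k, b_k)
# steer every sample z through the same discretized flow
#   z <- z + dt * tanh(A_k z + b_k),
# and training picks controls whose terminal states match the labels.
rng = np.random.default_rng(0)
d, K, dt = 2, 5, 0.2

def flow(A, b, X):
    Z = X.copy()
    for k in range(K):
        Z = Z + dt * np.tanh(Z @ A[k].T + b[k])
    return Z

def loss(A, b, X, Y):
    return 0.5 * np.mean(np.sum((flow(A, b, X) - Y) ** 2, axis=1))

def fd_grad(A, b, X, Y, eps=1e-5):
    # Finite-difference gradient over all control parameters
    # (kept simple on purpose; any autodiff library would do).
    grads = []
    for p in (A, b):
        g = np.zeros_like(p)
        it = np.nditer(p, flags=["multi_index"])
        for _ in it:
            i = it.multi_index
            p[i] += eps; up = loss(A, b, X, Y)
            p[i] -= 2 * eps; dn = loss(A, b, X, Y)
            p[i] += eps
            g[i] = (up - dn) / (2 * eps)
        grads.append(g)
    return grads

X = rng.normal(size=(8, d))   # ensemble of training points
Y = rng.normal(size=(8, d))   # their labels (targets to steer to)
A, b = np.zeros((K, d, d)), np.zeros((K, d))

loss_before = loss(A, b, X, Y)
for _ in range(200):
    gA, gb = fd_grad(A, b, X, Y)
    A -= 0.2 * gA
    b -= 0.2 * gb
loss_after = loss(A, b, X, Y)
```

The key point of the sketch is that a single control signal is shared by the whole ensemble: no per-sample parameters exist, so fitting the labels is genuinely a steering problem for the population of points.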
Development of an iterative algorithm, "Tuning without Forgetting," that efficiently adapts a trained control function (the model) to an expanded training set, retaining performance on the previously learned samples while fitting the new ones.
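One way such an update can be realized, sketched here under the assumption of a projected-gradient scheme (the paper's exact algorithm may differ), is to restrict the parameter step taken for the new samples to the null space of the Jacobian of the old samples' end-point map, so that the old mapping is unchanged to first order:

```python
import numpy as np

# Hedged sketch of a "tuning without forgetting" style update via
# projected gradients (the paper's exact algorithm may differ).
# Control parameters theta drive a discretized flow; the step that fits
# the new samples is projected onto the null space of the old samples'
# end-point Jacobian, preserving the old mapping to first order.
rng = np.random.default_rng(1)
d, K, dt = 2, 4, 0.25
P = K * (d * d + d)                    # total number of control parameters

def flow(theta, X):
    A = theta[:K * d * d].reshape(K, d, d)
    b = theta[K * d * d:].reshape(K, d)
    Z = X.copy()
    for k in range(K):
        Z = Z + dt * np.tanh(Z @ A[k].T + b[k])
    return Z

def end_jac(theta, X, eps=1e-5):
    # Jacobian of the flattened end-points of X w.r.t. theta
    # (central finite differences, for simplicity).
    J = np.zeros((X.size, P))
    for j in range(P):
        e = np.zeros(P); e[j] = eps
        J[:, j] = (flow(theta + e, X) - flow(theta - e, X)).ravel() / (2 * eps)
    return J

theta = rng.normal(scale=0.3, size=P)  # previously trained controls
X_old = rng.normal(size=(4, d))        # already-learned samples
X_new = rng.normal(size=(2, d))        # newly added samples
Y_new = rng.normal(size=(2, d))        # their labels

old_out = flow(theta, X_old)

# Basis of the null space of the old samples' end-point Jacobian.
J_old = end_jac(theta, X_old)
_, s, Vt = np.linalg.svd(J_old)
null_basis = Vt[(s > 1e-8).sum():]

# Gradient of the new-sample loss, with and without the projection.
g = end_jac(theta, X_new).T @ (flow(theta, X_new) - Y_new).ravel()
g_proj = null_basis.T @ (null_basis @ g)

# Take one step of equal length in each direction and compare how much
# the old samples' outputs drift.
step = 0.2 * g / np.linalg.norm(g)
step_proj = 0.2 * g_proj / np.linalg.norm(g_proj)
drift_plain = np.abs(flow(theta - step, X_old) - old_out).max()
drift_proj = np.abs(flow(theta - step_proj, X_old) - old_out).max()
```

The projected step leaves the old end-points with only second-order drift, whereas an unprojected step of the same length moves them at first order.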
Theoretical analysis showing that the proposed algorithm satisfies the "tuning without forgetting" property up to first order: the update steers the new samples to their targets while, to first order, leaving the mapping of the previously learned samples unchanged.
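In notation of our own choosing (the paper's symbols may differ), write the trained model as an end-point map φ(x; θ) induced by control parameters θ. The first-order property then says that the update direction δθ chosen for the new samples must not perturb the previously learned samples x_i to first order:

```latex
% First-order "tuning without forgetting" condition (notation ours):
% the update direction \delta\theta leaves the end-point map of each
% previously learned sample x_i unchanged to first order, i.e.
\frac{\mathrm{d}}{\mathrm{d}\varepsilon}\,
  \varphi\!\left(x_i;\,\theta + \varepsilon\,\delta\theta\right)
  \Big|_{\varepsilon=0}
  \;=\; D_\theta \varphi(x_i;\theta)\,\delta\theta \;=\; 0,
\qquad i = 1,\dots,m.
```

Equivalently, δθ lies in the kernel of the stacked Jacobian of the previously learned samples' end-point maps, which is exactly what makes their mapping invariant to first order.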
Comparison with the existing "M-folded" method, whose cost scales quadratically with the training-set size; the proposed approach scales linearly, making it more scalable.
Numerical experiments demonstrating the effectiveness of the control-theoretic fine-tuning approach compared to a penalty-based fine-tuning method from the literature.
The paper presents a novel control-theoretic perspective on the important problems of fine-tuning and transfer learning, offering an efficient and scalable solution with theoretical guarantees.