Semi-Supervised Fine-Tuning of Vision Foundation Models to Improve Performance on Downstream Tasks with Limited Labeled Data
This work presents a semi-supervised fine-tuning approach that uses content-style decomposition within an information-theoretic framework to refine the latent representations of pre-trained vision foundation models, aligning them more closely with task-specific objectives and mitigating distribution shift.
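
The summary above does not spell out the objective, so the following is only an illustrative sketch of how a content-style split with an information-theoretic term could be combined with a small labeled set. Everything in it is an assumption rather than the described method: the two linear heads (`w_content`, `w_style`), the classifier `w_cls`, the weights `lam` and `mu`, the use of the standard information-maximization estimate H(mean p) - mean H(p) on unlabeled data, and the cross-covariance penalty as a crude stand-in for content-style independence. Random arrays stand in for frozen backbone features.

```python
import numpy as np

def softmax(logits):
    # Numerically stable row-wise softmax.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def decompose(features, w_content, w_style):
    # Hypothetical decomposition: two linear heads on top of frozen features.
    return features @ w_content, features @ w_style

def semi_supervised_loss(feat_lab, labels, feat_unlab,
                         w_content, w_style, w_cls, lam=0.1, mu=0.1):
    eps = 1e-12
    content_lab, _ = decompose(feat_lab, w_content, w_style)
    content_unlab, style_unlab = decompose(feat_unlab, w_content, w_style)

    # Supervised term: cross-entropy on the labeled content codes.
    p_lab = softmax(content_lab @ w_cls)
    ce = -np.mean(np.log(p_lab[np.arange(len(labels)), labels] + eps))

    # Unsupervised term (assumed): mutual information between inputs and
    # predictions, estimated as H(mean prediction) - mean H(prediction).
    p_unlab = softmax(content_unlab @ w_cls)
    marginal = p_unlab.mean(axis=0)
    h_marginal = -np.sum(marginal * np.log(marginal + eps))
    h_conditional = -np.mean(np.sum(p_unlab * np.log(p_unlab + eps), axis=1))
    mi = h_marginal - h_conditional

    # Independence proxy (assumed): squared cross-covariance between
    # centered content and style codes.
    c = content_unlab - content_unlab.mean(axis=0)
    s = style_unlab - style_unlab.mean(axis=0)
    cross_cov = c.T @ s / len(c)
    dec = np.sum(cross_cov ** 2)

    return ce - lam * mi + mu * dec, ce, mi, dec

# Toy usage with random stand-ins for backbone features.
rng = np.random.default_rng(0)
feat_lab = rng.normal(size=(10, 16))      # few labeled samples
labels = rng.integers(0, 3, size=10)      # 3 hypothetical classes
feat_unlab = rng.normal(size=(40, 16))    # more unlabeled samples
w_content = rng.normal(size=(16, 8)) * 0.1
w_style = rng.normal(size=(16, 8)) * 0.1
w_cls = rng.normal(size=(8, 3)) * 0.1

loss, ce, mi, dec = semi_supervised_loss(
    feat_lab, labels, feat_unlab, w_content, w_style, w_cls)
print(f"loss={loss:.4f} ce={ce:.4f} mi={mi:.4f} dec={dec:.4f}")
```

In an actual fine-tuning setup the heads would be trained by gradient descent in an autodiff framework; the NumPy version only makes the structure of such a loss concrete.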