Supervised fine-tuning after pretraining can effectively enhance the generalization of vision foundation models.