Prompt tuning can lead to misalignment in vision-language models, but feature shift consistency can help maintain alignment and improve generalization.