Pre-trained Vision-Language Models Encode Discoverable Visual Concepts
Pre-trained vision-language models can learn and encode a diverse set of generic and visually discriminative concepts that can be effectively utilized for interpretable and generalizable visual recognition.