Exploring the Contributions of Language and Vision to Learning from Limited Data
Language models can maintain a majority of the performance of vision-language models in learning new visual tasks from limited data, suggesting language plays a key role in visual understanding.