Analyzing Few-Shot Adaptation of Large Vision-Language Models
State-of-the-art efficient transfer learning (ETL) approaches perform strongly only in narrowly defined experimental setups and require careful hyperparameter tuning, whereas CLAP offers a more efficient and realistic alternative.