A novel cost-based approach to effectively adapt the CLIP vision-language model for open-vocabulary semantic segmentation, achieving state-of-the-art performance.