OVFoodSeg: Enhancing Open-Vocabulary Food Image Segmentation through Image-Informed Text Embeddings
OVFoodSeg is a novel framework for open-vocabulary food image segmentation. It combines vision-language models with image-to-text learning and image-informed text encoding to address two challenges: large intra-class variance and limited ingredient coverage.