Training-Free Open-Vocabulary Semantic Segmentation with Diffusion Models
FreeSeg-Diff, a zero-shot approach for image segmentation, leverages internal representations of text-to-image diffusion models to find class-agnostic masks that are then mapped to an open-ended list of object classes without any training or annotated masks.