Generalizable Open-Vocabulary Neural Semantic Fields for 2D and 3D Semantic Segmentation
GOV-NeSF is a novel approach that offers a generalizable implicit representation of 3D scenes with open-vocabulary semantics. It achieves state-of-the-art performance in both 2D and 3D open-vocabulary semantic segmentation without requiring ground-truth semantic labels or depth priors, and it generalizes across scenes and datasets without fine-tuning.