Core Concepts
Neural Slot Interpreter (NSI) bridges unsupervised slot representations and supervised object semantics, improving both scene understanding and generation.
Abstract
Object-centric methods have made progress in decomposing scenes into slots for a variety of tasks.
NSI introduces a program abstraction to associate neural embeddings with object slots.
The alignment model learns dense associations between object labels and slots.
The program generator decodes primitives from slots for downstream tasks like object detection.
Experiments show NSI's state-of-the-art alignment and generative capabilities across datasets.
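As a minimal sketch of what a dense association between object labels and slots could look like, the NumPy snippet below scores every label embedding against every slot embedding and assigns each label to its best-matching slot. The helper names, the cosine similarity, and the sum-of-maxima scene score are illustrative assumptions, not NSI's actual alignment model.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    """Scale each row to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def dense_alignment(slots, labels):
    """Score every label against every slot (hypothetical helper).

    slots:  (K, d) array of slot embeddings
    labels: (N, d) array of object-label embeddings
    Returns the (N, K) cosine-similarity matrix, the best slot index
    per label, and a scalar scene-level score (sum of per-label maxima).
    """
    sim = l2_normalize(labels) @ l2_normalize(slots).T
    best_slot = sim.argmax(axis=1)
    score = float(sim.max(axis=1).sum())
    return sim, best_slot, score

# Toy data (assumed for illustration): 3 slots, 3 labels, 3-dim embeddings.
slots = np.eye(3)
labels = np.array([[1.0, 0.0, 0.0],
                   [0.9, 0.1, 0.0],
                   [0.0, 0.0, 1.0]])
sim, best_slot, score = dense_alignment(slots, labels)
print(best_slot)  # labels 0 and 1 share slot 0; label 2 maps to slot 2
```

In this toy example two labels land on the same slot, which is the kind of many-to-one association a dense alignment permits.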
Stats
NSI significantly outperforms prior work on bi-modal retrieval tasks.
NSI demonstrates improved performance on property prediction and object detection.
NSI aligns more than one object concept to a single slot in real-world scenarios.
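For readers unfamiliar with how bi-modal retrieval is commonly scored, the sketch below computes Recall@k from a similarity matrix between two modalities (for example, scenes and their label programs). It assumes the matching pair for row i sits in column i; the function name and the toy scores are hypothetical, and the summary does not state which retrieval metric NSI reports.

```python
import numpy as np

def recall_at_k(sim, k):
    """Fraction of queries whose true match is among the top-k results.

    sim: (n, n) array, sim[i, j] = similarity of query i to candidate j.
    Assumes the correct candidate for query i is candidate i.
    """
    n = sim.shape[0]
    # Sort candidates by descending similarity and keep the first k.
    topk = np.argsort(-sim, axis=1)[:, :k]
    hits = (topk == np.arange(n)[:, None]).any(axis=1)
    return float(hits.mean())

# Toy similarity scores (made up for illustration).
scores = np.array([[0.9, 0.1, 0.0],
                   [0.2, 0.3, 0.8],
                   [0.1, 0.2, 0.7]])
print(recall_at_k(scores, 1))  # query 1 misses at k=1
print(recall_at_k(scores, 2))  # all queries hit at k=2
```

Running the same function on the transposed matrix gives the metric for the opposite retrieval direction.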