The author introduces Slot Attention with Image Augmentation (SlotAug) to enable interpretable controllability over object representations. The approach makes the controllable slots sustainable, meaning they stay reliable under iterative and reversible manipulation, through two submethods: Auxiliary Identity Manipulation and Slot Consistency Loss.
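One way to read the Slot Consistency Loss mentioned above is as a round-trip penalty: manipulate the slots, apply the inverse manipulation, and penalize any difference from the original slots. The sketch below illustrates that idea only. The function names, the additive toy manipulation, and the mean-squared-error form are assumptions made for illustration, not the SlotAug implementation.

```python
# Hypothetical sketch: all names and the additive "manipulation" are
# illustrative assumptions, not taken from the SlotAug code.

def manipulate(slots, shift):
    """Toy stand-in for a learned slot manipulation: add a shift to every slot."""
    return [[value + s for value, s in zip(slot, shift)] for slot in slots]


def slot_consistency_loss(original, restored):
    """Mean squared error between the original slots and the round-trip slots."""
    total = 0.0
    count = 0
    for slot_a, slot_b in zip(original, restored):
        for a, b in zip(slot_a, slot_b):
            total += (a - b) ** 2
            count += 1
    return total / count


# Two slots with two feature dimensions each (toy values).
slots = [[0.5, -1.0], [2.0, 0.25]]
shift = [1.0, -0.5]
inverse_shift = [-s for s in shift]

# Exact inverse: the round trip restores the slots, so the loss is zero.
round_trip = manipulate(manipulate(slots, shift), inverse_shift)
loss_exact = slot_consistency_loss(slots, round_trip)

# Imperfect inverse: the slots drift, so the loss becomes positive.
drifted = manipulate(manipulate(slots, shift), [-0.9, 0.5])
loss_drift = slot_consistency_loss(slots, drifted)
```

In this toy setting the loss is zero only when the inverse manipulation exactly undoes the forward one, which is the property that would let slots be edited repeatedly without accumulating error.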
The author introduces the Neural Slot Interpreter (NSI) to bridge the gap between unsupervised slot representations and supervised object semantics, enhancing alignment and generative capabilities.
Jointly leveraging high-level semantics and low-level temporal correspondence enhances object-centric perception in videos.
Adding reasoning modules to object-centric learning improves both the perception and the prediction abilities of machine learning systems.