Masked Image Modeling as a Framework for Self-Supervised Learning of Visual Representations through Simulated Eye Movements
Masked image modeling (MIM) can serve as a self-supervised framework for learning visual representations in a way that aligns with the focused nature of biological perception, which samples a scene through eye movements and attention shifts.
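
As a rough illustration of the idea, the sketch below hides every image patch outside a small window around a single simulated fixation point and scores a reconstruction only on the hidden patches. It is a minimal sketch under stated assumptions, not the method of this work: the function names (`fixation_mask`, `mask_patches`, `masked_mse`), the square visible window standing in for a fovea, the zero-filling of hidden patches, and the pixel-wise mean squared error are all choices made here for illustration.

```python
import numpy as np


def fixation_mask(grid_h, grid_w, fixation, radius):
    """Return a boolean (grid_h, grid_w) array; True marks a hidden patch.

    Assumption: patches within `radius` grid cells of the fixation
    (a square window) stay visible, everything else is masked.
    """
    rows, cols = np.ogrid[:grid_h, :grid_w]
    fy, fx = fixation
    visible = (np.abs(rows - fy) <= radius) & (np.abs(cols - fx) <= radius)
    return ~visible


def mask_patches(image, patch, mask):
    """Return a copy of an (H, W, C) image with hidden patches set to zero.

    Assumption: zero-filling is used only for simplicity.
    """
    out = image.copy()
    for r, c in zip(*np.nonzero(mask)):
        out[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch] = 0
    return out


def masked_mse(pred, target, patch, mask):
    """Mean squared error computed over the hidden patches only."""
    # Expand the patch-level mask to pixel resolution.
    pixel_mask = np.repeat(np.repeat(mask, patch, axis=0), patch, axis=1)
    diff = (pred - target)[pixel_mask]
    return float(np.mean(diff ** 2))


# Toy example: an 8x8 RGB image split into a 4x4 grid of 2x2 patches.
image = np.ones((8, 8, 3), dtype=np.float32)
mask = fixation_mask(4, 4, fixation=(1, 1), radius=1)
masked_image = mask_patches(image, patch=2, mask=mask)

# A real model would predict the hidden patches from `masked_image`;
# here the zero-filled input itself stands in as a trivial "prediction".
loss = masked_mse(masked_image, image, patch=2, mask=mask)
```

A sequence of simulated eye movements could be modeled by calling `fixation_mask` with a different `fixation` at each step, so that each view reveals a different part of the scene.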