Data-efficient Event Camera Pre-training via Disentangled Masked Modeling
This work presents a data-efficient, voxel-based self-supervised learning method for event cameras. It addresses limitations of previous approaches by introducing semantic-uniform masking and by decomposing the hybrid masked modeling process. As a result, the method converges faster and requires only minimal pre-training data.
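
To make the voxel-based masked-modeling setting concrete, the following is a minimal sketch, not the method described above. It assumes events arrive as (x, y, t, polarity) rows, accumulates them into a voxel grid, and hides a fraction of the occupied voxels with plain random masking. The semantic-uniform masking and the decomposed objectives are not specified in this summary, so they are not reproduced here. The function names `voxelize_events` and `random_mask_voxels` and all parameters are hypothetical.

```python
import numpy as np


def voxelize_events(events, grid_shape, sensor_size):
    """Accumulate event polarities into a (T, H, W) voxel grid.

    events: array of shape (N, 4), columns (x, y, t, polarity in {-1, +1}).
    grid_shape: (t_bins, height, width) of the output grid.
    sensor_size: (sensor_height, sensor_width) in pixels.
    """
    t_bins, height, width = grid_shape
    sensor_h, sensor_w = sensor_size
    x, y, t, p = events.T
    # Normalise timestamps so they index the temporal bins.
    span = max(t.max() - t.min(), 1e-9)
    t_idx = np.minimum(((t - t.min()) / span * t_bins).astype(int), t_bins - 1)
    y_idx = np.minimum((y / sensor_h * height).astype(int), height - 1)
    x_idx = np.minimum((x / sensor_w * width).astype(int), width - 1)
    grid = np.zeros(grid_shape, dtype=np.float32)
    # np.add.at handles repeated indices, so events in one voxel are summed.
    np.add.at(grid, (t_idx, y_idx, x_idx), p)
    return grid


def random_mask_voxels(grid, mask_ratio, rng):
    """Hide a random fraction of the non-empty voxels.

    Assumption: plain random masking stands in for the masking strategy
    of the summarised method, which is not detailed here.
    """
    occupied = np.flatnonzero(grid)
    n_masked = int(round(mask_ratio * occupied.size))
    masked_idx = rng.choice(occupied, size=n_masked, replace=False)
    mask = np.zeros(grid.size, dtype=bool)
    mask[masked_idx] = True
    mask = mask.reshape(grid.shape)
    # Masked voxels are zeroed; a model would be trained to reconstruct them.
    visible = np.where(mask, 0.0, grid)
    return visible, mask
```

In a pre-training loop under these assumptions, `visible` would be fed to an encoder and the reconstruction loss would be computed only at positions where `mask` is true.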