Masked Token Modeling Improves Storage-efficient Training of Vision Transformers
Masked Token Modeling (MTM), a self-supervised pre-training method, improves the storage efficiency of token-based vision model training. TokenAdapt and ColorAdapt make token-based data augmentation more effective.
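
To make the masking idea concrete, the following is a minimal, generic sketch of masked token modeling on a sequence of discrete token ids. It is not the authors' implementation: the names `MASK_ID`, `mask_tokens`, and `masked_accuracy`, the 50% mask ratio, and the toy token ids are all assumptions chosen for illustration. The sketch hides a random subset of tokens and scores a prediction only on the hidden positions, which is the usual shape of a masked-modeling objective.

```python
import random

MASK_ID = -1  # hypothetical sentinel id marking a masked position


def mask_tokens(tokens, mask_ratio=0.5, seed=None):
    """Replace a random subset of token ids with MASK_ID.

    Returns the masked copy and the sorted list of masked positions.
    """
    rng = random.Random(seed)
    n_mask = int(len(tokens) * mask_ratio)
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = list(tokens)
    for p in positions:
        masked[p] = MASK_ID
    return masked, positions


def masked_accuracy(predictions, targets, positions):
    """Fraction of masked positions predicted correctly.

    Unmasked positions are ignored, mirroring a loss that is
    computed only where tokens were hidden.
    """
    if not positions:
        return 0.0
    correct = sum(predictions[p] == targets[p] for p in positions)
    return correct / len(positions)


# Toy example: eight made-up token ids standing in for one tokenized image.
tokens = [12, 7, 301, 45, 9, 88, 640, 3]
masked, positions = mask_tokens(tokens, mask_ratio=0.5, seed=0)

# A perfect "model" that recovers every hidden token scores 1.0.
perfect_score = masked_accuracy(tokens, tokens, positions)
```

In an actual training setup the predictions would come from a vision transformer trained to reconstruct the hidden tokens; here the perfect prediction only demonstrates how the score is restricted to masked positions.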