The author introduces VideoMAC, a novel approach that combines video masked autoencoders with ConvNets to outperform ViT-based methods in downstream tasks.