Unsupervised Learning of Versatile Visual Representations from Video through Feature Prediction
Feature prediction can serve as an effective stand-alone objective for unsupervised learning of versatile visual representations from video, outperforming pixel-level reconstruction approaches.