Unveiling the Secrets of CLIP's Data Curation: A Transparent Approach to High-Quality Language-Image Pretraining
CLIP's success rests largely on its high-quality training data, which was curated through a proprietary process that has not been made public. This work presents MetaCLIP, a transparent approach to data curation; models trained on MetaCLIP data outperform those trained on CLIP's data on multiple benchmarks.