Hyperbolic Entailment Filtering for Improving Image-Text Contrastive Learning and Image-Only Self-Supervised Learning
HYPE is a novel data filtering method that uses hyperbolic embeddings and entailment cones to extract, from large and noisy image-text pair datasets, samples that are meaningful within each modality and well aligned across modalities. Filtering for such samples yields training data with more specific and clearer semantics, which improves model training.
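To make the idea of an entailment cone concrete, the following is a minimal sketch of one common Lorentz-model formulation, in which a more general concept x (for example, a text embedding) defines a cone and a more specific concept y (for example, an image embedding) is expected to fall inside it. The summary above does not give HYPE's exact scoring rule, so everything here is an assumption made for illustration: the function names, the curvature parameter `curv`, the `min_radius` constant, and the use of the cone residual as a filtering signal.

```python
import math

def time_component(space, curv=1.0):
    # Time coordinate that places `space` on the hyperboloid of curvature -curv.
    return math.sqrt(1.0 / curv + sum(s * s for s in space))

def lorentz_inner(x, y, curv=1.0):
    # Lorentzian inner product: <x, y>_L = <x_space, y_space> - x_time * y_time.
    dot = sum(a * b for a, b in zip(x, y))
    return dot - time_component(x, curv) * time_component(y, curv)

def clamp(value, lo, hi):
    return max(lo, min(hi, value))

def half_aperture(x, curv=1.0, min_radius=0.1, eps=1e-8):
    # Cone half-aperture at x; it narrows as x moves away from the origin.
    # `min_radius` is an assumed constant, not a value taken from the source.
    norm = math.sqrt(sum(s * s for s in x))
    ratio = 2.0 * min_radius / (math.sqrt(curv) * norm + eps)
    return math.asin(clamp(ratio, -1.0 + eps, 1.0 - eps))

def exterior_angle(x, y, curv=1.0, eps=1e-8):
    # Angle at x between the cone axis (pointing away from the origin) and y.
    c_xy = curv * lorentz_inner(x, y, curv)
    numer = time_component(y, curv) + time_component(x, curv) * c_xy
    norm = math.sqrt(sum(s * s for s in x))
    denom = norm * math.sqrt(max(c_xy * c_xy - 1.0, eps)) + eps
    return math.acos(clamp(numer / denom, -1.0 + eps, 1.0 - eps))

def entailment_residual(x, y, curv=1.0):
    # Zero when y lies inside the cone of x; positive by the angular excess otherwise.
    return max(0.0, exterior_angle(x, y, curv) - half_aperture(x, curv))

# Hypothetical 2-D space components, chosen only to show the two cases.
text = [0.5, 0.0]
image_aligned = [2.0, 0.0]    # same direction, farther from the origin
image_off = [-2.0, 0.0]       # opposite direction

inside = entailment_residual(text, image_aligned)   # zero: inside the cone
outside = entailment_residual(text, image_off)      # positive: outside the cone
```

Under these assumptions, a filter could keep pairs whose residual is small and discard the rest. The threshold and the way this signal would be combined with other scores are left open here, since the summary does not specify them.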