Core Concepts
The Test-Time Prototype Shifting (TPS) framework enhances zero-shot generalization in VLMs by modulating class prototypes directly in the embedding space.
Abstract
Advances in vision-language models (VLMs) have improved computer vision, particularly zero-shot learning.
The Test-Time Prototype Shifting (TPS) framework adapts VLMs to test datasets using unlabeled inputs by modulating per-class prototypes.
TPS reduces memory and computational demands compared to prior methods such as Test-time Prompt Tuning (TPT).
Extensive evaluations show TPS outperforms baselines on natural distribution shifts and cross-dataset generalization benchmarks.
TPS achieves state-of-the-art results while reducing resource requirements significantly.
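The core idea above, shifting class prototypes in the embedding space and optimizing the shifts on unlabeled test inputs via entropy minimization, can be illustrated with a minimal NumPy sketch. This is an assumption-laden toy, not the paper's implementation: the function names, the temperature, and the finite-difference gradient (instead of backpropagation through augmented views of a VLM encoder) are all illustrative choices.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mean_entropy(probs):
    """Average Shannon entropy of a batch of class distributions."""
    return float(-(probs * np.log(probs + 1e-12)).sum(axis=-1).mean())

def predict(feats, prototypes, shifts, temp=1.0):
    """Classify unit-norm image features against shifted class prototypes."""
    shifted = prototypes + shifts  # modulate prototypes in embedding space
    shifted /= np.linalg.norm(shifted, axis=1, keepdims=True)
    return softmax(feats @ shifted.T / temp)  # cosine-similarity logits

def tps_step(feats, prototypes, shifts, lr=0.05, eps=1e-4):
    """One entropy-minimization step on the shifts.

    Uses a finite-difference gradient for self-containment; a real
    implementation would backpropagate through the VLM's outputs.
    """
    grad = np.zeros_like(shifts)
    base = mean_entropy(predict(feats, prototypes, shifts))
    for idx in np.ndindex(*shifts.shape):
        bumped = shifts.copy()
        bumped[idx] += eps
        grad[idx] = (mean_entropy(predict(feats, prototypes, bumped)) - base) / eps
    return shifts - lr * grad  # gradient descent on prediction entropy
```

Because only the per-class shift vectors are optimized (the encoders stay frozen and no prompt tokens are tuned), the adaptation state is a small `K x D` array, which is where the memory savings over prompt-tuning approaches like TPT come from.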