Exploring the Potential of GPT-4V, a VQA-Oriented Large Multimodal Model, for Zero-Shot Anomaly Detection
This paper explores the potential of GPT-4V, a VQA-oriented large multimodal model, for zero-shot anomaly detection. It proposes a framework with three components, Granular Region Division, Prompt Designing, and Text2Segmentation, that leverages GPT-4V's visual grounding capabilities.
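
To make the three stages concrete, the following is a minimal sketch of how they could fit together. It is not the paper's implementation. It assumes a fixed grid as a stand-in for Granular Region Division, a hypothetical prompt wording, and a hypothetical textual answer of the form "regions: 2, 3" in place of a real GPT-4V response. All function names (`divide_into_regions`, `build_prompt`, `text_to_mask`) are illustrative.

```python
import re
import numpy as np


def divide_into_regions(height, width, rows, cols):
    """Assumed stand-in for Granular Region Division: a rows x cols grid.

    Returns a dict mapping a 1-based region id to (y0, y1, x0, x1).
    """
    ys = np.linspace(0, height, rows + 1, dtype=int)
    xs = np.linspace(0, width, cols + 1, dtype=int)
    regions = {}
    region_id = 1
    for r in range(rows):
        for c in range(cols):
            regions[region_id] = (ys[r], ys[r + 1], xs[c], xs[c + 1])
            region_id += 1
    return regions


def build_prompt(object_name, region_ids):
    """Hypothetical prompt: ask the model which numbered regions look anomalous."""
    ids = ", ".join(str(i) for i in sorted(region_ids))
    return (
        f"The image shows a {object_name} divided into regions numbered {ids}. "
        "List the numbers of the regions that contain anomalies in the form "
        "'regions: a, b', or answer 'regions: none'."
    )


def text_to_mask(answer, regions, height, width):
    """Sketch of Text2Segmentation: turn region ids in the answer into a mask.

    The parsing is deliberately naive: every integer in the answer that
    matches a known region id marks that region as anomalous.
    """
    mask = np.zeros((height, width), dtype=bool)
    for token in re.findall(r"\d+", answer):
        region_id = int(token)
        if region_id in regions:
            y0, y1, x0, x1 = regions[region_id]
            mask[y0:y1, x0:x1] = True
    return mask


# Usage with a hypothetical model answer (no model is actually called here).
regions = divide_into_regions(height=4, width=4, rows=2, cols=2)
prompt = build_prompt("metal nut", regions.keys())
answer = "regions: 2, 3"
mask = text_to_mask(answer, regions, height=4, width=4)
```

In this sketch the model only ever names region ids, so the spatial precision of the final mask is bounded by how finely the image is divided, which is one reason the granularity of the region division matters.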