Key Concepts
Entity6K introduces a diverse dataset for real-world entity recognition, addressing the lack of suitable evaluation datasets in open-domain settings.
Summary
Abstract:
The Entity6K dataset is introduced for real-world entity recognition.
Features 5,700 entities across 26 categories with human-verified images.
Introduction:
Recognizing entities from images is challenging due to visual complexity and open-domain nature.
Related Work:
Prior studies on open-domain entity recognition, zero-shot image classification, and object detection are discussed.
Entity6K Dataset:
Data acquisition process explained with details on entity list compilation and image collection.
Human Annotation:
Bounding box and textual description annotation process outlined.
Statistics of the Dataset:
Comparison with existing datasets presented, highlighting the value of Entity6K.
Experimental Settings:
Tasks chosen include object detection, zero-shot classification, image captioning, and dense captioning.
Detailed Results for Each Category:
Per-category image captioning results for OFA, BLIP, GRiT, and GIT are provided, along with object detection results for GLIP, GRiT, DINO, and ViT-Adapter.
Statistics
Entity6K features 5,700 entities across 26 categories and provides human-verified images.