Dual-Way Matching Enhanced Framework for Improving Multimodal Entity Linking Performance
This paper proposes a novel Dual-Way Matching Enhanced (DWE+) framework that uses multimodal information, namely text and images, to improve performance on multimodal entity linking. The framework has three key components: 1) extracting fine-grained visual features and visual attributes to make better use of image information, 2) enriching the semantics of entity representations with static and dynamic methods, and 3) applying hierarchical contrastive learning to align both the overall and the target-relevant multimodal features.
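To make the third component more concrete, the following is a minimal sketch of a two-level contrastive objective, not the paper's actual implementation. It assumes an InfoNCE-style loss with in-batch negatives, a common choice for contrastive alignment, and applies it once to overall (global) mention and entity embeddings and once to target-relevant (local) ones. The function names, the `temperature` value, and the `weight` that balances the two levels are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(mentions, entities, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    mentions[i] is assumed to match entities[i]; every other entity in
    the batch serves as a negative for mention i.
    """
    losses = []
    for i, mention in enumerate(mentions):
        logits = [cosine(mention, e) / temperature for e in entities]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Negative log-probability of the matching entity.
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


def hierarchical_loss(global_m, global_e, local_m, local_e, weight=0.5):
    """Global-level loss plus a weighted target-relevant (local) loss.

    The additive combination and the weight are assumptions made for
    illustration only.
    """
    return info_nce(global_m, global_e) + weight * info_nce(local_m, local_e)
```

Under these assumptions, a batch whose mention embeddings point toward their own entities yields a loss near zero, while a batch whose mentions point toward the wrong entities yields a large loss, which is the signal that pulls matching pairs together during training.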