Key Concepts
Entity linking systems heavily depend on pre-built candidate sets, which limits their general applicability. This study provides a unified evaluation framework to assess the performance of state-of-the-art entity linking methods with and without access to candidate sets.
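To make the distinction concrete, the following is a minimal sketch of linking with and without a candidate set. It is not any of the evaluated systems: the `token_overlap` scorer, the `link` function, and the toy vocabulary are all hypothetical, chosen only to show that a candidate set narrows the pool of entities a similarity-based linker has to score.

```python
def token_overlap(mention, entity_name):
    """Toy similarity (assumption): Jaccard overlap of lowercase tokens."""
    a = set(mention.lower().split())
    b = set(entity_name.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link(mention, vocabulary, candidate_sets=None):
    """Return (best entity, number of entities scored).

    With candidate_sets, only the pre-built candidates for the mention are
    scored. Without it, every entity in the vocabulary is scored.
    """
    if candidate_sets is not None:
        pool = candidate_sets.get(mention, [])
    else:
        pool = vocabulary
    if not pool:
        return None, 0  # the mention has no candidates, so nothing is linked
    best = max(pool, key=lambda entity: token_overlap(mention, entity))
    return best, len(pool)


# Hypothetical in-domain vocabulary and candidate sets, for illustration only.
vocabulary = ["Paris", "Paris Hilton", "Paris Texas", "London"]
candidate_sets = {"Paris": ["Paris", "Paris Hilton"]}

print(link("Paris", vocabulary, candidate_sets))   # ('Paris', 2)
print(link("Paris", vocabulary))                   # ('Paris', 4)
print(link("London", vocabulary, candidate_sets))  # (None, 0)
print(link("London", vocabulary))                  # ('London', 4)
```

In this toy setting the candidate set halves the number of entities scored for "Paris" but leaves "London" unlinkable, which hints at why falling back to the full vocabulary is both more general and more expensive.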
Summary
This paper presents a comprehensive evaluation of modern entity linking techniques using a unified black-box testing framework. The key findings are:
Entity linking systems depend excessively on pre-built candidate sets, which significantly boost their performance. Without access to these candidate sets, most systems fail to produce useful results.
Generation-based entity linking models are more resilient to the absence of candidate sets than models that rely on mention-entity similarity. However, even the generation-based models show a substantial drop in performance without candidate sets.
The paper introduces a novel evaluation setup that replaces the candidate sets with the entire in-domain entity vocabulary. This reveals a trade-off: less restrictive candidate sets come at the cost of increased inference time and memory footprint for some models.
An error analysis is conducted to understand the impact of candidate sets on different error categories, such as over-generation, under-generation, incorrect mention, and incorrect entity prediction.
The study highlights the need for entity linking systems to be less dependent on hand-crafted candidate sets to ensure robust, versatile, and accurate performance in real-world deployments.
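The error categories named above could be counted roughly as follows. This is a sketch under assumed definitions, since the summary only names the categories: here an over-generation is a predicted mention that overlaps no gold mention, an under-generation is a gold mention that no prediction overlaps, an incorrect mention overlaps a gold mention without matching its span exactly, and an incorrect entity has the right span but the wrong entity. The function name `categorize_errors` and the span format are hypothetical.

```python
from collections import Counter


def overlaps(a, b):
    """True if two (start, end) character spans share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def categorize_errors(gold, pred):
    """Count outcomes per category.

    gold, pred: dicts mapping a (start, end) span to an entity id.
    The category definitions are assumptions made for this sketch.
    """
    counts = Counter()
    for span, entity in pred.items():
        if span in gold:
            # Exact span match: the outcome depends only on the entity.
            if gold[span] == entity:
                counts["correct"] += 1
            else:
                counts["incorrect_entity"] += 1
        elif any(overlaps(span, g) for g in gold):
            counts["incorrect_mention"] += 1
        else:
            counts["over_generation"] += 1
    for span in gold:
        if not any(overlaps(span, p) for p in pred):
            counts["under_generation"] += 1
    return counts


# Hypothetical annotations, for illustration only.
gold = {(0, 5): "Q1", (10, 15): "Q2", (20, 25): "Q3", (40, 45): "Q4"}
pred = {(0, 5): "Q1", (10, 14): "Q2", (30, 35): "Q9", (40, 45): "Q5"}
print(dict(categorize_errors(gold, pred)))
```

Running the same categorization on predictions made with and without candidate sets would show which error types grow when the candidate sets are removed.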
Statistics
Removing candidate sets can reduce precision and recall by 60% or more for some entity linking models.
The run time for some models increases by up to 90x when using the entire in-domain entity vocabulary instead of pre-built candidate sets.
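For readers who want to reproduce this kind of comparison, the sketch below computes precision and recall over exact (span, entity) matches and then the drop between two runs. The matching criterion, the function names, and the toy predictions are assumptions rather than the paper's protocol. The summary also does not say whether the 60% figure is a relative drop or a drop in absolute points, so both are shown.

```python
def precision_recall(gold, pred):
    """Precision and recall over exact (start, end, entity) matches (assumed criterion)."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    return precision, recall


def absolute_drop(with_cs, without_cs):
    """Drop in absolute points, e.g. 0.75 -> 0.25 is a drop of 0.5."""
    return with_cs - without_cs


def relative_drop(with_cs, without_cs):
    """Drop relative to the score obtained with candidate sets."""
    return (with_cs - without_cs) / with_cs if with_cs else 0.0


# Toy data, not results from the paper.
gold = [(0, 5, "Q1"), (10, 15, "Q2"), (20, 25, "Q3"), (30, 35, "Q4")]
pred_with_cs = [(0, 5, "Q1"), (10, 15, "Q2"), (20, 25, "Q3"), (40, 45, "Q9")]
pred_without_cs = [(0, 5, "Q1"), (10, 15, "Q7"), (50, 55, "Q8"), (60, 65, "Q6")]

p_with, r_with = precision_recall(gold, pred_with_cs)
p_without, r_without = precision_recall(gold, pred_without_cs)
print(f"precision: {p_with:.2f} -> {p_without:.2f}")
print(f"absolute drop: {absolute_drop(p_with, p_without):.2f}")
print(f"relative drop: {relative_drop(p_with, p_without):.2%}")
```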
Quotes
"Our findings confirm that modern entity linking systems are excessively dependent on candidate sets."
"Candidate sets significantly enhance precision and recall. Without candidate sets, there is a substantial decrease in precision and recall, exceeding 60% for some models."