The paper presents KGFiller, a framework for semi-automatic ontology population that leverages large language models (LLMs) as oracles. The key idea is to exploit the substantial amount of domain-specific knowledge encapsulated in LLMs, which are trained on large web corpora, to automatically generate instances for ontology classes and properties.
The framework consists of four main phases:
Population phase: Generates novel individuals for each class in the ontology by querying the LLM with class-specific templates.
Relation phase: Populates relationships between individuals by querying the LLM with property-specific templates.
Redistribution phase: Redistributes the generated individuals to the most specific classes available in the ontology hierarchy.
Merge phase: Identifies and merges semantically similar individuals that were generated during the previous phases.
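The four phases above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy ontology, the prompt templates, the `stub_oracle` function (a canned stand-in for a real LLM call), and all function names are hypothetical. The merge step uses case-insensitive name matching as a simple placeholder for semantic similarity.

```python
from typing import Callable, Dict, List, Optional

# An oracle maps a prompt to a list of answers (assumed interface, not KGFiller's API).
Oracle = Callable[[str], List[str]]

# Hypothetical toy ontology: class -> parent class (None marks the root).
PARENTS: Dict[str, Optional[str]] = {"Food": None, "Fruit": "Food", "Vegetable": "Food"}


def stub_oracle(prompt: str) -> List[str]:
    """Canned stand-in for an LLM; unknown prompts yield no answers."""
    canned = {
        "Examples of Food?": ["apple", "Apple", "carrot"],
        "Examples of Fruit?": ["banana"],
        "What is apple rich in?": ["fiber"],
        "Is apple a Fruit or Vegetable?": ["Fruit"],
        "Is Apple a Fruit or Vegetable?": ["Fruit"],
        "Is carrot a Fruit or Vegetable?": ["Vegetable"],
    }
    return canned.get(prompt, [])


def populate(oracle: Oracle) -> Dict[str, List[str]]:
    """Population phase: ask for individuals of each class via a class template."""
    return {cls: list(oracle(f"Examples of {cls}?")) for cls in PARENTS}


def relate(members: Dict[str, List[str]], oracle: Oracle) -> Dict[str, List[str]]:
    """Relation phase: ask a property template for each individual."""
    facts: Dict[str, List[str]] = {}
    for names in members.values():
        for name in names:
            answer = oracle(f"What is {name} rich in?")
            if answer:
                facts[name] = answer
    return facts


def redistribute(members: Dict[str, List[str]], oracle: Oracle) -> Dict[str, List[str]]:
    """Redistribution phase: move individuals to a direct subclass when the oracle names one."""
    result: Dict[str, List[str]] = {cls: [] for cls in PARENTS}
    for cls, names in members.items():
        children = [c for c, parent in PARENTS.items() if parent == cls]
        for name in names:
            target = cls
            if children:
                answer = oracle(f"Is {name} a {' or '.join(children)}?")
                if answer and answer[0] in children:
                    target = answer[0]
            result[target].append(name)
    return result


def merge(members: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Merge phase: collapse duplicates (here: names equal up to letter case)."""
    merged: Dict[str, List[str]] = {}
    for cls, names in members.items():
        seen: Dict[str, str] = {}
        for name in names:
            seen.setdefault(name.lower(), name)  # keep the first spelling seen
        merged[cls] = list(seen.values())
    return merged


def run_pipeline(oracle: Oracle = stub_oracle):
    members = populate(oracle)
    facts = relate(members, oracle)
    members = merge(redistribute(members, oracle))
    return members, facts
```

Calling `run_pipeline()` with the stub returns the class membership after redistribution and merging, together with the collected property values. Swapping `stub_oracle` for a function that calls a real model would leave the rest of the sketch unchanged.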
The authors formalize the KGFiller framework, provide a Python implementation, and validate it through a case study in the nutritional domain. They compare the quality of ontologies populated using different LLMs and provide a SWOT analysis of the proposed approach.
The key benefits of KGFiller include its domain-independence, ability to handle incomplete or biased data, and potential to significantly reduce the manual effort required for ontology population, while still allowing human experts to refine the results.
arxiv.org