The paper presents KGFiller, a framework for semi-automatic ontology population that leverages large language models (LLMs) as oracles. The key idea is to exploit the substantial amount of domain-specific knowledge encapsulated in LLMs, which are trained on large web corpora, to automatically generate instances for ontology classes and properties.
The framework consists of four main phases:
Population phase: Generates novel individuals for each class in the ontology by querying the LLM with class-specific templates.
Relation phase: Populates relationships between individuals by querying the LLM with property-specific templates.
Redistribution phase: Redistributes the generated individuals to the most specific classes available in the ontology hierarchy.
Merge phase: Identifies and merges semantically similar individuals that were generated during the previous phases.
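The four phases above can be sketched as a small pipeline. This is a minimal illustration, not the authors' implementation. Everything named here is a hypothetical stand-in: the toy `HIERARCHY`, the prompt templates, the `hasColour` property, and the canned `stub_oracle` that replaces a real LLM. The merge step uses simple case-insensitive name matching as a stand-in for semantic similarity.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

Oracle = Callable[[str], List[str]]   # prompt -> list of short answers
Members = Dict[str, Set[str]]         # class name -> individuals
Triple = Tuple[str, str, str]         # (subject, property, object)

# Hypothetical toy ontology: class name -> parent class (None for the root).
HIERARCHY: Dict[str, Optional[str]] = {"Food": None, "Fruit": "Food", "Vegetable": "Food"}

# Canned answers standing in for an LLM (assumption, for illustration only).
CANNED: Dict[str, List[str]] = {
    "List examples of Food.": ["Apple", "apple", "Carrot"],
    "What is the hasColour of Apple?": ["red"],
    "Which of Fruit, Vegetable best describes Apple?": ["Fruit"],
    "Which of Fruit, Vegetable best describes apple?": ["Fruit"],
    "Which of Fruit, Vegetable best describes Carrot?": ["Vegetable"],
}

def stub_oracle(prompt: str) -> List[str]:
    """Return the canned answer for a prompt, or no answer at all."""
    return CANNED.get(prompt, [])

def populate(hierarchy: Dict[str, Optional[str]], oracle: Oracle) -> Members:
    """Population phase: query the oracle with a class-specific template."""
    return {cls: set(oracle(f"List examples of {cls}.")) for cls in hierarchy}

def relate(individuals: Set[str], prop: str, oracle: Oracle) -> Set[Triple]:
    """Relation phase: query the oracle with a property-specific template."""
    return {(s, prop, o)
            for s in sorted(individuals)
            for o in oracle(f"What is the {prop} of {s}?")}

def redistribute(members: Members, hierarchy: Dict[str, Optional[str]],
                 oracle: Oracle) -> Members:
    """Redistribution phase: push each individual down to the most
    specific subclass the oracle selects."""
    result: Members = {cls: set() for cls in hierarchy}
    for cls, inds in members.items():
        for ind in inds:
            target = cls
            while True:
                subs = [c for c, parent in hierarchy.items() if parent == target]
                if not subs:
                    break
                answer = oracle(f"Which of {', '.join(subs)} best describes {ind}?")
                if not answer or answer[0] not in subs:
                    break  # oracle gave no usable subclass: stay where we are
                target = answer[0]
            result[target].add(ind)
    return result

def merge(members: Members) -> Members:
    """Merge phase: collapse individuals whose names differ only by case
    or surrounding whitespace (a stand-in for semantic similarity)."""
    merged: Members = {}
    for cls, inds in members.items():
        canonical: Dict[str, str] = {}
        for ind in sorted(inds):
            canonical.setdefault(ind.strip().lower(), ind)
        merged[cls] = set(canonical.values())
    return merged

def run_pipeline(hierarchy: Dict[str, Optional[str]], prop: str,
                 oracle: Oracle) -> Tuple[Members, Set[Triple]]:
    """Run the four phases in the order listed above."""
    members = populate(hierarchy, oracle)
    everyone = set().union(*members.values())
    triples = relate(everyone, prop, oracle)
    members = merge(redistribute(members, hierarchy, oracle))
    return members, triples
```

Calling `run_pipeline(HIERARCHY, "hasColour", stub_oracle)` returns the class memberships after merging, together with the relation triples. In this sketch the triples are not rewritten when duplicates are merged. A fuller version would need to remap them to the surviving individuals.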
The authors formalize the KGFiller framework, provide a Python implementation, and validate it through a case study in the nutritional domain. They compare the quality of ontologies populated using different LLMs and provide a SWOT analysis of the proposed approach.
The key benefits of KGFiller include its domain-independence, ability to handle incomplete or biased data, and potential to significantly reduce the manual effort required for ontology population, while still allowing human experts to refine the results.
Source: by Giovanni Cia... at arxiv.org, 04-08-2024
https://arxiv.org/pdf/2404.04108.pdf