Core Concepts
Goal: to develop a model that generates informative and contextually relevant sentence-contexts for given keywords, supporting a range of natural language understanding and generation applications.
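Since T5 is a text-to-text model, the keyword-in-context task reduces to mapping a keyword prompt to a target context sentence. The sketch below shows one plausible way to format such training pairs; the `generate context:` task prefix and the example pairs are illustrative assumptions, not the exact format used in this work.

```python
# Sketch: formatting (keyword, sentence) pairs for T5-style
# text-to-text fine-tuning. The task prefix is a hypothetical choice.

TASK_PREFIX = "generate context: "  # assumed prefix, not from the paper

def format_example(keyword: str, sentence: str) -> dict:
    """Build one source/target pair for seq2seq fine-tuning."""
    return {
        "source": TASK_PREFIX + keyword,
        "target": sentence,
    }

# Invented pairs of the kind that could be collected via Context-Reverso.
pairs = [
    ("resilient", "The resilient bridge withstood the earthquake."),
    ("ambiguous", "Her ambiguous reply left everyone guessing."),
]

dataset = [format_example(k, s) for k, s in pairs]
for ex in dataset:
    print(ex["source"], "->", ex["target"])
```

Each source string tells the model which keyword must appear meaningfully in the generated sentence, while the target supplies a reference context for supervised fine-tuning.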
Abstract
In the era of information abundance, providing users with contextually relevant and concise information is crucial. The Keyword in Context (KIC) generation task plays a vital role in applications like search engines, personal assistants, and content summarization. This paper presents a novel approach using the T5 transformer model to generate unambiguous and brief sentence-contexts for specific keywords by leveraging data from the Context-Reverso API. The study involves creating datasets, training models, and developing an application for learning new English words with generated contexts. By utilizing external resources like APIs, the work aims to address challenges in generating short contexts while mitigating ambiguity in sentence construction. The experiments involve fine-tuning pre-trained models like T5-small and T5-base on custom datasets to generate context sentences that incorporate given keywords meaningfully and unambiguously. Evaluation metrics such as BLEU and METEOR are used to assess the quality of generated text compared to reference text.
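As a sketch of the evaluation step described above, sentence-level BLEU can be computed with NLTK by comparing a generated context against a reference sentence; METEOR is computed analogously via `nltk.translate.meteor_score`. The sentences below are invented for illustration, not drawn from the actual dataset.

```python
# Sketch: scoring a generated context sentence against a reference
# with sentence-level BLEU (NLTK). Example sentences are invented.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "The resilient engineer solved the difficult problem quickly".split()
exact = "The resilient engineer solved the difficult problem quickly".split()
partial = "The resilient engineer fixed the problem".split()

# Smoothing avoids zero scores when short outputs lack higher-order n-grams.
smooth = SmoothingFunction().method1

perfect_score = sentence_bleu([reference], exact, smoothing_function=smooth)
partial_score = sentence_bleu([reference], partial, smoothing_function=smooth)

print(f"exact match BLEU:   {perfect_score:.3f}")
print(f"partial match BLEU: {partial_score:.3f}")
```

An exact match scores 1.0, while the partial hypothesis is penalized both for missing n-gram overlap and for brevity, which is why BLEU is usually paired with a recall-oriented metric such as METEOR.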
Stats
Our dataset consists of diverse sentences, collected via the Context-Reverso API, each incorporating a target keyword.
T5-small has 60 million parameters.
T5-base has 220 million parameters.
GPT-2 has 117 million parameters.