Core Concepts
Generating human-readable sense definitions for word usage graphs enhances semantic analysis and aids in explainable semantic change modeling.
Abstract
The paper discusses generating human-readable sense definitions for word usage graphs (WUGs) to improve semantic analysis. It covers the importance of sense definitions in NLP tasks, existing resources for semantic change modeling, the definition-generation methodology, data description, evaluation results, error analysis, and future work.
Introduction and Related Work
- Most words are polysemous in natural languages.
- NLP tasks like word sense induction and disambiguation rely on lexical resources that describe word senses.
- Semantic change modeling heavily uses Word Usage Graphs (WUGs) annotated with semantic proximity judgments.
- Existing WUGs lack human-readable sense labels for clusters.
Data Description
- Word usage graphs for English, German, Norwegian, and Russian are enriched with human-readable sense definitions.
- Definition generators are fine-tuned on definition datasets compiled from different resources for each language.
Definition Generators
- Three methods are presented: Lesk, GlossReader, and DefGen.
- DefGen outperforms the two baselines in generating accurate, human-readable definitions.
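Lesk is the classic gloss-overlap baseline: pick the sense whose dictionary gloss shares the most words with the usage context. A minimal sketch (the sense inventory and glosses below are invented for illustration, not from the paper's data):

```python
def lesk(context_tokens, sense_glosses):
    """Return the sense whose gloss has the largest token overlap with the context."""
    def overlap(sense):
        return len(set(context_tokens) & set(sense_glosses[sense].lower().split()))
    return max(sense_glosses, key=overlap)

# Toy example: two senses of "bank" with hypothetical glosses.
senses = {
    "bank_1": "a financial institution that accepts deposits",
    "bank_2": "the sloping land beside a body of water",
}
context = "she sat on the grassy land beside the river".split()
print(lesk(context, senses))  # prints bank_2
```

In the paper's setting, such a baseline selects an existing gloss (e.g., from WordNet) for each usage cluster rather than generating a new definition, which is why generative methods like DefGen can fit the clusters more closely.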
Evaluation and Results
- Human evaluation shows DefGen's superiority in generating accurate cluster labels.
- Error analysis reveals common mistakes in generated definitions.
Conclusion
- Generating human-readable sense definitions enhances semantic analysis and semantic change modeling.
- The DefGen method shows promise for generating definitions across languages.
- Future work includes multilingual fine-tuning and expanding to more languages.
Stats
Human evaluation showed that these definitions match the existing clusters in WUGs better than the definitions chosen from WordNet by the two baseline systems.
Table 1 provides the main statistics of the word usage graphs we employ.
Table 2 shows the statistics of the definition datasets for fine-tuning.
Table 3 shows the performance of mT0-based definition generators on the validation sets.
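An mT0-based definition generator is a sequence-to-sequence model prompted with a target word and a usage example, and trained to output a gloss. The exact prompt template used in the paper is not given in this summary; the sketch below shows a hypothetical template of that general shape:

```python
def build_prompt(target_word, usage):
    """Build a hypothetical definition-generation prompt for a seq2seq model.

    The template below is an illustrative assumption, not the paper's
    actual prompt. The resulting string would be tokenized and passed to a
    fine-tuned mT0 model, whose generated text is taken as the definition.
    """
    return f'What is the definition of "{target_word}" in the following sentence? {usage}'

prompt = build_prompt("bank", "She sat on the bank of the river.")
print(prompt)
```

The generated gloss for each usage can then be aggregated over a WUG cluster to serve as that cluster's human-readable sense label.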
Table 4 displays the results of human evaluation in the 'guess the cluster by definition' task.
Tables 5, 6, and 7 categorize erroneous definitions by error type.
Table 8 provides examples of good definition instances.
Table 9 showcases borderline definition examples.
Table 10 presents examples of bad definition instances.
Quotes
"The resulting enriched datasets can be extremely helpful for moving on to explainable semantic change modeling."
"Our contribution is twofold: making existing word usage graphs more useful and evaluating definition generation methods on multiple languages."
"Generated definitions can be used as convenient and interpretable contextualized representations for various NLP tasks."