Core Concepts
Large language models (LLMs) do not yet outperform existing methods for complex word identification (CWI) and lexical complexity prediction (LCP), despite their versatility in other NLP tasks.
Stats
The average complexity score for single-word expressions in the CompLex LCP 2021 dataset is 0.3.
The average complexity score for multi-word expressions in the CompLex LCP 2021 dataset is 0.42.
Fine-tuned ChatGPT-3.5-turbo achieved an F1-score of over 80% on the CWI 2018 English dataset.
Fine-tuned Llama-3-8b (Llama-3-8b-ft) and Vicuna-v1.5-13b (Vicuna-v1.5-13b-ft) surpassed fine-tuned ChatGPT-3.5-turbo's F1-score by 1-2% on the English-News and English-Wikipedia datasets, respectively.
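For readers unfamiliar with the metric, the sketch below shows one way an F1-score can be computed for binary CWI labels (1 = complex, 0 = not complex). The function name `f1_score` and the toy labels are hypothetical and purely illustrative; they are not taken from the CWI 2018 data. The summary does not say which F1 variant the reported figures use (positive-class or macro-averaged), so the sketch assumes positive-class F1.

```python
def f1_score(gold, predicted):
    """Return the F1-score for the positive class (1 = complex word)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    pairs = list(zip(gold, predicted))
    true_pos = sum(1 for g, p in pairs if g == 1 and p == 1)
    false_pos = sum(1 for g, p in pairs if g == 0 and p == 1)
    false_neg = sum(1 for g, p in pairs if g == 1 and p == 0)

    # With no true positives, precision and recall are both zero (or undefined),
    # so F1 is conventionally reported as 0.
    if true_pos == 0:
        return 0.0

    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Toy labels, invented for illustration only (not real dataset annotations).
gold_labels = [1, 0, 1, 1, 0, 0]
model_labels = [1, 0, 0, 1, 1, 0]
score = f1_score(gold_labels, model_labels)
print(f"F1 = {score:.3f}")
```

Because F1 is the harmonic mean of precision and recall, a model scores highly only when it both finds most complex words and avoids flagging simple ones.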