This study examines the impact of neologisms on Large Language Models (LLMs): temporal drift between a model's training data and current language use leads to performance degradation on text containing newly coined words. The authors introduce NEO-BENCH, a benchmark for evaluating how well LLMs handle neologisms across a range of tasks. Results show that machine translation of sentences containing neologisms is especially challenging, and that performance varies substantially with a word's linguistic origin. Older LLMs perform worse than newer ones, underscoring the need to adapt models to evolving language. The study also analyzes perplexity rankings and downstream task performance across three types of neologisms: lexical, morphological, and semantic.
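The perplexity analysis rests on the standard definition: perplexity is the exponentiated average negative log-likelihood a model assigns to a token sequence, so words absent from (or rare in) the training data inflate it. The sketch below is a minimal illustration of that mechanism with a toy unigram model, not the paper's actual LLM-based setup; the word probabilities and the neologism "mid" are invented for demonstration.

```python
import math

def perplexity(tokens, probs, unk_prob=1e-6):
    """Perplexity = exp(mean negative log-likelihood of the tokens)."""
    nll = [-math.log(probs.get(t, unk_prob)) for t in tokens]
    return math.exp(sum(nll) / len(nll))

# Toy unigram probabilities, as if estimated from a pre-neologism corpus.
probs = {"the": 0.05, "movie": 0.02, "was": 0.02, "good": 0.002}

familiar = perplexity(["the", "movie", "was", "good"], probs)
# "mid" (semantic neologism meaning "mediocre") is unseen, so it falls
# back to unk_prob and drives perplexity up.
neologism = perplexity(["the", "movie", "was", "mid"], probs)
print(familiar, neologism)
```

In a real setting the same comparison is made with an LLM's token-level log-probabilities, but the effect is the same: sentences with neologisms rank as higher-perplexity than otherwise identical sentences with established words.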