Large language models exhibit significant limitations in handling simple linguistic inferences that are trivial for humans, including grammatically specified entailments, monotonicity entailments, and inferences involving evidential adverbs of uncertainty.
The study explores the impact of linguistic typology on the performance of cross-lingual transfer learning for event extraction tasks, using Basque as the target language.
Sarcasm detection models fine-tuned on specific datasets struggle to generalize to other datasets, highlighting the need for more diverse and representative sarcasm data to build robust sarcasm detection systems.
Foundation models can generate dictionary example sentences that outperform existing expert-curated examples, using a novel method to identify sentences that best exemplify the meaning of words.
This paper introduces the Sinhala Offensive Language Dataset (SOLD), the largest annotated dataset for detecting offensive content in the Sinhala language. The dataset contains 10,000 tweets annotated at both the sentence and token level, enabling the development of explainable models for offensive language identification.
Annotation guidelines for the MaiBaam corpus.
Pre-trained language models generalize effectively to code-switched text, offering insights into their multilingual capabilities.
A summary of the VLSP 2023 - LTER challenge on legal textual entailment recognition.
BLADE introduces a novel framework to enhance large language models with domain-specific models, significantly improving performance in vertical domains.
Developing effective systems for detecting offensive language in Chinese poses unique challenges due to cultural nuances and linguistic complexities.