The INSTRUCTIE dataset provides a comprehensive bilingual (Chinese and English) resource for training large language models to perform instruction-based information extraction tasks across diverse domains.
IEPILE is a comprehensive bilingual (English and Chinese) information extraction instruction corpus containing approximately 0.32B tokens, constructed by collecting and cleaning 33 existing datasets and introducing a schema-based instruction generation strategy to address the limitations of existing IE datasets.
Recognizing pivot elements, which act simultaneously as arguments of outer-nest events and as triggers of inner-nest events, is crucial for effectively extracting nested event structures.
IPED proposes an innovative approach to relational triple extraction that uses an implicit perspective and a denoising diffusion model, achieving state-of-the-art performance and efficiency.
Mitigating definition bias in information extraction is crucial for improving model performance and alignment with annotations.
A unifying perspective on information extraction tasks, centered around spans in text.
Improving relation extraction performance with text representation learning.
Introducing the CARLG framework for improved event argument extraction by incorporating contextual clues and role correlations.
Large language models can efficiently extract structured data from diverse tables using schema-driven information extraction.
The author explores the challenges in structured entity extraction and introduces a novel approach using Large Language Models (LLMs) to enhance effectiveness and efficiency.
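
The pivot-element idea from the nested event extraction summary above can be made concrete with a small data-structure sketch. This is only an illustration under stated assumptions, not the method of that paper: the `Span` and `Event` classes, the `find_pivots` helper, the example sentence, and the event and role labels are all hypothetical. It shows how a span that is both an argument of an outer event and the trigger of an inner event can be identified once events are represented explicitly.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Span:
    """A token span; frozen so it is hashable and can be used as a dict key."""
    start: int  # inclusive token index
    end: int    # exclusive token index
    text: str


@dataclass
class Event:
    """Hypothetical event record: one trigger span plus role -> argument span."""
    event_type: str
    trigger: Span
    arguments: Dict[str, Span] = field(default_factory=dict)


def find_pivots(events: List[Event]) -> List[Tuple[Span, str, str, str]]:
    """Return (span, outer_type, role, inner_type) for every span that is an
    argument of one event and the trigger of a different event."""
    by_trigger = {e.trigger: e for e in events}
    pivots = []
    for outer in events:
        for role, arg in outer.arguments.items():
            inner = by_trigger.get(arg)
            if inner is not None and inner is not outer:
                pivots.append((arg, outer.event_type, role, inner.event_type))
    return pivots


# Hypothetical example sentence: "TNF inhibits phosphorylation of STAT3"
tnf = Span(0, 1, "TNF")
inhibits = Span(1, 2, "inhibits")
phosphorylation = Span(2, 3, "phosphorylation")
stat3 = Span(4, 5, "STAT3")

# The inner event is triggered by "phosphorylation".
inner_event = Event("Phosphorylation", phosphorylation, {"Theme": stat3})
# The outer event takes the same span as its Theme argument.
outer_event = Event("Regulation", inhibits, {"Cause": tnf, "Theme": phosphorylation})

pivots = find_pivots([outer_event, inner_event])
for span, outer_type, role, inner_type in pivots:
    print(f"pivot '{span.text}': {role} of {outer_type}, trigger of {inner_type}")
```

Keying the lookup on the trigger span keeps the check linear in the number of arguments. A real system would also need to handle several events sharing one trigger, which this sketch ignores.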