Core Concepts
YAYI-UIE is an end-to-end, chat-enhanced instruction tuning framework for universal information extraction in both Chinese and English. It improves extraction performance by training on dialogue data together with comprehensive Chinese and English information extraction datasets.
Abstract
The paper proposes YAYI-UIE, an end-to-end chat-enhanced instruction tuning framework for universal information extraction that supports both Chinese and English. The framework consists of two main steps:
Instruction Tuning for Chat:
The authors fine-tune a base language model on dialogue data, obtaining a chat model with stronger instruction-following ability and a general understanding of open-world language.
Instruction Tuning for Information Extraction:
The authors construct the most comprehensive Chinese instruction tuning benchmark for universal information extraction, combining it with existing English datasets.
The chat model is further fine-tuned on this combined dataset to obtain the universal information extraction model, which can generate structured outputs for various information extraction tasks, including named entity recognition, relation extraction, and event extraction.
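To make the second step concrete, the following is a minimal sketch of how one information extraction sample might be turned into an instruction/response pair with a structured (JSON) answer, and how a generated answer might be parsed back. The prompt wording, the helper names build_example and parse_output, and the JSON layout are assumptions made for illustration; the summary does not specify the template the authors actually use.

```python
import json

# Hypothetical task descriptions; the paper's actual prompt wording may differ.
TASK_INSTRUCTIONS = {
    "NER": "Extract all entities of the given types from the text.",
    "RE": "Extract all (subject, relation, object) triples for the given relations.",
    "EE": "Extract all events of the given types with their triggers and arguments.",
}


def build_example(task, text, schema, answer):
    """Format one IE sample as an instruction/response pair (illustrative layout)."""
    prompt = (
        f"Text: {text}\n"
        f"[{task}] {TASK_INSTRUCTIONS[task]} "
        f"Schema: {json.dumps(schema, ensure_ascii=False)}. "
        "Answer in JSON."
    )
    # ensure_ascii=False keeps Chinese characters readable in the training target.
    return {"instruction": prompt, "output": json.dumps(answer, ensure_ascii=False)}


def parse_output(generated):
    """Parse a model response; return None when it is not valid JSON."""
    try:
        return json.loads(generated)
    except json.JSONDecodeError:
        return None


example = build_example(
    "NER",
    "Marie Curie was born in Warsaw.",
    ["person", "location"],
    {"person": ["Marie Curie"], "location": ["Warsaw"]},
)
```

Using one shared output format for all three tasks is what lets a single model cover named entity recognition, relation extraction, and event extraction; only the task description and the schema change from sample to sample.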
The experimental results demonstrate that YAYI-UIE achieves state-of-the-art performance on Chinese datasets while also maintaining strong performance on English datasets, under both supervised and zero-shot settings.
Stats
The proposed framework supports both Chinese and English information extraction tasks.
The authors construct the largest and most comprehensive Chinese instruction tuning benchmark for universal information extraction, covering 16 datasets across various domains.
The combined dataset used for instruction tuning includes both Chinese and English information extraction datasets.
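As a rough illustration of how Chinese and English datasets could be merged into one instruction tuning set, the sketch below concatenates per-dataset sample lists, tags each sample with its source, and shuffles the result. The function name build_mixture, the optional per-dataset cap, and the fixed seed are assumptions of this sketch; the summary does not describe the authors' sampling strategy.

```python
import random


def build_mixture(datasets, cap=None, seed=0):
    """Merge per-dataset sample lists into one shuffled training list.

    datasets: dict mapping a dataset name to a list of sample dicts.
    cap: optional maximum number of samples taken from each dataset
         (an assumption of this sketch, not a detail from the paper).
    """
    rng = random.Random(seed)
    mixture = []
    for name, samples in datasets.items():
        picked = list(samples)
        if cap is not None and len(picked) > cap:
            # Downsample large datasets so they do not dominate the mixture.
            picked = rng.sample(picked, cap)
        # Record the source dataset alongside each sample's own fields.
        mixture.extend({"source": name, **sample} for sample in picked)
    rng.shuffle(mixture)
    return mixture
```

A cap of this kind is one simple way to keep a few very large datasets from outweighing smaller ones; other weighting schemes would work equally well in this sketch.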