The paper proposes a novel framework called "Incubator" to generate text classifiers based on user instructions. The key ideas are:
Instruction-Tuning: The authors collect instruction-data pairs from public classification datasets, augment them using in-context learning (ICL), and fine-tune a large language model (LLM) on the result to serve as the "Incubator". This allows the Incubator to generate training data for text classifiers according to user-provided instructions.
Self-Diversification: To address the potential bias and lack of diversity in the generated data, the authors introduce a self-diversification technique. It utilizes a text embedder to identify semantically diverse samples and incorporates them into the instruction-tuning process.
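The selection step behind self-diversification can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "semantically diverse" can be approximated by greedy farthest-point selection under cosine distance, and the `select_diverse` helper, the sample texts, and the hand-written three-dimensional vectors are all hypothetical stand-ins for the output of a real text embedder.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors (0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def select_diverse(embeddings, k):
    """Greedily pick up to k indices whose embeddings are far from each other.

    Hypothetical helper: starts from sample 0, then repeatedly adds the
    sample whose distance to the already selected set is largest.
    """
    if k <= 0 or not embeddings:
        return []
    selected = [0]
    # Distance from every sample to its nearest selected sample.
    min_dist = [cosine_distance(e, embeddings[0]) for e in embeddings]
    while len(selected) < min(k, len(embeddings)):
        candidates = [i for i in range(len(embeddings)) if i not in selected]
        nxt = max(candidates, key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, e in enumerate(embeddings):
            min_dist[i] = min(min_dist[i], cosine_distance(e, embeddings[nxt]))
    return selected


# Toy data (assumed): in practice the vectors would come from a text embedder.
texts = ["great movie", "wonderful film", "terrible service", "stock prices fell"]
embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],  # near-duplicate of the first sample
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

diverse_indices = select_diverse(embeddings, 3)
diverse_texts = [texts[i] for i in diverse_indices]
print(diverse_texts)
```

In this toy run the near-duplicate "wonderful film" is skipped in favor of samples that point in different directions, which is the kind of subset that could then be fed back into instruction-tuning.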
The experiments demonstrate that the Incubator can:
The authors also provide comprehensive analyses on the efficiency, robustness, and scalability of the Incubator framework.
by Letian Peng,... at arxiv.org, 04-18-2024
https://arxiv.org/pdf/2404.10877.pdf