Core Concepts
Teaching large language models to use criteria when generating feedback is essential for improving task performance and aligning model behavior with human values.
Abstract
The paper presents a framework that enables large language models (LLMs) to use comprehensive criteria when providing natural language feedback on task execution. By extracting criteria from guidelines and constructing in-context demonstrations, the framework aims to improve the quality of generated feedback across various writing tasks.
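As a rough illustration of how such a prompt might be assembled, the sketch below combines criteria and demonstrations into a single feedback request. The function, the demonstration field names, and the call_llm placeholder are illustrative assumptions, not the paper's actual implementation.

    def build_feedback_prompt(task_output, criteria, demonstrations):
        """Assemble a feedback prompt from criteria and in-context demonstrations."""
        parts = ["You are reviewing a piece of writing. Apply each criterion below."]
        if criteria:
            parts.append("Criteria:\n" + "\n".join(
                f"{i}. {c}" for i, c in enumerate(criteria, 1)))
        for demo in demonstrations:
            parts.append(f"Example text:\n{demo['text']}\n"
                         f"Example feedback:\n{demo['feedback']}")
        parts.append(f"Now give natural language feedback on:\n{task_output}")
        return "\n\n".join(parts)

    # Hypothetical usage with any chat-completion client:
    # feedback = call_llm(build_feedback_prompt(draft, criteria, demos))
    # call_llm is a placeholder for whatever model API is in use.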
Humans follow criteria when executing tasks, and the same criteria are used to assess the quality of task completion. Existing research often overlooks this aspect, motivating a general framework that teaches LLMs to use criteria effectively. The study focuses on three real-world tasks: writing paper introductions, writing Python code, and creating Reddit posts.
Teaching LLMs to use criteria is challenging: criteria are often only implicit in guidelines, and applying them correctly can require domain expertise. The proposed model-in-the-loop approach addresses this by extracting criteria from guidelines and constructing demonstrations of their use. The impact of incorporating criteria and demonstrations is assessed along four dimensions: validity, contextualization, constructiveness, and helpfulness.
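A minimal sketch of what the extraction step could look like, assuming an LLM is simply prompted to list criteria from a guideline document; the prompt wording and the call_llm placeholder are hypothetical, not taken from the paper.

    EXTRACTION_PROMPT = (
        "Below is a writing guideline. List every criterion it states or implies,\n"
        "one per line, phrased as a checkable requirement.\n\n"
        "Guideline:\n{guideline}"
    )

    def extract_criteria(guideline, call_llm):
        """Model-in-the-loop step: turn a free-form guideline into explicit criteria."""
        response = call_llm(EXTRACTION_PROMPT.format(guideline=guideline))
        # Treat each non-empty line of the model's answer as one criterion.
        return [line.strip() for line in response.splitlines() if line.strip()]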
Experimental results show that adding criteria makes feedback more constructive but can reduce its helpfulness in some cases. Providing both criteria and demonstrations yields mixed results compared to using either alone. Overall, teaching LLMs to use criteria can lead to more insightful critiques and suggestions in the generated feedback.
The study also compares several teaching strategies across multiple LLMs and writing tasks, underscoring the importance of incorporating comprehensive criteria for effective feedback generation.
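One way to picture the compared strategies is as an ablation over prompt contents, reusing build_feedback_prompt from the first sketch; the strategy names and harness below are hypothetical, not the paper's code.

    # Three teaching strategies: no criteria, criteria only, criteria plus demos.
    STRATEGIES = {
        "no_criteria":        lambda text, crit, demos: build_feedback_prompt(text, [], []),
        "criteria_only":      lambda text, crit, demos: build_feedback_prompt(text, crit, []),
        "criteria_and_demos": lambda text, crit, demos: build_feedback_prompt(text, crit, demos),
    }

    def compare_strategies(text, criteria, demos, call_llm):
        """Generate feedback under each strategy so the outputs can be compared."""
        return {name: call_llm(prompt_fn(text, criteria, demos))
                for name, prompt_fn in STRATEGIES.items()}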
Stats
We propose a new framework, LLMCRIT, for obtaining scalable oversight that takes advantage of criteria.
Experimental results suggest that providing criteria allows the model to generate feedback that contains more critiques and suggestions.
We release 83 criteria and 332 in-context demonstrations collected for three real-world writing tasks at https://github.com/yyy-Apple/LLMCrit.