Core Concepts
LLMRefine is an inference-time optimization method that iteratively refines the output of large language models, using a learned fine-grained feedback model to pinpoint defects and guide the refinement process.
Abstract
The paper proposes LLMRefine, an inference-time optimization method to improve the quality of text generated by large language models (LLMs). The key idea is to use a learned fine-grained feedback model to identify defects in the initial LLM output and guide an iterative refinement process.
The framework consists of three main components:
A generation model that produces an initial candidate output.
A feedback model that analyzes the output and provides fine-grained feedback on the location, type, and severity of errors.
A refinement model that uses the feedback to generate an improved output.
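The three components above form an iterative loop. The sketch below is a minimal, hypothetical illustration of that loop; `generate`, `get_feedback`, and `refine` are placeholder stand-ins for calls to the generation, feedback, and refinement models, not the paper's actual interface.

```python
def generate(prompt):
    # Placeholder: the generation model produces an initial candidate.
    return "initial draft for: " + prompt

def get_feedback(text):
    # Placeholder: the feedback model returns fine-grained error spans,
    # each annotated with location, type, and severity.
    return [{"span": (0, 7), "type": "fluency", "severity": "minor"}]

def refine(text, feedback):
    # Placeholder: the refinement model rewrites the text guided by feedback.
    return text.replace("initial", "revised")

def llm_refine(prompt, max_iters=3):
    """Sketch of the LLMRefine loop: generate, then repeatedly
    collect fine-grained feedback and refine until clean or out of budget."""
    output = generate(prompt)
    for _ in range(max_iters):
        feedback = get_feedback(output)
        if not feedback:  # no remaining errors: stop early
            break
        output = refine(output, feedback)
    return output
```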
The authors experiment with different local search algorithms, including always-accept, greedy uphill climbing, and simulated annealing, to balance exploration of the output space against exploitation of the feedback signal when searching for the best refined output.
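The three search strategies differ only in when a newly refined candidate replaces the current one. A minimal sketch of a simulated-annealing acceptance rule, assuming a "higher is better" quality score (e.g. derived from the feedback model's error severities), could look like this; the other two strategies fall out as limiting cases.

```python
import math
import random

def accept(new_score, old_score, temperature, rng=random.random):
    """Simulated-annealing acceptance: always take improvements;
    accept a worse candidate with probability exp(delta / T).

    Scores are assumed 'higher is better'. `rng` is injectable
    for deterministic testing."""
    if new_score >= old_score:
        return True
    if temperature <= 0:
        return False  # greedy uphill: never accept a worse candidate
    delta = new_score - old_score  # negative here
    return rng() < math.exp(delta / temperature)
```

Setting the temperature to zero recovers greedy uphill search (only improvements are kept), while unconditionally returning `True` corresponds to the always-accept strategy.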
The authors evaluate LLMRefine on three text generation tasks: machine translation, long-form question answering, and topical summarization. They show that LLMRefine consistently outperforms baseline approaches that use coarser feedback, achieving improvements of up to 1.7 MetricX points on translation tasks, 8.1 ROUGE-L on ASQA, and 2.2 ROUGE-L on topical summarization. Human evaluation also demonstrates a significant preference for the output of LLMRefine over the baseline outputs.
Examples
A meal had been waiting for an hour and a half.
A meal waited an hour and a half.
I've waited one and half hours for one meal.
Quotes
"LLMRefine, an inference-time optimization method, iteratively refines the output of large language models using a learned fine-grained feedback model to pinpoint defects and guide the refinement process."
"Our experiments show that LLMRefine results in higher-quality text compared to baseline methods using other feedback (scalar or binary score) or other search techniques."