Key Concepts
An effective approach to automatically select the optimal prompt for a given input from a finite set of synthetic candidate prompts, balancing prompt generality and specificity while eliminating the need for resource-intensive training and inference.
Abstract
The paper proposes an approach called Automatic Prompt Selection (APS) to automatically select the optimal prompt for a given input from a finite set of synthetic candidate prompts. The approach consists of three steps:
1. Prompt Database Generation:
- The training data is clustered into coherent groups.
- For each cluster, an LLM-based prompt generator is used to create a set of diverse prompts.
- The prompts from all clusters are combined into a versatile prompt database.
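The database-generation step above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the keyword-based clustering and the templated `generate_prompts` stub stand in for the embedding-based clustering and the LLM-based prompt generator described in the paper.

```python
from collections import defaultdict

# Stand-in for the clustering step: group training questions by a crude
# keyword signature (an assumption for illustration; the paper clusters
# the training data into coherent groups, e.g. via embeddings).
def cluster_inputs(questions, keywords=("sum", "difference", "ratio")):
    clusters = defaultdict(list)
    for q in questions:
        key = next((k for k in keywords if k in q.lower()), "other")
        clusters[key].append(q)
    return clusters

# Stub for the LLM-based prompt generator: the paper queries an LLM per
# cluster to obtain diverse prompts; here we template them (assumption).
def generate_prompts(cluster_name, examples, n=2):
    return [
        f"[{cluster_name} v{i}] Solve the following problem step by step:"
        for i in range(1, n + 1)
    ]

# Combine the prompts from all clusters into one prompt database.
def build_prompt_database(questions):
    database = []
    for name, examples in cluster_inputs(questions).items():
        database.extend(generate_prompts(name, examples))
    return database

questions = [
    "What is the sum of 3 and 4?",
    "Find the difference between 10 and 6.",
    "What is the ratio of 2 to 8?",
]
db = build_prompt_database(questions)
print(len(db))  # 3 clusters x 2 prompts per cluster -> 6
```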
2. Prompt Evaluator Training:
- A dataset of input-prompt-output tuples is synthesized by querying a data generation LLM with the generated prompts and training inputs.
- A prompt evaluator is trained on this dataset using a preference loss, encouraging high scores for good prompts and low scores for bad ones.
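The preference loss in the evaluator-training step can be illustrated with a pairwise logistic (Bradley-Terry style) objective: the evaluator scores a (input, prompt) pair, and the loss pushes the good prompt's score above the bad one's. The tiny linear scorer, the feature vectors, and the hand-derived SGD step are illustrative assumptions, not the paper's model.

```python
import math

# Toy evaluator: a linear scorer over hand-made features (assumption).
def score(weights, features):
    return sum(w * f for w, f in zip(weights, features))

# Preference loss: -log sigmoid(score(good) - score(bad)).
# High scores for good prompts and low scores for bad ones minimize it.
def preference_loss(weights, good_feats, bad_feats):
    margin = score(weights, good_feats) - score(weights, bad_feats)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# One manual SGD step on a single (good, bad) preference pair.
def sgd_step(weights, good_feats, bad_feats, lr=0.1):
    margin = score(weights, good_feats) - score(weights, bad_feats)
    grad_coeff = -(1.0 - 1.0 / (1.0 + math.exp(-margin)))  # dL/dmargin
    return [
        w - lr * grad_coeff * (g - b)
        for w, g, b in zip(weights, good_feats, bad_feats)
    ]

w = [0.0, 0.0]
good, bad = [1.0, 0.5], [0.2, 0.1]
before = preference_loss(w, good, bad)
for _ in range(50):
    w = sgd_step(w, good, bad)
after = preference_loss(w, good, bad)
print(after < before)  # training reduces the preference loss -> True
```

In the paper this role is played by a trained prompt evaluator over synthesized input-prompt-output tuples; the mechanics of the pairwise objective are the same.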
3. Prompt Ranking:
- During inference, given a testing input, the highest-scoring prompt from the database is selected using the prompt evaluator.
- The selected prompt is then used with a downstream LLM to compute the final output.
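The inference-time ranking step reduces to scoring every prompt in the database for the test input and taking the argmax. The keyword-overlap `toy_evaluator` below is an assumption standing in for the trained prompt evaluator; `select_prompt` shows the selection logic itself.

```python
# Toy evaluator: scores a prompt by word overlap with the test input
# (assumption; the paper uses the trained prompt evaluator here).
def toy_evaluator(test_input, prompt):
    input_words = set(test_input.lower().split())
    prompt_words = set(prompt.lower().split())
    return len(input_words & prompt_words)

# Pick the highest-scoring prompt from the database for this input.
def select_prompt(test_input, prompt_database, evaluator):
    return max(prompt_database, key=lambda p: evaluator(test_input, p))

database = [
    "Explain the arithmetic steps needed to solve this problem:",
    "List the answer options and pick the correct one:",
]
test_input = "Pick the correct option: what is 2 + 2?"
best = select_prompt(test_input, database, toy_evaluator)
```

The selected `best` prompt would then be prepended to the test input and sent to the downstream LLM to compute the final output.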
The proposed method balances prompt generality and specificity and eliminates the need for resource-intensive training and inference, demonstrating competitive performance on the zero-shot question-answering datasets GSM8K, MultiArith, and AQuA.
Statistics
The dataset GSM8K consists of 7,473 training and 1,319 test problems that require between 2 and 8 steps to solve, involving basic arithmetic operations.
The dataset MultiArith has 420 training and 180 test problems that demand the application of multiple arithmetic operations and logical reasoning.
The dataset AQuA contains more than 100,000 algebraic word problems, with each sample having a question, options, rationale, and the correct option.
Quotes
"Large Language Models (LLMs) have emerged as a cornerstone in natural language processing (NLP), propelled by advancements in scaling techniques and attention mechanisms."
"Effective prompting reduces the necessity for resource-intensive fine-tuning on task-specific data, offering a cost-effective and time-saving alternative."
"One limitation of current approaches is that they struggle to strike a balance between prompt generality and specificity, either relying on a single prompt for all inputs (lacking flexibility) or generating a distinctive prompt per input (expanding the search space and potentially destabilizing the system)."