Core Concept
SAMMO is a general framework for efficiently optimizing the structure and content of metaprompt programs to improve their performance on downstream tasks.
Summary
The paper introduces SAMMO, a framework for optimizing the performance of metaprompt programs. Metaprompts are complex, structured objects that combine static instructions with dynamic input data to generate desired outputs from large language models (LLMs).
The key insights are:
- Metaprompts can be represented as dynamic function graphs, where each node computes a new value based on its children, input data, and node-specific parameters.
- SAMMO employs a rich set of mutation operators that can modify the structure, content, and parameters of these metaprompt graphs.
- SAMMO uses search algorithms like beam search and evolutionary strategies to efficiently explore the space of possible metaprompt programs and find optimized versions.
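The graph representation and mutation operators above can be sketched in a few lines. This is a hypothetical illustration, not SAMMO's actual API: `Node`, `drop_random_child`, and `set_param` are made-up names, but they mirror the idea that each node computes its value from its children, the input data, and node-specific parameters, and that mutations can target structure, content, or parameters independently.

```python
import random
from dataclasses import dataclass, field

# Hypothetical sketch of a metaprompt as a dynamic function graph;
# the class and operator names are illustrative, not SAMMO's real API.
@dataclass
class Node:
    text: str = ""                              # template with {placeholders}
    params: dict = field(default_factory=dict)  # node-specific parameters
    children: list = field(default_factory=list)

    def render(self, inputs: dict) -> str:
        # Each node computes its value from its parameters, the input
        # data, and the rendered values of its children.
        values = {**inputs, **self.params}
        parts = [self.text.format(**values)] if self.text else []
        parts += [child.render(inputs) for child in self.children]
        return "\n".join(p for p in parts if p)

# Structural mutation: remove one child subtree.
def drop_random_child(node: Node, rng: random.Random) -> Node:
    if not node.children:
        return node
    kept = list(node.children)
    kept.pop(rng.randrange(len(kept)))
    return Node(node.text, dict(node.params), kept)

# Parameter mutation: rewrite one node-specific parameter.
def set_param(node: Node, key: str, value) -> Node:
    return Node(node.text, {**node.params, key: value}, list(node.children))

prompt = Node(
    text="Answer in {style} format.",
    params={"style": "JSON"},
    children=[Node(text="Task: {task}"), Node(text="Be concise.")],
)
print(prompt.render({"task": "classify sentiment"}))
# Answer in JSON format.
# Task: classify sentiment
# Be concise.
```

Because mutations return new `Node` objects rather than editing in place, a search procedure can keep many candidate graphs alive at once and score each one independently.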
The paper demonstrates the effectiveness of SAMMO in three use cases:
- Instruction tuning: SAMMO outperforms prior methods in improving the accuracy of task instructions across different LLMs.
- Retrieval-augmented generation: SAMMO yields substantial gains in semantic parsing accuracy with only a few dozen candidate evaluations.
- Prompt compression: SAMMO achieves high compression rates while maintaining accuracy, outperforming baselines.
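The search side can be illustrated with a toy beam search over plain prompt strings. Everything here is a placeholder: `mutate` stands in for SAMMO's graph mutation operators (the word-deletion branch loosely mimics prompt compression), and `score` stands in for evaluating a candidate on a small validation set with the target LLM; each `score` call corresponds to one candidate evaluation.

```python
import random

# Placeholder mutation: delete a word (compression-style) or append one.
def mutate(prompt: str, rng: random.Random) -> str:
    words = prompt.split()
    if len(words) > 1 and rng.random() < 0.5:
        words.pop(rng.randrange(len(words)))
    else:
        words.append(rng.choice(["precisely", "briefly", "step-by-step"]))
    return " ".join(words)

# Toy beam search over prompt candidates, standing in for SAMMO's
# search over metaprompt graphs.
def beam_search(seed, score, width=4, depth=3, branch=4, rng=None):
    rng = rng or random.Random(0)
    beam = [seed]
    for _ in range(depth):
        candidates = set(beam)          # keep current beam as candidates
        for p in beam:
            for _ in range(branch):
                candidates.add(mutate(p, rng))
        # Each score() call models one evaluation on the downstream task.
        beam = sorted(candidates, key=score, reverse=True)[:width]
    return beam[0]

# Placeholder objective: keep the keyword "classify", prefer short prompts.
def score(p: str) -> float:
    return ("classify" in p) * 10 - len(p.split())

seed = "You must carefully classify the sentiment of the input text"
best = beam_search(seed, score)
```

Because the current beam is always re-entered into the candidate pool, the best score is monotonically non-decreasing over iterations, so even a small budget of a few dozen evaluations cannot make the chosen prompt worse than the seed under the objective.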
The results show that SAMMO is a powerful, general framework for optimizing complex metaprompt programs, and that prompt optimization must be done separately for each LLM, since performance is only weakly correlated across models.
Quotes
"SAMMO represents metaprompts as dynamic function graphs, and employs a set of mutation operators to alter the structure and content of metaprompts."
"SAMMO yields substantial gains in semantic parsing accuracy with only a few dozen candidate evaluations."
"Prompt optimization needs to be done separately for each LLM due to weak correlation in performance across models."