The paper introduces PROMST, a framework for optimizing prompts for large language models (LLMs) in complex multi-step tasks. The key insights are:
Prompt optimization for multi-step tasks is challenging due to the complexity of prompts, difficulty in evaluating individual actions, and varied human preferences.
PROMST addresses these challenges by incorporating human-designed feedback rules, which automatically surface direct suggestions about likely errors during task execution, and by training a learned score-prediction model that estimates a candidate prompt's quality so that only promising candidates undergo costly evaluation (a sketch of this loop follows).
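To make the loop concrete, here is a minimal, hypothetical sketch of a PROMST-style optimization round. The names `propose_candidates`, `apply_feedback_rules`, `evaluate`, and `score_model` are illustrative placeholders rather than the paper's actual API, and the round budget and survivor count are invented parameters.

```python
# Hypothetical sketch of a PROMST-style prompt optimization loop.
# All callables and parameters are illustrative assumptions, not the paper's API.
from typing import Callable, List, Tuple

def optimize_prompt(
    seed_prompt: str,
    propose_candidates: Callable[[str, str], List[str]],  # LLM-based prompt rewriter
    apply_feedback_rules: Callable[[str], str],           # human-written error feedback rules
    evaluate: Callable[[str], float],                     # expensive task-score evaluation
    score_model,                                          # learned score predictor (see below)
    rounds: int = 5,
    keep_top: int = 3,
) -> Tuple[str, float]:
    best_prompt, best_score = seed_prompt, evaluate(seed_prompt)
    for _ in range(rounds):
        # Summarize errors from the current prompt's rollouts via the feedback rules.
        feedback = apply_feedback_rules(best_prompt)
        # Propose new prompt candidates conditioned on that error feedback.
        candidates = propose_candidates(best_prompt, feedback)
        # Cheaply pre-filter candidates with the learned score predictor,
        # then run the expensive task evaluation only on the survivors.
        survivors = sorted(candidates, key=score_model.predict, reverse=True)[:keep_top]
        for cand in survivors:
            score = evaluate(cand)
            score_model.update(cand, score)  # keep the predictor in sync with real scores
            if score > best_score:
                best_prompt, best_score = cand, score
    return best_prompt, best_score
```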
Experiments on 11 diverse multi-step tasks show that PROMST outperforms several state-of-the-art prompt optimization methods by 10.6%-29.3% on average across different LLMs.
The paper also demonstrates that the optimized prompts can generalize to different LLM types, though each LLM performs best with prompts optimized for it specifically.
The learned score prediction model is shown to be effective at filtering out low-performing prompt candidates, improving the overall optimization efficiency.
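As an illustration of the filtering idea, below is one plausible way such a predictor could be implemented, as a ridge regression over prompt embeddings. The class name `PromptScorePredictor`, the `embed` function, and the model choice are assumptions for this sketch, not the paper's actual design; it matches the `score_model` interface used in the loop sketch above.

```python
# Illustrative score predictor for pre-filtering prompt candidates.
# The embedding function and regression model are assumptions, not the paper's design.
import numpy as np
from sklearn.linear_model import Ridge

class PromptScorePredictor:
    def __init__(self, embed):
        # embed: function mapping a prompt string to a 1-D feature vector (np.ndarray)
        self.embed = embed
        self.model = Ridge(alpha=1.0)
        self.X, self.y = [], []

    def update(self, prompt: str, true_score: float) -> None:
        # Record the newly observed (prompt, score) pair and refit on all data so far.
        self.X.append(self.embed(prompt))
        self.y.append(true_score)
        self.model.fit(np.vstack(self.X), np.array(self.y))

    def predict(self, prompt: str) -> float:
        if not self.y:  # no observations yet: return a neutral prior score
            return 0.0
        return float(self.model.predict(self.embed(prompt)[None, :])[0])

    def filter(self, candidates: list[str], keep_top: int) -> list[str]:
        # Keep only the candidates the model expects to score highest.
        return sorted(candidates, key=self.predict, reverse=True)[:keep_top]
```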
Ablation studies confirm the importance of both the human feedback rules and the score prediction model in PROMST's superior performance.
The paper also explores how modifying the task score function can help align the optimized prompts with human preferences.
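For instance, a modified score function might blend raw task success with an efficiency term so that prompts yielding shorter successful trajectories are preferred; the weights and metric names below are hypothetical, not the paper's exact formulation.

```python
# Hedged example of re-weighting a task score function to encode a human
# preference (trading off task success against execution length).
def preference_score(success_rate: float, num_steps: int,
                     max_steps: int = 50, w_success: float = 0.8,
                     w_efficiency: float = 0.2) -> float:
    """Blend task success with an efficiency bonus for shorter trajectories."""
    efficiency = 1.0 - min(num_steps, max_steps) / max_steps
    return w_success * success_rate + w_efficiency * efficiency
```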
Key insights distilled from the source paper by Yongchao Che... (arxiv.org, 04-18-2024): https://arxiv.org/pdf/2402.08702.pdf