Key Concept
Prompt-OIRL improves arithmetic reasoning in Large Language Models by optimizing the prompt for each individual query using offline inverse reinforcement learning (Offline Inverse RL).
Statistics
"The optimal prompt is chosen without LLM interaction, ensuring only the chosen prompt undergoes inference."
"Prompt-OIRL utilizes the offline reward model to pinpoint the most suitable prompt."
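The selection step described in the statements above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `select_prompt`, `answer`, `toy_reward_model`, and the candidate prompts are all hypothetical names introduced here, and the reward model is a hand-written stand-in for a model that would in practice be learned offline. The point it illustrates is that candidates are scored without calling the LLM, and only the highest-scoring prompt is sent to inference.

```python
from typing import Callable, Sequence

# Hypothetical signature: a reward model maps (query, prompt) to a score.
RewardModel = Callable[[str, str], float]


def select_prompt(query: str, candidates: Sequence[str], reward_model: RewardModel) -> str:
    """Return the candidate prompt the reward model scores highest for this query."""
    if not candidates:
        raise ValueError("need at least one candidate prompt")
    # Scoring uses only the offline reward model; the LLM is never called here.
    return max(candidates, key=lambda prompt: reward_model(query, prompt))


def answer(query: str, candidates: Sequence[str], reward_model: RewardModel,
           llm: Callable[[str], str]) -> str:
    """Run inference once, using only the selected prompt."""
    prompt = select_prompt(query, candidates, reward_model)
    return llm(f"{query}\n{prompt}")


def toy_reward_model(query: str, prompt: str) -> float:
    """Stand-in heuristic (an assumption for this sketch, not a learned model)."""
    has_digits = any(ch.isdigit() for ch in query)
    if has_digits:
        # Pretend arithmetic-looking queries benefit from step-by-step prompts.
        return 1.0 if "step by step" in prompt else 0.0
    return 0.5 if "step" not in prompt else 0.0


CANDIDATES = ["Let's think step by step.", "Answer directly."]

if __name__ == "__main__":
    print(select_prompt("What is 17 * 24?", CANDIDATES, toy_reward_model))
```

Because the choice is made per query, a different query can yield a different prompt from the same candidate pool, while the LLM is still invoked only once per query.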
Quotes
"Our method optimizes prompt during inference on a query-dependent level effectively and cost-efficiently."