Core Concepts
AutoMix balances computational cost and performance by routing a query to a larger language model only when self-verification suggests the smaller model's answer is unreliable.
Abstract
Large language models (LLMs) come in many sizes and configurations, but choosing among them to balance computational cost and performance is challenging.
AutoMix uses self-verification and a meta-verifier to decide when a query should go to a larger model.
AutoMix has three steps: solution generation, self-verification, and selective routing.
Self-verification is framed as an entailment problem; the meta-verifier refines the accuracy of the verification.
Contributions include introducing AutoMix, exploring context-grounded entailment, proposing a POMDP-based meta-verifier, and introducing the incremental benefit per cost (IBC) metric.
Experiments across five datasets show up to an 86% improvement in incremental benefit per cost over baselines.
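The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `automix_route`, the callables `small_model`, `large_model`, and `verifier`, and the parameters `n_samples` and `threshold` are all hypothetical. A fixed confidence threshold stands in for the POMDP-based meta-verifier, which this summary does not describe in enough detail to reproduce.

```python
from typing import Callable, Tuple


def automix_route(
    query: str,
    context: str,
    small_model: Callable[[str, str], str],
    large_model: Callable[[str, str], str],
    verifier: Callable[[str, str, str], bool],
    n_samples: int = 8,
    threshold: float = 0.5,
) -> Tuple[str, str, float]:
    """Return (answer, model_used, verifier_confidence).

    All names and defaults are illustrative assumptions.
    """
    # Step 1: solution generation with the smaller, cheaper model.
    draft = small_model(context, query)

    # Step 2: self-verification. The verifier is asked several times
    # whether the draft is supported by (entailed by) the context;
    # the fraction of "yes" votes serves as a confidence estimate.
    votes = [verifier(context, query, draft) for _ in range(n_samples)]
    confidence = sum(votes) / n_samples

    # Step 3: selective routing. A plain threshold is used here as a
    # simple stand-in for the meta-verifier's decision.
    if confidence >= threshold:
        return draft, "small", confidence
    return large_model(context, query), "large", confidence
```

In practice `verifier` would wrap a sampled LLM call, so repeated calls can disagree; the threshold then controls how often the larger model is invoked and therefore the cost of the overall system.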
Stats
Our experiments using LLAMA2-13/GPT-4 demonstrate that AutoMix surpasses established baselines, improving the incremental benefit per cost by up to 86%.
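To make the "incremental benefit per cost" figure concrete, the sketch below assumes IBC is the performance gained over the small model divided by the extra cost incurred, and that the reported percentage is the relative lift over a baseline that always uses the large model. This summary does not state the formula, so both definitions are assumptions, and every number in the usage example is invented for illustration.

```python
def ibc(perf: float, cost: float, perf_small: float, cost_small: float) -> float:
    """Performance gained per unit of extra cost, relative to the small model.

    Assumed definition; the summary names the metric but not its formula.
    """
    extra_cost = cost - cost_small
    if extra_cost == 0:
        raise ValueError("cost must differ from the small model's cost")
    return (perf - perf_small) / extra_cost


def ibc_lift_percent(ibc_method: float, ibc_baseline: float) -> float:
    """Relative improvement of a method's IBC over a baseline IBC, in percent."""
    return (ibc_method - ibc_baseline) / ibc_baseline * 100.0


# Hypothetical numbers, chosen only to show the arithmetic.
PERF_SMALL, COST_SMALL = 50.0, 1.0    # small model alone
PERF_LARGE, COST_LARGE = 75.0, 51.0   # large model alone
PERF_MIX, COST_MIX = 60.0, 11.0       # a routing method in between

ibc_baseline = ibc(PERF_LARGE, COST_LARGE, PERF_SMALL, COST_SMALL)  # 25 / 50
ibc_mix = ibc(PERF_MIX, COST_MIX, PERF_SMALL, COST_SMALL)           # 10 / 10
lift = ibc_lift_percent(ibc_mix, ibc_baseline)
```

Under these made-up numbers the routing method gains fewer points than the large model but pays far less for each one, which is the kind of trade-off the metric is meant to capture.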
Quotes
"Large language models are now available from cloud API providers in various sizes and configurations."
"Our experiments using LLAMA2-13/GPT-4 demonstrate that AutoMix surpasses established baselines."