This technical report presents a comprehensive evaluation of 310 large language models (LLMs) fine-tuned using the Low Rank Adaptation (LoRA) method. Key findings include:
LoRA fine-tuning provides a consistent and significant boost in performance across 10 base models and 31 tasks. On average, fine-tuned models outperform their base counterparts by 38.7 points and GPT-4 by 9.5 points.
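To make the method concrete, here is a minimal numpy sketch of the LoRA idea itself (an illustration of the technique, not the report's training code): instead of updating a full weight matrix, LoRA learns a low-rank update `B @ A` with rank r much smaller than the model dimension, so only a small fraction of parameters are trainable.

```python
import numpy as np

# Minimal sketch of the LoRA update (illustrative; dimensions are arbitrary).
rng = np.random.default_rng(0)
d, r, alpha = 512, 8, 16

W = rng.standard_normal((d, d))          # frozen base weight
A = rng.standard_normal((r, d)) * 0.01   # trainable rank-r factor
B = np.zeros((d, r))                     # trainable, initialized to zero

# Effective weight at inference: base plus scaled low-rank correction.
W_adapted = W + (alpha / r) * (B @ A)

full_params = W.size
lora_params = A.size + B.size
print(f"trainable params: {lora_params} vs {full_params} "
      f"({100 * lora_params / full_params:.2f}%)")
```

Because B starts at zero, the adapted model is initially identical to the base model, and training only ever touches A and B; here that is about 3% of the parameters of a single weight matrix.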
Mistral-7B and Zephyr-7B emerge as the most effective base models for LoRA fine-tuning, with the fine-tuned Mistral-7B model achieving the best performance on the largest number of tasks.
While instruction-tuned models initially outperform auto-complete models, fine-tuning narrows this gap, with the best fine-tuned models from both categories achieving comparable performance.
Task complexity heuristics like input/output length, compressibility, and content diversity can reasonably predict the potential gains from LoRA fine-tuning, with linear models achieving low root mean squared errors.
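The prediction setup described above can be sketched with ordinary least squares. The feature names below mirror the report's heuristics, but the data is synthetic placeholder data for illustration only, not the paper's measurements:

```python
import numpy as np

rng = np.random.default_rng(42)
n_tasks = 31
# Hypothetical features per task:
# input length, output length, compressibility, content diversity
X = rng.uniform(0, 1, size=(n_tasks, 4))
true_w = np.array([-10.0, -5.0, 20.0, 15.0])          # made-up weights
y = X @ true_w + 30.0 + rng.normal(0, 2.0, n_tasks)   # lift in points

# Ordinary least squares with an intercept column.
X1 = np.hstack([X, np.ones((n_tasks, 1))])
w, *_ = np.linalg.lstsq(X1, y, rcond=None)
rmse = np.sqrt(np.mean((X1 @ w - y) ** 2))
print(f"in-sample RMSE: {rmse:.2f} points")
```

The point of the sketch is only the shape of the analysis: a handful of cheap, pre-training heuristics per task, regressed against observed fine-tuning lift, evaluated by RMSE.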
The authors also introduce LoRAX, an open-source system for efficiently serving multiple LoRA-adapted LLMs on a single GPU, and demonstrate its capabilities through the LoRA Land web application.
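The memory argument behind serving many adapters on one GPU can be illustrated in a few lines. This is a hedged sketch of the general idea, not LoRAX's implementation: all adapters share a single frozen base weight, and each request only adds a small per-adapter low-rank correction.

```python
import numpy as np

rng = np.random.default_rng(1)
d, r, alpha = 256, 4, 8
W_base = rng.standard_normal((d, d))  # loaded once, shared by all adapters

def make_adapter(seed):
    # Hypothetical adapter factors B (d x r) and A (r x d).
    g = np.random.default_rng(seed)
    return g.standard_normal((d, r)) * 0.01, g.standard_normal((r, d)) * 0.01

adapters = {f"task-{i}": make_adapter(i) for i in range(310)}

def forward(x, adapter_id):
    # Shared base matmul plus this request's low-rank correction.
    B, A = adapters[adapter_id]
    return x @ W_base.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.standard_normal(d)
y1 = forward(x, "task-0")
y2 = forward(x, "task-1")

adapter_params = 310 * 2 * d * r   # all 310 adapters together
full_copies = 310 * d * d          # 310 full copies of the weight
print(f"adapter storage: {adapter_params} vs {full_copies} params")
```

Routing a request just means selecting a different (B, A) pair; the dominant cost, the base weights, is paid once no matter how many fine-tuned variants are served.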