Core Concepts
Layerwise Importance Sampled AdamW (LISA) is a memory-efficient alternative to LoRA for large language model fine-tuning.
Abstract
The paper introduces Layerwise Importance Sampled AdamW (LISA), a memory-efficient method for fine-tuning large language models. It addresses the high memory consumption of large-scale training and outperforms LoRA across a range of settings. The article covers LISA's motivation, method, experimental results, ablation studies, and theoretical properties.
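The core idea of the method is layerwise sampling: only a small, periodically re-sampled subset of transformer layers is trained at any time, so optimizer state is kept for just those layers. Below is a minimal sketch of this idea, assuming a Hugging Face-style causal LM whose decoder blocks are exposed at model.model.layers; the attribute path and the num_active, switch_every, and lr values are illustrative assumptions, not settings from the paper.

```python
import random
import torch


def sample_active_layers(model, num_active=2):
    """Freeze all decoder blocks, then unfreeze a randomly sampled subset.

    Embedding and LM-head parameters stay trainable throughout, mirroring
    the layerwise importance-sampling idea behind LISA.
    """
    blocks = model.model.layers  # assumed attribute path for the decoder blocks
    for block in blocks:
        for p in block.parameters():
            p.requires_grad = False
    for idx in random.sample(range(len(blocks)), num_active):
        for p in blocks[idx].parameters():
            p.requires_grad = True


def train(model, dataloader, num_steps, switch_every=50, lr=1e-5):
    optimizer = None
    for step, batch in zip(range(num_steps), dataloader):
        if step % switch_every == 0:
            sample_active_layers(model)
            # Rebuild AdamW over only the trainable parameters so optimizer
            # state exists just for the currently active layers, which is
            # where the memory saving comes from.
            trainable = [p for p in model.parameters() if p.requires_grad]
            optimizer = torch.optim.AdamW(trainable, lr=lr)
        loss = model(**batch).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```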
Stats
LISA surpasses LoRA by 11%-37% in MT-Bench scores.
LISA achieves performance on par with or better than LoRA on large models.
LISA provides an almost 2.9× speedup over full-parameter tuning.
Quotes
"LISA outperforms both LoRA and full parameter training in a wide range of settings with memory costs as low as LoRA."
"LISA consistently outperforms LoRA by over 11%-37% in terms of MT-Bench scores."