Core Concepts
DoRA improves parameter-efficient fine-tuning by decomposing the pre-trained weight into magnitude and direction components and fine-tuning both, consistently outperforming LoRA without any additional inference overhead.
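In equation form (following the formulation in the paper, with pre-trained weight W_0, trainable magnitude vector m, low-rank LoRA factors B and A, and ||·||_c the column-wise vector norm; LoRA's scaling factor is omitted for brevity):

```latex
% Decomposition of the pre-trained weight into a magnitude vector m
% and a unit-norm direction V (m is initialized to the norm of W_0):
W_0 = m \,\frac{V}{\lVert V \rVert_c}, \qquad m = \lVert W_0 \rVert_c,\; V = W_0

% DoRA trains m directly and updates the direction with a LoRA delta BA,
% so the merged weight is
W' = m \,\frac{W_0 + BA}{\lVert W_0 + BA \rVert_c}
```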
Abstract
The paper introduces a novel weight decomposition analysis and proposes DoRA, which surpasses LoRA by improving learning capacity and training stability. It shows superior performance on tasks such as commonsense reasoning and image/video-text understanding. The code and models will be publicly released.
Stats
DoRA consistently outperforms LoRA on various tasks, such as commonsense reasoning, visual instruction tuning, and image/video-text understanding.
DoRA improves accuracy by 3.4% on LLaMA-7B compared to LoRA.
DoRA reduces training memory usage by approximately 24.4% in LLaMA fine-tuning.
Quotes
Weight-Decomposed Low-Rank Adaptation (DoRA) enhances both the learning capacity and training stability of LoRA while avoiding any additional inference overhead.
Our analysis reveals that LoRA and FT exhibit markedly distinct patterns of updates, leading us to surmise that these variations mirror the learning capability of each method.
Inspired by our findings, we propose Weight-Decomposed Low-Rank Adaptation (DoRA), which begins by decomposing the pre-trained weight into its magnitude and directional components.
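To make the decomposition concrete, below is a minimal, illustrative PyTorch sketch of a DoRA-style linear layer. It is not the authors' code: the class name DoRALinear, the default rank, and the per-output-row normalization (mirroring common reference implementations) are assumptions, and LoRA's alpha/r scaling and dropout are omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DoRALinear(nn.Module):
    """Illustrative DoRA-style linear layer (hypothetical sketch, not the official code).

    The frozen pre-trained weight W0 is split into a trainable magnitude vector m
    and a directional component; the direction is updated with a LoRA-style
    low-rank delta B @ A, then renormalized and rescaled by m:
        W' = m * (W0 + B @ A) / ||W0 + B @ A||
    """

    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        d_out, d_in = base.weight.shape
        # Frozen pre-trained weight (and bias, if any).
        self.weight = nn.Parameter(base.weight.detach().clone(), requires_grad=False)
        self.bias = base.bias
        # LoRA factors: A gets a small random init, B starts at zero so the
        # initial merged weight equals the pre-trained weight.
        self.lora_A = nn.Parameter(torch.randn(rank, d_in) / rank**0.5)
        self.lora_B = nn.Parameter(torch.zeros(d_out, rank))
        # Trainable magnitude, initialized to the norm of each weight vector of W0
        # (the column-wise magnitude m in the paper's notation).
        self.magnitude = nn.Parameter(self.weight.norm(p=2, dim=1, keepdim=True))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Direction: pre-trained weight plus the low-rank update.
        directed = self.weight + self.lora_B @ self.lora_A
        # Normalize each weight vector to unit norm, then rescale by the learned magnitude.
        merged = self.magnitude * directed / directed.norm(p=2, dim=1, keepdim=True)
        return F.linear(x, merged, self.bias)
```

Under these assumptions, one would wrap an existing layer as DoRALinear(some_linear, rank=16) and train only lora_A, lora_B, and magnitude, keeping the pre-trained weight frozen.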