Core Concepts
Despite initial skepticism, fine-tuning proves to be a viable method for model editing when it optimizes conditional likelihood and augments the training data with random paraphrases and facts.
Abstract
In model editing, fine-tuning is often dismissed as less effective than specialized methods. This study challenges that notion with a modified fine-tuning approach: optimizing conditional likelihood (computing the loss only on the target tokens rather than the full sequence) and augmenting the training data with random paraphrases and unrelated facts. With these changes, pure fine-tuning matches or even outperforms specialized editors in certain scenarios. Experiments on the ZsRE and COUNTERFACT datasets show consistent improvements in edit scores, and the authors emphasize that fine-tuning is simpler and more adaptable than specialized editing methods such as MEND, ROME, and MEMIT. With careful modifications and strategic data augmentation, fine-tuning emerges as a competitive solution for model editing.
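The conditional-likelihood objective described above can be illustrated with a minimal sketch. The idea is to mask out the prompt tokens so the loss is computed only over the target (edited-fact) tokens. The function below is a hypothetical toy implementation operating on raw per-position logits, not the authors' actual code:

```python
import math

def conditional_nll(logits, labels, prompt_len):
    """Mean negative log-likelihood over target tokens only.

    logits: list of per-position logit rows (one row per token position).
    labels: gold token index at each position.
    prompt_len: number of leading prompt positions to mask out, so the
    objective is the conditional likelihood p(target | prompt).
    """
    total, count = 0.0, 0
    for pos, (row, y) in enumerate(zip(logits, labels)):
        if pos < prompt_len:
            continue  # skip prompt positions: they contribute no loss
        # numerically stable log-softmax normalizer
        m = max(row)
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_z - row[y]
        count += 1
    return total / count
```

In a real fine-tuning loop this masking is typically done by setting prompt-token labels to an ignore index before computing cross-entropy; the sketch makes the same computation explicit.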
Stats
Fine-tuning can match or outperform specialized editors in mass-editing.
Training takes around 2-3 hours on 8 GPUs.
Fine-tuning uses 360,000 facts in total.
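The data-augmentation idea behind these numbers can be sketched as follows: each edit is paired with random paraphrases of its prompt (to encourage generalization) and with random unrelated facts that keep their original answers (to preserve locality). The function and its argument names are illustrative assumptions, not the authors' implementation:

```python
import random

def augment_edits(edits, paraphrases, unrelated_facts, k_para=2, k_fact=2, seed=0):
    """Build an augmented fine-tuning set from (prompt, target) edit pairs.

    paraphrases: dict mapping an edit prompt to alternative phrasings (assumed).
    unrelated_facts: (prompt, answer) pairs untouched by any edit (assumed).
    """
    rng = random.Random(seed)
    examples = []
    for prompt, target in edits:
        examples.append((prompt, target))
        # paraphrases of the edited prompt keep the NEW target (generalization)
        paras = paraphrases.get(prompt, [])
        for p in rng.sample(paras, min(k_para, len(paras))):
            examples.append((p, target))
        # unrelated facts keep their ORIGINAL answers (locality)
        for fact in rng.sample(unrelated_facts, min(k_fact, len(unrelated_facts))):
            examples.append(fact)
    return examples
```

Mixing both kinds of examples into the same fine-tuning batch is what lets plain fine-tuning trade off edit success against preserving unrelated knowledge.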