Towards Secure Tuning: Mitigating Security Risks Arising from Benign Instruction Fine-Tuning
Authors: Yanrui Du, Sendong Zhao, Jiawei Cao, Ming Ma, Danyang Zhao, Fenglei Fan, Ting Liu, and Bing Qin
Institutions: Harbin Institute of Technology, Chinese University of Hong Kong