Proxy-RLHF: Decoupling Generation and Alignment in Large Language Models with Proxy
The paper introduces Proxy-RLHF, a method that decouples generation from alignment in large language models, with the goal of reducing the computational cost of alignment while keeping model outputs aligned with human values.