SALMON: A Novel Approach to Align Large Language Models with Minimal Human Supervision
SALMON introduces an instructable reward model that can generate reward scores based on arbitrary human-defined principles, enabling the alignment of large language models with minimal human supervision.
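To make the idea of a principle-conditioned reward concrete, the following is a minimal sketch and not SALMON's actual implementation. It assumes the reward function takes the principles as an explicit input alongside the prompt and response, so that changing the principles changes which response is preferred without retraining anything. The names `toy_reward` and `pick_best` are hypothetical, and the keyword-based scorer is only a placeholder for a trained reward model.

```python
from typing import Callable, Sequence

# Hypothetical signature: (principles, prompt, response) -> scalar reward.
RewardFn = Callable[[Sequence[str], str, str], float]


def toy_reward(principles: Sequence[str], prompt: str, response: str) -> float:
    """Placeholder scorer standing in for a trained reward model.

    It only reacts to two keywords in the principles and counts words,
    which is enough to show that the score depends on the principles.
    """
    n_words = len(response.split())
    score = 0.0
    for principle in principles:
        text = principle.lower()
        if "concise" in text:
            score -= n_words  # shorter responses score higher
        if "detailed" in text:
            score += n_words  # longer responses score higher
    return score


def pick_best(
    reward_fn: RewardFn,
    principles: Sequence[str],
    prompt: str,
    candidates: Sequence[str],
) -> str:
    """Return the candidate response with the highest reward under the principles."""
    return max(candidates, key=lambda r: reward_fn(principles, prompt, r))


candidates = [
    "Paris.",
    "The capital of France is Paris, a city on the Seine.",
]
question = "What is the capital of France?"

# Same scorer, same candidates; only the principles differ.
print(pick_best(toy_reward, ["Be concise."], question, candidates))
print(pick_best(toy_reward, ["Be detailed."], question, candidates))
```

In this sketch the two calls select different responses because the principles are part of the scorer's input, which is the property the summary attributes to an instructable reward model.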