ALARM introduces a framework for aligning large language models with human preferences through hierarchical rewards modeling.