Core Concepts
Uni-RLHF is a comprehensive system for reinforcement learning from diverse human feedback, designed to address the lack of standardized annotation platforms and benchmarks.
Statistics
Uni-RLHF comprises three packages: a universal multi-feedback annotation platform, large-scale crowdsourced feedback datasets, and modular offline RLHF baselines.