RLIF is a method that combines reinforcement learning with interactive imitation learning, improving performance without requiring ground-truth rewards.