Key Concepts
Generalized Occupancy Models (GOMs) enable quick adaptation to new tasks by modeling all possible outcomes in a reward-agnostic and policy-agnostic manner, which avoids the compounding errors of model-based RL.
Summary
Agents must be generalists, adapting to varying tasks.
Model-based RL suffers from compounding errors.
GOMs model all possible outcomes, enabling transferability.
GOMs avoid the challenges of compounding error and can adapt to arbitrary rewards.
GOMs show superior transfer performance compared to model-based RL (MBRL), successor features, and goal-conditioned RL.
GOMs can solve non-goal-conditioned tasks, including tasks specified through human preferences.
GOMs demonstrate the ability to perform trajectory stitching, combining segments of suboptimal trajectories into better behavior.
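The idea of reusing reward-agnostic outcomes across tasks can be illustrated with a small sketch. It is not the method from the paper: it assumes, purely for illustration, that an outcome is summarized as a discounted sum of per-step feature vectors and that each task's reward is linear in those features. The function names `discounted_outcome` and `best_outcome`, the toy trajectories, and the weight vectors are all hypothetical.

```python
# Illustrative sketch only. Assumptions (not from the source): outcomes are
# discounted sums of feature vectors, and rewards are linear in the features.

def discounted_outcome(features, gamma):
    """Return the discounted sum of per-step feature vectors for one trajectory."""
    dim = len(features[0])
    outcome = [0.0] * dim
    for t, phi in enumerate(features):
        for i in range(dim):
            outcome[i] += (gamma ** t) * phi[i]
    return outcome


def best_outcome(outcomes, reward_weights):
    """Return the index of the outcome with the highest value under a linear reward."""
    def value(outcome):
        return sum(w * o for w, o in zip(reward_weights, outcome))
    return max(range(len(outcomes)), key=lambda k: value(outcomes[k]))


# Two toy trajectories with 2-D features (hypothetical data).
traj_a = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
traj_b = [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
gamma = 0.5

# The outcomes are computed once, without reference to any reward.
outcomes = [discounted_outcome(traj_a, gamma), discounted_outcome(traj_b, gamma)]

# Two different tasks reuse the same outcomes; only the reward weights change.
choice_task_1 = best_outcome(outcomes, [1.0, -1.0])
choice_task_2 = best_outcome(outcomes, [-1.0, 1.0])
```

In this toy setting, switching tasks only changes the weight vector passed to `best_outcome`. The outcomes themselves are never recomputed, and no one-step model is rolled forward, so there is no rollout along which prediction errors could accumulate.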
Statistics
GOMs build on the idea of generalizing models to enable rapid adaptation to new tasks.
Quotes
"Generalized Occupancy Models retain the benefits of multi-reward transfer across all possible tasks, without accruing compounding error."