Core Concepts
Decision-aware model learning is crucial for effective reinforcement learning algorithms, with latent models playing a vital role in achieving good performance.
Stats
"The MuZero loss function is biased in stochastic environments and has practical consequences."
"IterVAML leads to an unbiased solution in the infinite sample limit, even with deterministic world models."
"MuZero's joint model- and value function learning algorithm leads to a biased solution in stochastic environments."
Quotes
"The idea of decision-aware model learning has gained prominence in model-based reinforcement learning."
"We showcase design choices that enable well-performing algorithms."