Core Concepts
Carefully scaling up model and data size can lead to significant improvements in imitation learning performance for single-agent games, as demonstrated in Atari games and NetHack.
Abstract
In this study, the authors explore the impact of scaling up model and data size on imitation learning performance in single-agent games. They find that loss and mean return follow power law trends with respect to FLOPs, showing predictable improvements. The findings suggest a promising path towards training increasingly capable agents for challenging games like NetHack.
The study focuses on demonstrating how scaling laws apply to imitation learning agents trained with behavioral cloning (BC) in both Atari games and NetHack. By analyzing isoFLOP profiles, the authors show clear power-law relationships between the compute budget (FLOPs) and the loss-optimal model size, number of training samples, loss, and mean return. The results indicate that improvements in loss translate into better-performing agents in the environment.
The research briefly extends its analysis to reinforcement learning (RL), finding similar power-law trends for model size and number of environment interactions in NetHack. By forecasting compute-optimal BC agents for NetHack, the study shows significant performance improvements over prior state-of-the-art approaches. Overall, the findings highlight the potential benefits of scaling up model and data size for training more capable game agents.
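The isoFLOP analysis described above can be sketched numerically. The loss surface below is purely illustrative: the constants E, A, B and the exponents are hypothetical placeholders, not the paper's fitted values, and the sketch uses the common approximation that compute C scales as roughly 6 x (model size N) x (training samples D):

```python
import numpy as np

def loss(N, D, E=1.7, A=400.0, B=900.0, alpha=0.35, beta=0.37):
    # Hypothetical scaling-law loss: an irreducible term plus
    # power-law penalties for finite model size and finite data.
    return E + A / N**alpha + B / D**beta

def isoflop_profile(C, model_sizes):
    # At a fixed compute budget C, each model size N implies a
    # number of training samples D via the approximation C ~ 6*N*D.
    D = C / (6 * model_sizes)
    return loss(model_sizes, D)

C = 1e17                               # fixed FLOP budget
model_sizes = np.logspace(5, 9, 200)   # 100K to 1B parameters
losses = isoflop_profile(C, model_sizes)

# The profile is parabola-shaped in log(N); its minimum marks the
# loss-optimal model size for this budget.
N_opt = model_sizes[np.argmin(losses)]
print(f"loss-optimal model size at C={C:.0e}: {N_opt:.2e} params")
```

Sweeping C over several budgets and collecting each minimum yields the loss-optimal points that the regressions in the Stats section are fit to.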
Stats
We find clear parabolas with well-defined minima at the optimal model size for a given compute budget.
Loss-optimal data points are used to fit regressions of log parameters on log FLOPs.
Power laws are derived for loss-optimal model size, number of training samples, and minimal validation loss.
The average return follows a power law trend with respect to optimal cross-entropy loss.
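The regression step above can be sketched as follows. The loss-optimal points here are synthetic, generated with hypothetical exponents (0.55 and 0.45) purely to illustrate the fitting procedure; a straight-line fit in log-log space recovers the power-law exponent as the slope:

```python
import numpy as np

# Synthetic loss-optimal points across compute budgets (hypothetical
# exponents; in the paper these points come from real isoFLOP minima).
budgets = np.logspace(14, 19, 6)  # FLOPs
rng = np.random.default_rng(0)
N_opt = 0.1 * budgets**0.55 * rng.lognormal(0.0, 0.05, budgets.size)
D_opt = 1.5 * budgets**0.45 * rng.lognormal(0.0, 0.05, budgets.size)

def fit_power_law(x, y):
    # Regress log10(y) on log10(x): the slope is the power-law
    # exponent, the intercept gives the multiplicative constant.
    slope, intercept = np.polyfit(np.log10(x), np.log10(y), 1)
    return slope, 10**intercept

a_N, _ = fit_power_law(budgets, N_opt)
a_D, _ = fit_power_law(budgets, D_opt)
print(f"N* ~ C^{a_N:.2f}, D* ~ C^{a_D:.2f}")
```

The average-return-versus-loss relationship mentioned above can be fit with the same log-log regression, treating optimal cross-entropy loss as the independent variable.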
Quotes
"Scaling up could have unknown unintended consequences."
"While we do not see a direct path towards any negative applications."
"The results suggest a promising path towards increasingly capable game agents."