The 2nd BabyLM Challenge: Fostering Sample-Efficient Pretraining on Developmentally Plausible Language Corpora
The 2nd BabyLM Challenge aims to incentivize researchers to optimize language model pretraining under data limitations inspired by human language development, and to democratize pretraining research by posing open problems that can be tackled on a university budget.