How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression?
Transformers pretrained on diverse tasks exhibit remarkable in-context learning capabilities.
Surprisingly, effective pretraining for in-context linear regression requires only a small number of independent tasks.
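To make the claim concrete, here is a minimal sketch of the standard linear regression in-context learning setup, where each "task" is a regression vector `w` and pretraining prompts are drawn from a small fixed pool of such tasks. All names, dimensions, and the noise level below are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_ctx = 5, 20      # feature dimension, in-context examples per prompt
num_tasks = 8         # a small, fixed pool of pretraining tasks (assumed)

# Each "task" is a regression vector w; pretraining reuses this small pool
# instead of drawing a fresh task for every prompt.
task_pool = rng.standard_normal((num_tasks, d))

def make_prompt(w, n=n_ctx, noise=0.1):
    """Sample one prompt (x_1, y_1, ..., x_n, y_n, x_query, y_query)."""
    X = rng.standard_normal((n + 1, d))
    y = X @ w + noise * rng.standard_normal(n + 1)
    # The last pair is the query: a transformer trained on such prompts
    # must predict y[-1] from the preceding in-context pairs.
    return X, y

# Draw a pretraining batch: each prompt reuses one of the few fixed tasks.
batch = [make_prompt(task_pool[rng.integers(num_tasks)]) for _ in range(4)]
```

The question in the title then asks how large `num_tasks` must be for the pretrained model to generalize in context to unseen regression vectors.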