Exploring the Relationship Between In-Context Learning and Gradient Descent in Realistic NLP Tasks
Despite recent claims, there is little evidence for a strong correspondence between in-context learning and gradient-descent optimization in realistic NLP tasks. A layer-causal variant of gradient descent shows improved similarity to in-context learning, but the similarity scores remain low, suggesting that the relationship requires a more nuanced understanding.
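As an illustration of the kind of comparison the abstract describes, the following is a minimal sketch of measuring per-layer similarity between two sets of model updates (e.g., those induced by in-context learning versus those produced by a gradient-descent step). The function names, the use of cosine similarity, and the random stand-in updates are assumptions for illustration, not the paper's actual methodology.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two flattened update tensors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def layer_similarity_scores(icl_updates, gd_updates):
    """Per-layer similarity between two matched lists of updates.

    Each element is a tensor of the same shape for the corresponding layer;
    higher scores would indicate closer agreement between the two processes.
    """
    return [cosine_similarity(u, v) for u, v in zip(icl_updates, gd_updates)]

# Illustration with random stand-in updates for a hypothetical 4-layer model.
rng = np.random.default_rng(0)
icl = [rng.normal(size=(8, 8)) for _ in range(4)]
gd = [rng.normal(size=(8, 8)) for _ in range(4)]
scores = layer_similarity_scores(icl, gd)
```

In this framing, a "layer-causal" variant would constrain the gradient-descent updates compared at each layer to depend only on that layer and earlier ones, rather than on the full backward pass; the sketch above only shows the scoring step.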