Key Concepts
The accuracy of in-context learning (ICL) in large language models (LLMs) on binary classification tasks depends on a complex interplay among pre-training knowledge, the number and quality of in-context examples, and potential dependencies among examples.
Statistics
When the number of examples k is at most the ratio of data variance to pre-training (prior) distribution variance (k ≤ σ²/σ²ₘ), ICL accuracy is lower when the examples contradict pre-training knowledge than when they match it.
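The threshold k ≤ σ²/σ²ₘ matches the weight a conjugate-Gaussian posterior places on the prior: the prior dominates exactly while k·σ²ₘ ≤ σ². A minimal sketch of this reading (the Gaussian setup and the numeric variances are illustrative assumptions, not the paper's exact model):

```python
# Sketch: how much a posterior mean leans on the pre-training prior vs. the
# k in-context examples, assuming a conjugate-Gaussian model (an assumption).
# Prior (pre-training knowledge) has variance sigma2_m; each example has
# observation variance sigma2.

def prior_weight(k, sigma2, sigma2_m):
    """Weight the posterior mean places on the prior mean after k examples:
    sigma2 / (sigma2 + k * sigma2_m). Weight >= 0.5 iff k <= sigma2/sigma2_m."""
    return sigma2 / (sigma2 + k * sigma2_m)

sigma2, sigma2_m = 4.0, 0.5        # illustrative values
threshold = sigma2 / sigma2_m      # here k <= 8 means the prior dominates

for k in (2, 8, 32):
    w = prior_weight(k, sigma2, sigma2_m)
    side = "prior-dominated" if k <= threshold else "example-dominated"
    print(f"k={k:>2}  prior weight={w:.2f}  ({side})")
```

Under this reading, contradicting knowledge hurts precisely in the prior-dominated regime, because the prediction is still pulled toward the (wrong, for this task) pre-training prior.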
When k = 20, ICL accuracy for contradicting knowledge increases to around 87%.
As the fraction of positive examples (π) increases, ICL accuracy for positive inputs increases, reaching 100%, while accuracy for negative inputs decreases.
When the flipping probability in the negative class is fixed, a smaller flipping probability in the positive class (i.e., a higher pe+, the probability that a positive label is kept) generally leads to higher accuracy on the positive class.
For overall accuracy greater than 80% in label noise experiments, both pe+ and pe− need to be close to 1.
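The label-noise effect can be sketched with a toy simulation. Everything below is an illustrative assumption rather than the paper's setup: 1D Gaussian inputs, pe+/pe− taken as the probability each example's label is kept, and a predictor that simply copies the noisy label of the nearest in-context example.

```python
import random

def simulate_accuracy(pe_pos, pe_neg, k=200, trials=2000, seed=0):
    """Toy label-noise sketch (illustrative, not the paper's model):
    positive class ~ N(+1, 1), negative class ~ N(-1, 1), balanced.
    pe_pos / pe_neg = probability an example's label is NOT flipped.
    Prediction copies the noisy label of the nearest example; accuracy
    is then measured against clean test labels."""
    rng = random.Random(seed)
    examples = []
    for _ in range(k):
        true = rng.random() < 0.5
        x = rng.gauss(1.0 if true else -1.0, 1.0)
        keep = rng.random() < (pe_pos if true else pe_neg)
        examples.append((x, true if keep else not true))
    correct = 0
    for _ in range(trials):
        true = rng.random() < 0.5
        x = rng.gauss(1.0 if true else -1.0, 1.0)
        pred = min(examples, key=lambda e: abs(e[0] - x))[1]
        correct += pred == true
    return correct / trials

for pe in (1.0, 0.9, 0.6):
    print(f"pe+ = pe- = {pe}: accuracy = {simulate_accuracy(pe, pe):.2f}")
```

Because the predictor trusts noisy labels directly, its accuracy degrades roughly in proportion to the keep probabilities, consistent with the observation that both pe+ and pe− must be close to 1 for high overall accuracy.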
Quotes
"A central challenge in the analysis lies in the formulation and integration of both priors of pre-training knowledge and examples into a single closed-form ICL prediction."
"Our work reveals the role of pre-training knowledge and examples in ICL, offering a deeper understanding of LLMs’ behaviors in classification tasks."
"This finding can help understand how LLMs consider dependency among tokens/sequences."