The performance improvement brought by in-context learning (ICL) can be decomposed into three factors: label space regulation, label format regulation, and discrimination power. ICL exhibits significant efficacy in regulating the label space and format, but has limited impact on improving the model's discriminative capability.
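A rough sketch of how such a decomposition might be measured on a simple classification task is shown below. The metric definitions and the example outputs are invented for illustration and are not taken from the paper; the point is only to separate "does the output land in the label space", "does it follow the expected format", and "is it correct when well-formed".

```python
# Separate the three factors for a toy sentiment-classification task:
# label-space coverage, label-format compliance, and discrimination
# (accuracy restricted to outputs that already lie in the label space).
LABELS = {"positive", "negative"}

def decompose(outputs, gold):
    in_space = [o.strip().lower() in LABELS for o in outputs]
    # Heuristic format check: lowercase, no extra words or punctuation.
    well_formatted = [o == o.strip().lower() and " " not in o.strip() for o in outputs]
    correct = [o.strip().lower() == g for o, g in zip(outputs, gold)]
    valid = [c for c, ok in zip(correct, in_space) if ok]
    n = len(outputs)
    return {
        "label_space": sum(in_space) / n,
        "label_format": sum(well_formatted) / n,
        "discrimination": sum(valid) / max(len(valid), 1),
    }

gold = ["positive", "negative", "positive", "negative", "positive"]
zero_shot = ["It sounds positive.", "Negative", "negative", "NEUTRAL", "Positive!"]
few_shot = ["positive", "negative", "negative", "positive", "positive"]

print("zero-shot:", decompose(zero_shot, gold))
print("few-shot :", decompose(few_shot, gold))
```

With these invented outputs, the few-shot run improves label-space and format compliance sharply while discrimination barely moves, which is the pattern the finding describes.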
Different concepts are learned at different layers of large language models, with more difficult concepts being fully acquired at deeper layers.
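A minimal sketch of the layer-wise probing idea behind this kind of finding: train one linear probe per layer and check at which depth a concept becomes linearly decodable. It assumes hidden states extracted with Hugging Face transformers (GPT-2 as a stand-in model) and uses a toy sentiment distinction as the "concept"; the setup is illustrative, not the study's actual protocol.

```python
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

texts = ["wonderful", "terrible", "delightful", "awful", "great", "horrible",
         "pleasant", "dreadful", "lovely", "nasty", "superb", "miserable"]
labels = np.array([1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0])  # 1 = positive concept

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")
model.eval()

# Collect the last-token hidden state at every layer for every input.
per_layer_feats = None
with torch.no_grad():
    for text in texts:
        enc = tokenizer(text, return_tensors="pt")
        out = model(**enc, output_hidden_states=True)
        # out.hidden_states: tuple of (n_layers + 1) tensors of shape [1, seq, hidden]
        states = [h[0, -1].numpy() for h in out.hidden_states]
        if per_layer_feats is None:
            per_layer_feats = [[] for _ in states]
        for layer, vec in enumerate(states):
            per_layer_feats[layer].append(vec)

# Probe each layer: higher cross-validated accuracy = concept decodable there.
for layer, feats in enumerate(per_layer_feats):
    X = np.stack(feats)
    probe = LogisticRegression(max_iter=1000)
    acc = cross_val_score(probe, X, labels, cv=3).mean()
    print(f"layer {layer:2d}: probe accuracy {acc:.2f}")
```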
The LM Transparency Tool provides a comprehensive framework for tracing the behavior of Transformer-based language models back to specific model components, enabling detailed analysis and interpretation of the decision-making process.
PINOSE, a method that trains a probing model on offline self-consistency checking results, can efficiently and effectively detect non-factual content generated by large language models without relying on human-annotated data.
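An illustrative sketch of the two-stage idea (not PINOSE's exact pipeline): derive factuality labels offline from self-consistency among sampled answers, then train a lightweight probe on the model's hidden states so that inference later needs only a forward pass and no further sampling. The answer sampling and hidden-state extraction are stubbed with placeholders here.

```python
from collections import Counter
import numpy as np
from sklearn.linear_model import LogisticRegression

def consistency_label(sampled_answers, threshold=0.7):
    """Offline label: 1 if the sampled answers mostly agree, else 0."""
    _, count = Counter(sampled_answers).most_common(1)[0]
    return int(count / len(sampled_answers) >= threshold)

n_statements, k_samples, hidden_dim = 200, 5, 64
rng = np.random.default_rng(0)

# Placeholder offline data: in practice each statement gets k sampled answers
# from the LLM plus a hidden state from its forward pass.
sampled = []
for i in range(n_statements):
    if i % 2 == 0:
        sampled.append(["ans_a"] * k_samples)            # consistent statement
    else:
        sampled.append([f"ans_{j}" for j in range(k_samples)])  # inconsistent
hidden_states = rng.normal(size=(n_statements, hidden_dim))     # stand-in activations
labels = np.array([consistency_label(s) for s in sampled])

# Probe: a cheap classifier over hidden states, supervised by the
# self-consistency labels instead of human annotation.
probe = LogisticRegression(max_iter=1000).fit(hidden_states, labels)

# At detection time, only a forward pass plus the probe is needed.
new_state = rng.normal(size=(1, hidden_dim))
print("predicted factual:", bool(probe.predict(new_state)[0]))
```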
A vocabulary-defined approach to analyzing the semantics of the language-model latent space establishes a disentangled reference frame and enables effective model adaptation through semantic calibration.
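A minimal sketch of reading a hidden state in a vocabulary-defined reference frame, in the spirit of the logit lens: project an intermediate representation onto the output-vocabulary embeddings and inspect the nearest tokens. This only illustrates the reference-frame idea, not the paper's calibration procedure; GPT-2 is used as a stand-in model and the final layer norm is skipped for brevity.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

enc = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    out = model(**enc, output_hidden_states=True)

# The output-vocabulary embeddings define the reference frame
# (GPT-2 ties its input and output embeddings).
vocab_basis = model.get_output_embeddings().weight   # [vocab_size, hidden_dim]

# Project the last token's hidden state at a middle layer onto that basis.
layer = len(out.hidden_states) // 2
hidden = out.hidden_states[layer][0, -1]              # [hidden_dim]
scores = vocab_basis @ hidden                         # one score per vocab token

top = torch.topk(scores, k=5).indices
print([tokenizer.decode([int(i)]) for i in top])
```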
Large language models exhibit key characteristics of human memory, such as primacy and recency effects, the influence of elaborations, and forgetting through interference rather than decay. These similarities suggest that the properties of human biological memory are reflected in the statistical structure of textual narratives, which is then captured by the language models.
Hallucinations in large language models can be effectively detected by analyzing the model's internal state transition dynamics during generation using tractable probabilistic models.
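An illustrative stand-in for the idea of scoring internal state-transition dynamics: fit a generative model of hidden-state trajectories on trusted generations and flag generations whose trajectories receive low likelihood. The paper uses tractable probabilistic models; here a Gaussian HMM from hmmlearn and random placeholder activations are used purely to show the mechanism.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM

rng = np.random.default_rng(0)
hidden_dim = 16

def fake_trajectory(n_tokens, shift=0.0):
    """Placeholder for a per-token hidden-state sequence from the LLM."""
    return rng.normal(loc=shift, size=(n_tokens, hidden_dim))

# "Trusted" trajectories used to fit the dynamics model.
train = [fake_trajectory(int(rng.integers(10, 30))) for _ in range(50)]
X = np.vstack(train)
lengths = [t.shape[0] for t in train]

hmm = GaussianHMM(n_components=4, covariance_type="diag", n_iter=50)
hmm.fit(X, lengths)

def transition_score(traj):
    """Average log-likelihood per token under the fitted dynamics model."""
    return hmm.score(traj) / traj.shape[0]

normal = fake_trajectory(20)
drifted = fake_trajectory(20, shift=2.0)   # stands in for a hallucinated answer
print("normal :", round(transition_score(normal), 2))
print("drifted:", round(transition_score(drifted), 2))
```

A low per-token likelihood relative to the trusted distribution is the signal that the generation's internal dynamics look anomalous.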
Applying phylogenetic algorithms to large language models can reconstruct their evolutionary relationships and predict their benchmark performance, offering insights into model development and capabilities.
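A sketch of the general recipe: compute pairwise distances between models from some behavioural signature and build a tree from the distance matrix. SciPy's hierarchical clustering stands in for a proper phylogenetic algorithm, and the model names and scores below are invented.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, dendrogram

models = ["base-7b", "base-7b-chat", "base-13b", "other-7b", "other-7b-instruct"]
# One behavioural signature per model (e.g. accuracy on a set of tasks).
signatures = np.array([
    [0.61, 0.55, 0.40, 0.72],
    [0.63, 0.58, 0.42, 0.74],
    [0.70, 0.64, 0.51, 0.79],
    [0.48, 0.60, 0.35, 0.66],
    [0.50, 0.62, 0.37, 0.68],
])

# Pairwise distances over models, then an agglomerative tree built from them.
dists = pdist(signatures, metric="euclidean")
tree = linkage(dists, method="average")

# Leaf order groups models that merge early, i.e. the closest "relatives".
info = dendrogram(tree, labels=models, no_plot=True)
print(" | ".join(info["ivl"]))
```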
Language models integrate prior knowledge and new contextual information in predictable ways, relying more on prior knowledge for familiar entities and being more easily persuaded by some contexts than others.
Large language models exhibit preferences similar to those of humans when interpreting scope-ambiguous sentences and are sensitive to the presence of multiple readings in such sentences.