Key Concepts
Proposing the In-context Autoencoder (ICAE) to compress long contexts efficiently in large language models.
Summary
The article introduces the In-context Autoencoder (ICAE), a method for efficiently compressing long contexts in large language models. ICAE is trained with both autoencoding and language modeling objectives so that the memory slots it generates faithfully represent the original context. Pretraining on large-scale text data enables ICAE, built on Llama, to achieve 4× context compression, reducing inference latency and GPU memory cost. The authors argue these results suggest a novel connection between working memory in cognitive science and representation learning in LLMs, and emphasize the approach's scalability and implications for further research.
Long context modeling is challenging for Transformer-based LLMs because the self-attention mechanism scales poorly with sequence length. Previous research has focused on architectural innovations to tackle long-context issues, but such models still struggle on lengthy inputs. In contrast, ICAE approaches the problem through context compression: a learnable encoder, adapted from the LLM itself with LoRA, encodes the long context into a short sequence of memory slots.
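The encoding step described above can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the "encoder" here is a single random attention-like mixing standing in for a LoRA-adapted Llama, and all dimensions are made-up small values. The core mechanic it shows is appending learnable memory-token embeddings to the context and keeping only their final hidden states as the memory slots.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions for illustration; the real model uses Llama-scale sizes).
d_model = 32        # hidden size
ctx_len = 16        # original context length
n_slots = 4         # memory slots -> 4x compression

# Stand-in for the encoder's parameters: one random bilinear scoring matrix.
W = rng.normal(scale=d_model ** -0.5, size=(d_model, d_model))

def encode_to_memory_slots(context_embeds, memory_embeds):
    """Append memory-token embeddings to the context, run one attention-like
    mixing step, and keep only the outputs at the memory positions.
    (A toy stand-in for the LoRA-adapted LLM encoder in ICAE.)"""
    x = np.concatenate([context_embeds, memory_embeds], axis=0)  # (ctx+k, d)
    scores = x @ W @ x.T / np.sqrt(d_model)                      # all-to-all attention
    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)
    h = attn @ x                                                 # mixed hidden states
    return h[-memory_embeds.shape[0]:]                           # (k, d) memory slots

context = rng.normal(size=(ctx_len, d_model))
memory_tokens = rng.normal(size=(n_slots, d_model))  # learnable in the real model
slots = encode_to_memory_slots(context, memory_tokens)
print(slots.shape)  # (4, 32): 16 context vectors compressed into 4 slots
```

In the real system the slots are then fed to the frozen target LLM in place of the original context, and training pushes them to support both reconstruction (autoencoding) and continuation (language modeling).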
Experiments demonstrate ICAE's effectiveness in compressing contexts while maintaining accuracy. The model is further fine-tuned on instruction data for practical scenarios, improving its interaction with prompts. Notably, the pretrained ICAE already achieves strong autoencoding performance even before instruction fine-tuning. Text continuation evaluation shows that language modeling losses grow more pronounced as the compression ratio increases.
Statistics
Introducing about 1% additional parameters
Achieving 4× context compression based on Llama
Improved latency and GPU memory cost during inference
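The figures above can be made concrete with back-of-envelope arithmetic. Only the 4× ratio comes from the article; the prompt length below is a hypothetical example. Because KV-cache size grows linearly with the number of conditioning tokens, replacing the context with memory slots shrinks that cache by the same factor.

```python
# Back-of-envelope arithmetic for 4x context compression.
compression = 4
ctx_tokens = 2048                          # hypothetical prompt length (assumption)
memory_slots = ctx_tokens // compression
print(memory_slots)                        # 512 slots stand in for 2048 tokens

# KV-cache entries scale linearly with sequence length, so the cached
# conditioning shrinks by the same fraction.
kv_saving = 1 - memory_slots / ctx_tokens
print(kv_saving)                           # 0.75 -> 75% fewer cached positions
```

This is why the compression translates directly into lower inference latency and GPU memory cost for long prompts.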
Quotes
"Context compression is motivated by the fact that a text can be represented in different lengths in an LLM while conveying the same information."
"We propose In-context Autoencoder (ICAE) – a novel approach to context compression by leveraging the power of an LLM."
"All these results imply a novel perspective on the connection between working memory in cognitive science and representation learning in LLMs."