The paper introduces a model-aware approach for deciding when to invoke external retrieval during language model (LM) inference, with the goals of reducing computational cost and preserving privacy.
The key highlights are:
The authors identify the privacy constraints inherent in retrieval-augmented LMs and expose the limitations of existing data-aware approaches, which rely on access to the pre-training data.
They propose a novel model-aware approach that leverages the token embeddings intrinsic to the LM to determine whether retrieval augmentation is needed. This alleviates the dependency on the accessibility of pre-training data.
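The summary does not spell out how the token embeddings are turned into a retrieve-or-not decision, but the idea can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function name `needs_retrieval`, the use of embedding norms as a familiarity proxy, and the threshold are all hypothetical stand-ins for the paper's actual embedding-based signal.

```python
import numpy as np

def needs_retrieval(query_token_ids, embedding_matrix, threshold=0.5):
    """Hypothetical heuristic: decide whether retrieval augmentation is needed.

    Uses only the LM's pretrained token embedding matrix (no pre-training
    data). The L2 norm of each query token's embedding serves as a rough
    proxy for how well the model "knows" that token; a low average score
    suggests the query is unfamiliar and retrieval should be triggered.
    This is an illustrative stand-in, not the paper's actual criterion.
    """
    emb = embedding_matrix[query_token_ids]   # (num_tokens, dim) embeddings
    norms = np.linalg.norm(emb, axis=1)       # per-token embedding norm
    score = norms.mean()                      # aggregate familiarity score
    return bool(score < threshold)            # low familiarity -> retrieve
```

The key property the sketch captures is that the decision depends only on artifacts shipped with the model itself, so no pre-training corpus needs to be stored or queried.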
Extensive experiments and analyses demonstrate the superiority of the model-aware approach over the data-aware baseline, both in accuracy and in adaptability to fine-tuned models.
The model-aware method circumvents the risks associated with maintaining pre-training data by only requiring access to the pre-trained token embeddings. This offers a safer and more straightforward way to judge the need for retrieval augmentation, with implications for real-world applications that need to balance efficiency and privacy.
Key insights distilled from https://arxiv.org/pdf/2404.03514.pdf by Chengkai Hua... at arxiv.org, 04-05-2024