The paper introduces a model-aware approach for deciding when to invoke external retrieval during language model (LM) inference, with the goals of reducing computational cost and preserving privacy.
The key highlights are:
The authors identify the privacy constraints inherent in retrieval-augmented LMs and expose the limitations of existing data-aware approaches, which rely on access to the pre-training data.
They propose a novel model-aware approach that leverages the token embeddings intrinsic to the LM to determine whether retrieval augmentation is needed, removing the dependency on access to pre-training data (a minimal sketch of this idea follows after these highlights).
Extensive experiments and analyses show that the model-aware approach outperforms the data-aware baseline in accuracy and adapts better to fine-tuned models.
The model-aware method circumvents the risks of maintaining pre-training data by requiring access only to the pre-trained token embeddings. This offers a safer and more straightforward way to judge the need for retrieval augmentation, with implications for real-world applications that must balance efficiency and privacy.
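The summary does not spell out the exact decision rule, so the sketch below is only illustrative. It assumes a hypothetical familiarity score computed from the query tokens' pre-trained embeddings (here, mean nearest-neighbor cosine similarity within the embedding matrix) and triggers retrieval only when that score falls below a tunable threshold; both the score and the threshold are assumptions, not the paper's method.

```python
import numpy as np

def needs_retrieval(token_ids, embedding_matrix, threshold=0.35):
    """Hypothetical model-aware gate: decide whether to augment the LM
    with external retrieval using only its pre-trained token embeddings.

    token_ids        : list[int], token ids of the input query
    embedding_matrix : (vocab_size, dim) array of pre-trained token embeddings
    threshold        : familiarity cutoff (an illustrative assumption)
    """
    # Normalize embeddings so dot products become cosine similarities.
    norms = np.linalg.norm(embedding_matrix, axis=1, keepdims=True)
    unit = embedding_matrix / np.clip(norms, 1e-8, None)

    scores = []
    for tid in token_ids:
        # Similarity of this token's embedding to every other token's.
        sims = unit @ unit[tid]
        sims[tid] = -np.inf  # exclude trivial self-similarity
        # Proxy: a token the model represents well tends to sit in a dense
        # neighborhood of embedding space, so use its nearest-neighbor
        # similarity as a familiarity signal.
        scores.append(sims.max())

    familiarity = float(np.mean(scores))
    # Low familiarity -> the model likely lacks parametric knowledge,
    # so fall back to the more expensive external retrieval path.
    return familiarity < threshold
```

With a real model, `embedding_matrix` could be taken from `model.get_input_embeddings().weight` in Hugging Face Transformers, and the threshold would need calibration on held-out queries; note that no pre-training data is touched at any point, which is the property the paper emphasizes.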
By Chengkai Hua... at arxiv.org, 04-05-2024. Source: https://arxiv.org/pdf/2404.03514.pdf