Core Concepts
Introducing a position-aware parameter-efficient fine-tuning approach to mitigate the inherent positional bias in pre-trained large language models.
Abstract
The paper investigates the phenomenon of positional bias in large language models (LLMs) across various tasks that require retrieving relevant knowledge from extensive input contexts. Through empirical studies, the authors demonstrate that current LLMs exhibit distinct "positional preferences" in their predictions, rather than the previously reported "lost-in-the-middle" phenomenon.
The authors show that prompt-based solutions alone, such as few-shot learning or hierarchical inference, are insufficient to address the positional bias issue. To mitigate it, they propose a two-pronged approach:
Data Augmentation: The authors introduce a data augmentation technique that randomly permutes the order of candidate documents within the input context, encouraging the LLM to distribute its attention more uniformly across positions (a minimal code sketch follows below).
Position-Aware Parameter Efficient Fine-Tuning (PAPEFT): The authors introduce a novel adapter module that explicitly incorporates the relative positions of documents as additional input prompts, and use it to fine-tune the pre-trained LLM in a parameter-efficient manner (a rough adapter sketch also follows below).
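A minimal sketch of the permutation-style augmentation described above. The function names (permute_candidates, build_prompt) and the numbered prompt layout are illustrative assumptions, not the paper's exact implementation; the only part taken from the paper is the idea of shuffling candidate document order so the relevant item does not always occupy the same position.

```python
import random

def permute_candidates(documents, rng=None):
    """Return a copy of the candidate documents in a random order.

    Shuffling candidates between training examples keeps the relevant
    document from always appearing at the same position, which is the
    intended effect of the augmentation.
    """
    rng = rng or random.Random()
    shuffled = list(documents)
    rng.shuffle(shuffled)
    return shuffled

def build_prompt(query, documents):
    """Assemble one long-context prompt from a query and candidate documents.

    The layout here is an assumption; the paper's actual template may differ.
    """
    lines = [f"Query: {query}", "Candidates:"]
    for i, doc in enumerate(documents, start=1):
        lines.append(f"[{i}] {doc}")
    lines.append("Answer with the most relevant candidate.")
    return "\n".join(lines)

# Usage: several augmented views of the same training example.
docs = ["doc A", "doc B", "doc C (relevant)", "doc D"]
augmented_prompts = [
    build_prompt("user query", permute_candidates(docs, random.Random(seed)))
    for seed in range(3)
]
```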
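A rough PyTorch sketch of how a position-aware adapter might turn each document's position index into learnable soft-prompt vectors that are prepended to the frozen LLM's input embeddings, with only the adapter trained. The class name, dimensions, and wiring below are assumptions for illustration, not the paper's PAPEFT architecture.

```python
import torch
import torch.nn as nn

class PositionAwareAdapter(nn.Module):
    """Maps each candidate document's position index to a small set of
    soft-prompt vectors prepended to the frozen LLM's input embeddings.
    Only these adapter parameters are trained.
    """

    def __init__(self, num_positions: int, hidden_size: int, prompts_per_doc: int = 1):
        super().__init__()
        self.prompts_per_doc = prompts_per_doc
        # One learnable embedding per (position, prompt slot) pair.
        self.position_embeddings = nn.Embedding(num_positions * prompts_per_doc, hidden_size)

    def forward(self, doc_positions: torch.LongTensor) -> torch.Tensor:
        # doc_positions: (batch, num_docs) slot indices of each candidate document.
        batch, num_docs = doc_positions.shape
        slots = doc_positions.unsqueeze(-1) * self.prompts_per_doc + torch.arange(
            self.prompts_per_doc, device=doc_positions.device
        )
        # -> (batch, num_docs * prompts_per_doc, hidden_size)
        return self.position_embeddings(slots.view(batch, -1))

def prepend_position_prompts(input_embeds: torch.Tensor,
                             adapter: PositionAwareAdapter,
                             doc_positions: torch.LongTensor) -> torch.Tensor:
    """Concatenate the position-derived soft prompts in front of the token
    embeddings before they are fed to the frozen base model."""
    prompts = adapter(doc_positions)
    return torch.cat([prompts, input_embeds], dim=1)

# Usage sketch: freeze the base model and train only the adapter.
# base_model.requires_grad_(False)
# adapter = PositionAwareAdapter(num_positions=20, hidden_size=base_model.config.hidden_size)
# optimizer = torch.optim.AdamW(adapter.parameters(), lr=1e-4)
```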
The authors evaluate the proposed PAPEFT framework on recommendation and link prediction tasks, using Longchat-13b-16k and Vicuna-13b-v1.5-16k as the base models. The results show that PAPEFT substantially reduces performance fluctuation across different positions of the relevant information, cutting variance by more than 54% on average compared to the original models, while also improving overall task performance by an average of 57.3% on recommendation and 64.4% on link prediction.
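To make the "performance fluctuation" metric concrete, one plausible way to quantify it is to measure accuracy with the relevant item fixed at each candidate position and then take the variance of those scores. The helper and the numbers below are an illustrative sketch only, not the paper's evaluation code or reported results.

```python
from statistics import mean, pvariance

def positional_variance(accuracy_by_position: dict[int, float]) -> tuple[float, float]:
    """Given accuracy measured with the relevant document fixed at each
    position, return (mean accuracy, variance across positions).

    Lower variance means predictions depend less on where the relevant
    information sits in the input context.
    """
    scores = list(accuracy_by_position.values())
    return mean(scores), pvariance(scores)

# Made-up example: a model with a strong preference for early positions
# versus one whose accuracy is roughly flat across positions.
before = {1: 0.71, 5: 0.42, 10: 0.35, 15: 0.33, 20: 0.48}
after = {1: 0.69, 5: 0.66, 10: 0.63, 15: 0.64, 20: 0.67}
print(positional_variance(before))  # high variance across positions
print(positional_variance(after))   # markedly lower variance
```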
Stats
Key figures reported in the summary above: an average variance reduction of over 54% across positions of the relevant information, and average task-performance gains of 57.3% (recommendation) and 64.4% (link prediction), with Longchat-13b-16k and Vicuna-13b-v1.5-16k as base models.
Quotes
No direct quotes from the paper stood out as particularly striking or as directly supporting its key arguments, so none are reproduced here.