Core Concepts
L-Tuning, an efficient fine-tuning approach, leverages the semantic knowledge of pre-trained large language models to enhance classification performance and training efficiency compared to traditional prompt and prefix tuning methods.
Abstract
The paper introduces L-Tuning, an innovative approach to fine-tuning large language models (LLMs) for classification tasks within the Natural Language Inference (NLI) framework.
Key highlights:
Traditional prompt and prefix tuning methods rely on arbitrary tokens for training, leading to prolonged training times and suboptimal performance due to a lack of semantic differentiation among classes.
L-Tuning addresses these issues by focusing on the fine-tuning of label tokens processed through the pre-trained LLM, effectively utilizing its pre-existing semantic knowledge.
For prefix tuning, L-Tuning derives prefix embeddings directly from label tokens, applying a self-attention pooling mechanism to transform them into a suitable form for the classification head.
For prompt tuning, L-Tuning generates label embeddings through a trainable transformation function, which are then concatenated with text embeddings for classification.
Experimental results across various datasets and LLMs, including BERT, RoBERTa, DeBERTa, Falcon, Bloom, and Llama-2, demonstrate that L-Tuning significantly outperforms traditional prompt and prefix tuning in terms of training efficiency and classification accuracy, particularly for large language models.
The authors highlight that L-Tuning's efficacy is more pronounced in the context of LLMs, showcasing its potential as a scalable and efficient approach to optimizing advanced language processing systems.
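The two mechanisms described above can be sketched in PyTorch. This is an illustrative assumption rather than the authors' implementation: the class names (`SelfAttentionPooling`, `LTuningPrefix`, `LTuningPrompt`), the mean-pooling of text embeddings, and the exact shape of the trainable transformation are all stand-ins chosen to make the idea concrete.

```python
import torch
import torch.nn as nn

class SelfAttentionPooling(nn.Module):
    # Learns a scalar relevance score per label token, then takes the
    # softmax-weighted sum, collapsing a variable-length label sequence
    # into one fixed-size prefix embedding.
    def __init__(self, d_model):
        super().__init__()
        self.score = nn.Linear(d_model, 1)

    def forward(self, x):                               # x: (B, L, d)
        w = torch.softmax(self.score(x), dim=1)         # (B, L, 1)
        return (w * x).sum(dim=1)                       # (B, d)

class LTuningPrefix(nn.Module):
    # Prefix variant: the frozen pre-trained model embeds the label tokens,
    # and the pooled result serves as the prefix passed to the
    # classification head alongside a pooled text representation.
    def __init__(self, encoder, d_model, num_classes):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():             # LLM stays frozen
            p.requires_grad = False
        self.pool = SelfAttentionPooling(d_model)
        self.head = nn.Linear(2 * d_model, num_classes)

    def forward(self, text_ids, label_ids):
        text_h = self.encoder(text_ids).mean(dim=1)     # (B, d)
        label_h = self.pool(self.encoder(label_ids))    # (B, d)
        return self.head(torch.cat([label_h, text_h], dim=-1))

class LTuningPrompt(nn.Module):
    # Prompt variant: a small trainable transformation reshapes the frozen
    # label embeddings, which are then concatenated with the text
    # embeddings before classification.
    def __init__(self, encoder, d_model, num_classes):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():
            p.requires_grad = False
        self.transform = nn.Sequential(nn.Linear(d_model, d_model), nn.Tanh())
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, text_ids, label_ids):
        text_e = self.encoder(text_ids)                      # (B, T, d)
        label_e = self.transform(self.encoder(label_ids))    # (B, L, d)
        seq = torch.cat([label_e, text_e], dim=1)            # prepend labels
        return self.head(seq.mean(dim=1))                    # (B, C)
```

In both sketches only the pooling layer, transformation, and head are trainable, which mirrors the paper's claim of improved training efficiency: gradients never update the pre-trained LLM's weights.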
Stats
L-Tuning demonstrated a modest improvement of 0-2% for standard language models like BERT and RoBERTa, but its impact was more pronounced in large language models like Bloom and Llama-2, showing improvements of 2-6%.
Quotes
"L-Tuning, an efficient fine-tuning approach designed for classification tasks within the Natural Language Inference (NLI) framework."
"Empirical evidence suggests that L-Tuning significantly outperforms conventional prompt and prefix tuning in LLMs, both in terms of reducing training time and enhancing performance in classification tasks."