Leveraging Unlabeled Data to Enhance Fine-Tuning of Large Language Models
Selecting the most informative unlabeled samples for pre-fine-tuning a pre-trained language model can significantly improve its performance on target tasks while reducing the need for costly domain-specific labeled data.
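As an illustrative sketch of this idea (the concrete selection criterion here, ranking unlabeled samples by embedding similarity to the target-task data, is an assumption for demonstration and not necessarily the method the work proposes), one might score each unlabeled sample against a centroid of task embeddings and keep the top-k for pre-fine-tuning:

```python
import numpy as np

def select_informative(unlabeled_emb, task_emb, k):
    """Rank unlabeled samples by cosine similarity to the target-task
    centroid and return the indices of the top-k most similar ones.
    (Illustrative heuristic, not the paper's stated algorithm.)"""
    centroid = task_emb.mean(axis=0)
    centroid = centroid / np.linalg.norm(centroid)
    normed = unlabeled_emb / np.linalg.norm(unlabeled_emb, axis=1, keepdims=True)
    scores = normed @ centroid          # cosine similarity per sample
    return np.argsort(scores)[::-1][:k]  # indices of the k best-scoring samples

# Toy demo with 2-D "embeddings": task data clusters near (1, 0).
rng = np.random.default_rng(0)
task = rng.normal(loc=[1.0, 0.0], scale=0.1, size=(20, 2))
unlabeled = np.vstack([
    rng.normal(loc=[1.0, 0.0], scale=0.1, size=(5, 2)),   # in-domain
    rng.normal(loc=[-1.0, 0.0], scale=0.1, size=(5, 2)),  # out-of-domain
])
picked = select_informative(unlabeled, task, k=5)
print(sorted(picked.tolist()))  # the five in-domain samples: [0, 1, 2, 3, 4]
```

The selected subset would then be used for an intermediate (pre-fine-tuning) training stage before supervised fine-tuning on the scarce labeled task data.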