Data augmentation techniques can improve the performance of large language models on dialectal commonsense reasoning tasks.
Linguistic variation poses significant challenges for language models; adapting to it effectively requires weighing the characteristics of the available data against the capabilities of the model.
Linear weight interpolation between fine-tuned language models enables text generation with predictable, fine-grained control over multiple stylistic attributes simultaneously, with the attribute mix adjustable dynamically at inference time.
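As a rough illustration of the idea, the sketch below linearly interpolates the weights of two style-specific checkpoints of the same architecture; the model paths and the mixing coefficient `alpha` are placeholders, and a single pair of models is shown rather than the multi-attribute setup described above.

```python
# Minimal sketch (assumptions, not the paper's exact recipe): blend two
# fine-tuned checkpoints of the same architecture by linear interpolation.
import torch
from transformers import AutoModelForCausalLM

def interpolate_weights(model_a, model_b, alpha: float) -> dict:
    """Return a state dict equal to (1 - alpha) * model_a + alpha * model_b."""
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
    merged = {}
    for name, tensor_a in sd_a.items():
        tensor_b = sd_b[name]
        if torch.is_floating_point(tensor_a):
            merged[name] = (1.0 - alpha) * tensor_a + alpha * tensor_b
        else:
            # Integer buffers cannot be interpolated; keep model_a's values.
            merged[name] = tensor_a
    return merged

# Hypothetical checkpoints, each fine-tuned for one style.
formal = AutoModelForCausalLM.from_pretrained("path/to/formal-style-model")
casual = AutoModelForCausalLM.from_pretrained("path/to/casual-style-model")

# alpha slides the output smoothly between styles (0.0 = formal, 1.0 = casual).
merged = AutoModelForCausalLM.from_pretrained("path/to/formal-style-model")
merged.load_state_dict(interpolate_weights(formal, casual, alpha=0.3))
```

Because interpolation happens purely in weight space, no retraining is needed to move along the style continuum; only a new merge of the existing checkpoints.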
This paper presents a comprehensive methodology for adapting large language models to new languages, demonstrating state-of-the-art results across 9 diverse languages and 2 model scales.
This paper explores cost-efficient methods to adapt the Llama 2 language model to the Estonian language, leveraging cross-lingual instruction-tuning and additional monolingual pretraining.
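To make the data side of this concrete, the following sketch assembles a small mixed sample of cross-lingual instruction data and monolingual Estonian pretraining text; the prompt template and example sentences are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch (assumed format): mix cross-lingual instruction pairs
# with monolingual Estonian text for continued pretraining.

def format_instruction(example: dict) -> str:
    """Render an instruction/response pair into a single training string."""
    return (
        f"### Instruction:\n{example['instruction']}\n\n"
        f"### Response:\n{example['response']}"
    )

# Cross-lingual pair: English instruction, Estonian response.
cross_lingual = {
    "instruction": "Translate to Estonian: The weather is nice today.",
    "response": "Ilm on täna ilus.",
}

# Monolingual Estonian text used for additional pretraining.
monolingual = "Eesti keel kuulub soome-ugri keelte hulka."

training_texts = [format_instruction(cross_lingual), monolingual]
print(training_texts[0])
```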
Proxy-tuning is a lightweight decoding-time algorithm that efficiently customizes large pretrained language models without access to their internal weights: at each step, the logit difference between a small tuned "expert" and its untuned counterpart is added to the base model's logits, steering the larger model's predictions toward the tuned behavior.
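A minimal greedy-decoding sketch of this steering rule is shown below. The model paths are placeholders, and it assumes the base, expert, and anti-expert models share the same tokenizer and vocabulary, as in the proxy-tuning setup.

```python
# Minimal sketch of a single proxy-tuned decoding step: shift the base model's
# logits by the expert/anti-expert gap. Model paths are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("path/to/large-base-model")
base = AutoModelForCausalLM.from_pretrained("path/to/large-base-model")
expert = AutoModelForCausalLM.from_pretrained("path/to/small-tuned-model")
anti_expert = AutoModelForCausalLM.from_pretrained("path/to/small-untuned-model")

@torch.no_grad()
def proxy_tuned_step(input_ids: torch.Tensor) -> int:
    """Pick the next token from base logits steered by the expert/anti-expert gap."""
    base_logits = base(input_ids).logits[:, -1, :]
    expert_logits = expert(input_ids).logits[:, -1, :]
    anti_logits = anti_expert(input_ids).logits[:, -1, :]
    # Core of proxy-tuning: add the "tuning direction" learned by the small model.
    steered = base_logits + (expert_logits - anti_logits)
    return int(steered.argmax(dim=-1))

prompt = tokenizer("Explain proxy-tuning in one sentence:", return_tensors="pt")
next_token_id = proxy_tuned_step(prompt.input_ids)
print(tokenizer.decode([next_token_id]))
```

Only forward passes over the large model are required, so the approach never touches its weights; the cost of tuning is paid on the small expert alone.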