Evaluating the Performance of Instruction-Finetuned Large Language Models on Clinical and Biomedical Tasks
Instruction-finetuned large language models such as ChatGPT, Flan-T5 UL2, Tk-Instruct, and Alpaca approach the performance of state-of-the-art models in zero-shot and few-shot settings on a range of clinical and biomedical NLP tasks, and perform particularly well on question answering.