Introducing OLAPH, a novel framework that leverages cost-effective, multifaceted automatic evaluation to construct synthetic preference sets and train large language models to produce more factual and coherent long-form answers in the biomedical domain.
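A minimal sketch of the core idea of turning cheap automatic scores into preference pairs is shown below. The metric functions, weights, and the pairing rule are illustrative placeholders and not OLAPH's actual evaluation suite or training recipe.

```python
# Sketch: build a synthetic preference set from cheap automatic metrics,
# in the spirit of scoring sampled long-form answers and pairing them as
# (chosen, rejected) for preference optimization (e.g. DPO).
from typing import Dict, List, Tuple

def word_overlap(answer: str, reference: str) -> float:
    """Toy word-level recall against a reference answer (stand-in for a real metric)."""
    ref_words = set(reference.lower().split())
    ans_words = set(answer.lower().split())
    return len(ref_words & ans_words) / max(len(ref_words), 1)

def claim_coverage(answer: str, must_have: List[str]) -> float:
    """Fraction of required statements mentioned in the answer (stand-in for a factuality check)."""
    hits = sum(1 for claim in must_have if claim.lower() in answer.lower())
    return hits / max(len(must_have), 1)

def composite_score(answer: str, reference: str, must_have: List[str],
                    weights: Dict[str, float]) -> float:
    """Weighted combination of the cheap automatic metrics above."""
    return (weights["overlap"] * word_overlap(answer, reference)
            + weights["coverage"] * claim_coverage(answer, must_have))

def build_preference_pairs(question: str, candidates: List[str], reference: str,
                           must_have: List[str]) -> List[Tuple[str, str, str]]:
    """Rank sampled answers and pair best vs. worst as (question, chosen, rejected)."""
    weights = {"overlap": 0.5, "coverage": 0.5}
    ranked = sorted(candidates,
                    key=lambda a: composite_score(a, reference, must_have, weights),
                    reverse=True)
    return [(question, ranked[0], ranked[-1])]

if __name__ == "__main__":
    pairs = build_preference_pairs(
        question="What are common side effects of metformin?",
        candidates=[
            "Metformin commonly causes nausea, diarrhea, and abdominal discomfort.",
            "Metformin is a blood thinner with no known side effects.",
        ],
        reference="Common side effects include gastrointestinal upset such as nausea and diarrhea.",
        must_have=["nausea", "diarrhea"],
    )
    print(pairs)
```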
State-of-the-art fine-tuning approaches outperform zero- and few-shot large language models on most biomedical NLP tasks, while closed-source LLMs such as GPT-3.5 and GPT-4 achieve better performance on reasoning-related tasks and competitive accuracy on generation-related tasks.
The BiomedRAG framework effectively integrates retrieved chunk-based documents into large language models to enhance their performance across various biomedical NLP tasks, including information extraction, text classification, link prediction, and question answering.
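The retrieval-augmented pattern behind this kind of framework can be sketched as follows; the lexical scorer, prompt template, and function names are assumptions for illustration, not BiomedRAG's tailored chunk retriever.

```python
# Sketch: score document chunks against a query and prepend the top chunks
# to the prompt handed to an LLM (generic retrieval-augmented prompting).
from typing import List

def score_chunk(query: str, chunk: str) -> float:
    """Toy lexical-overlap relevance score (a trained retriever would go here)."""
    q = set(query.lower().split())
    c = set(chunk.lower().split())
    return len(q & c) / max(len(q), 1)

def retrieve_top_chunks(query: str, corpus: List[str], k: int = 2) -> List[str]:
    return sorted(corpus, key=lambda ch: score_chunk(query, ch), reverse=True)[:k]

def build_augmented_prompt(query: str, corpus: List[str]) -> str:
    """Inject retrieved chunks as context ahead of the task instruction."""
    context = "\n".join(f"- {ch}" for ch in retrieve_top_chunks(query, corpus))
    return (f"Context:\n{context}\n\n"
            f"Question: {query}\n"
            f"Answer using only the context above.")

if __name__ == "__main__":
    corpus = [
        "Aspirin irreversibly inhibits cyclooxygenase-1, reducing thromboxane A2.",
        "The hippocampus is involved in memory consolidation.",
        "Low-dose aspirin is used for secondary prevention of myocardial infarction.",
    ]
    print(build_augmented_prompt("How does aspirin affect platelet function?", corpus))
```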
This work presents CARE, a new dataset and annotation schema for extracting fine-grained experimental findings from biomedical literature, including clinical trials and case reports.
Leveraging prompt engineering, strategic in-context example selection, and external knowledge integration, this study demonstrates significant improvements in the performance of large language models for biomedical named entity recognition tasks.
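The sketch below illustrates one way in-context example selection can feed a biomedical NER prompt; the similarity heuristic, entity types, and prompt template are hypothetical choices, not the study's exact design.

```python
# Sketch: pick labeled demonstrations similar to the input sentence and
# assemble a few-shot NER prompt for an LLM.
from typing import List, Tuple

LabeledExample = Tuple[str, List[str]]  # (sentence, annotated entity mentions)

def similarity(a: str, b: str) -> float:
    """Toy token-overlap similarity used to pick demonstrations close to the input."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1)

def select_examples(text: str, pool: List[LabeledExample], k: int = 2) -> List[LabeledExample]:
    return sorted(pool, key=lambda ex: similarity(text, ex[0]), reverse=True)[:k]

def build_ner_prompt(text: str, pool: List[LabeledExample]) -> str:
    lines = ["Extract all disease and chemical mentions from the sentence."]
    for sent, ents in select_examples(text, pool):
        lines.append(f"Sentence: {sent}\nEntities: {', '.join(ents)}")
    lines.append(f"Sentence: {text}\nEntities:")
    return "\n\n".join(lines)

if __name__ == "__main__":
    pool = [
        ("Cisplatin-induced nephrotoxicity was observed in 20% of patients.",
         ["Cisplatin", "nephrotoxicity"]),
        ("The patient presented with type 2 diabetes and hypertension.",
         ["type 2 diabetes", "hypertension"]),
    ]
    print(build_ner_prompt("Doxorubicin cardiotoxicity remains a major clinical concern.", pool))
```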
Developing robust and dependable natural language inference (NLI) models for clinical trial data to support safer and more trustworthy AI assistance in healthcare decision-making.
Large language models (LLMs) can achieve strong performance on natural language inference tasks in the biomedical domain, but they still face challenges in maintaining consistency, faithfulness, and robust reasoning, especially when dealing with numerical and logical reasoning on clinical trial reports.
Comparing the performance of masked language models and generative language models on a natural language inference task for clinical trial data, focusing on metrics of faithfulness and consistency.
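A minimal sketch of how such metrics can be computed is given below, loosely following the common NLI4CT-style definitions: consistency rewards predictions that are unchanged under a semantics-preserving rewrite of the statement, while faithfulness rewards predictions that flip under a semantics-altering rewrite. The predict() stub and the example data are placeholders for either a masked-LM classifier or a generative LLM.

```python
# Sketch: intervention-based consistency and faithfulness scores for an NLI model
# on clinical trial statements.
from typing import Callable, List, Tuple

Example = Tuple[str, str, str, str]  # (premise, statement, preserving_rewrite, altering_rewrite)

def consistency(predict: Callable[[str, str], str], data: List[Example]) -> float:
    """Share of examples whose label is unchanged on the semantics-preserving rewrite."""
    kept = sum(1 for p, s, s_same, _ in data
               if predict(p, s) == predict(p, s_same))
    return kept / max(len(data), 1)

def faithfulness(predict: Callable[[str, str], str], data: List[Example]) -> float:
    """Share of examples whose label flips on the semantics-altering rewrite."""
    flipped = sum(1 for p, s, _, s_flip in data
                  if predict(p, s) != predict(p, s_flip))
    return flipped / max(len(data), 1)

if __name__ == "__main__":
    def dummy_predict(premise: str, statement: str) -> str:
        # Placeholder model: entails only if the trial drug appears in the statement.
        return "entailment" if "aspirin" in statement.lower() else "contradiction"

    data = [(
        "Arm A received aspirin 100 mg daily for 12 weeks.",
        "Participants in arm A were given aspirin.",
        "Aspirin was administered to participants in arm A.",
        "Participants in arm A were given warfarin.",
    )]
    print("consistency:", consistency(dummy_predict, data))
    print("faithfulness:", faithfulness(dummy_predict, data))
```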
Supervised fine-tuned approaches, such as RoBERTa and BINDER (PubMedBERT), outperform general-purpose large language models like ChatGPT on intent detection and named entity recognition tasks in the biomedical domain.
A model is proposed that transfers knowledge from the biomedical domain to the chemical domain for named entity recognition by projecting source and target entities into separate regions of the feature space.
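One generic way to express the "separate regions" idea is a pull-within, push-across objective, sketched below with numpy. This is not the paper's actual loss; the margin formulation and centroid-based terms are assumptions chosen for illustration.

```python
# Sketch: encourage compact per-domain entity clusters whose centroids are
# pushed at least `margin` apart, separating biomedical and chemical entities
# in feature space.
import numpy as np

def separation_loss(source_emb: np.ndarray, target_emb: np.ndarray,
                    margin: float = 2.0) -> float:
    """source_emb, target_emb: arrays of shape (n_entities, dim)."""
    mu_s = source_emb.mean(axis=0)
    mu_t = target_emb.mean(axis=0)
    # Within-domain compactness: mean distance of each entity to its domain centroid.
    compact = (np.linalg.norm(source_emb - mu_s, axis=1).mean()
               + np.linalg.norm(target_emb - mu_t, axis=1).mean())
    # Cross-domain separation: hinge on the distance between the two centroids.
    separation = max(0.0, margin - np.linalg.norm(mu_s - mu_t))
    return float(compact + separation)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    bio = rng.normal(loc=0.0, scale=0.1, size=(8, 4))   # toy biomedical entity embeddings
    chem = rng.normal(loc=1.0, scale=0.1, size=(8, 4))  # toy chemical entity embeddings
    print(f"loss: {separation_loss(bio, chem):.3f}")
```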