Core Concepts
LEME, a new open-source large language model fine-tuned on a large corpus of ophthalmology-related text, outperforms existing general-purpose and medical LLMs across a range of tasks, showing promise for revolutionizing clinical workflows and research in eye care.
Stats
LEME was fine-tuned on a corpus of ~127,000 non-copyrighted training instances.
The training data was curated from ophthalmology-specific case reports, abstracts, and open-source study materials.
The study benchmarked LEME against eight other LLMs: GPT-3.5, GPT-4, three Llama2 models (7B, 13B, and 70B), PMC-LLAMA 13B, Meditron 70B, and EYE-Llama.
In internal validations, LEME achieved Rouge-L scores of 0.20 ± 0.03 in abstract completion, 0.82 ± 0.04 in fill-in-the-blank, and 0.22 ± 0.05 in short-answer QA.
In external validations, LEME led in long-form QA with a Rouge-L score of 0.19 ± 0.01, ranked second in MCQ accuracy (0.68 ± 0.09), and scored highest in EHR summarization and clinical QA (4.24 to 4.83 out of 5 for correctness, completeness, and readability).
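For context on the Rouge-L figures above: Rouge-L measures the overlap between a model's output and a reference text via their longest common subsequence (LCS) of words, usually reported as an F1 score. The study's exact evaluation pipeline is not shown here; the following is a minimal sketch of the standard LCS-based Rouge-L F1 computation, with hypothetical example strings.

```python
def lcs_len(a, b):
    # Dynamic-programming longest common subsequence length over word lists.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l_f1(reference, candidate):
    # Rouge-L F1: harmonic mean of LCS-based precision and recall.
    ref, cand = reference.split(), candidate.split()
    lcs = lcs_len(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)

# Hypothetical reference/candidate pair, not from the study:
score = rouge_l_f1("the optic nerve appears healthy", "the optic nerve looks healthy")
```

Scores range from 0 (no shared word subsequence) to 1 (identical texts), so LEME's 0.82 ± 0.04 on fill-in-the-blank indicates near-verbatim agreement with the references, while the ~0.2 scores on free-form tasks are typical for open-ended generation, where many distinct wordings can be equally correct.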
Quotes
"LEME's emphasis on robust fine-tuning and the use of non-copyrighted data represents a breakthrough in open-source ophthalmology-specific LLMs, offering the potential to revolutionize execution of clinical tasks while democratizing research collaboration."
"LEME signifies a significant breakthrough in the field, with the potential to transform patient query services, clinical workflows, and the delivery of eye care services."