Core Concepts
Agent Hospital, a comprehensive hospital simulation environment, lets medical agents powered by large language models evolve their diagnosis and treatment capabilities without manually labeled data.
Abstract
The paper introduces Agent Hospital, a comprehensive simulation environment that models the entire process of treating a patient's illness, including disease onset, triage, registration, consultation, medical examination, diagnosis, medicine dispensary, convalescence, and post-hospital follow-up visit. All patients, nurses, and doctors in the simulation are autonomous agents powered by large language models (LLMs).
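The stages listed above can be sketched as an ordered enumeration. This is only an illustrative sketch: the names `Stage` and `next_stage` are hypothetical, and the strictly linear ordering is an assumption, since the summary lists the stages but does not say whether a patient can skip or revisit one.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    """Stages of a simulated patient's journey, in the order the summary lists them."""
    DISEASE_ONSET = 1
    TRIAGE = 2
    REGISTRATION = 3
    CONSULTATION = 4
    MEDICAL_EXAMINATION = 5
    DIAGNOSIS = 6
    MEDICINE_DISPENSARY = 7
    CONVALESCENCE = 8
    FOLLOW_UP_VISIT = 9


def next_stage(stage: Stage) -> Optional[Stage]:
    """Return the stage that follows `stage`, or None after the final stage.

    Assumption: the journey is strictly linear (no loops or skipped stages).
    """
    ordered = list(Stage)  # Enum members iterate in definition order
    position = ordered.index(stage)
    if position + 1 < len(ordered):
        return ordered[position + 1]
    return None
```

For instance, `next_stage(Stage.TRIAGE)` yields `Stage.REGISTRATION`, and calling it on `Stage.FOLLOW_UP_VISIT` yields `None`.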
The key innovation is the MedAgent-Zero strategy, which enables the doctor agents to self-evolve their medical capabilities within the simulation environment. The doctor agents accumulate experience from both successful and unsuccessful cases, and can continuously improve their performance on tasks like examination, diagnosis, and treatment recommendation.
Simulation experiments show that doctor agents trained via MedAgent-Zero can handle tens of thousands of cases within just a few days, reaching accuracies of 88% on examination, 95.6% on diagnosis, and 77.6% on treatment tasks. The knowledge the doctor agents acquire in the simulation also transfers to real-world medical evaluation datasets: the evolved doctor agent achieves a state-of-the-art accuracy of 93.06% on a subset of the MedQA dataset covering major respiratory diseases, without any manually labeled data.
This work demonstrates the potential of using comprehensive simulation environments and self-evolving agent techniques to advance the applications of LLM-powered agents in medical scenarios.
Stats
Doctor agents can handle tens of thousands of simulated patient cases within just a few days.
The doctor agents trained via MedAgent-Zero achieve 88% accuracy on examination tasks, 95.6% on diagnosis tasks, and 77.6% on treatment recommendation tasks.
The evolved doctor agent achieves state-of-the-art accuracy of 93.06% on a subset of the MedQA dataset covering major respiratory diseases, without any manually labeled data.
Quotes
"To the best of our knowledge, this is the first simulacrum of hospital, which comprehensively reflects the entire medical process with excellent scalability, making it a valuable platform for the study of medical LLMs/agents."
"Based on this virtual environment, we propose the MedAgent-Zero strategy that is designed for the self-evolution of medical agents without manually labeled data."
"In experiments with simulated cases, MedAgent-Zero can handle tens of thousands of cases within several days (human doctors may take over two years) and demonstrates powerful performance."