
Evaluating the Adaptability of a Shared Foundation Model for Electronic Health Records Across Multiple Healthcare Institutions


Core Concepts
Adapting a pretrained electronic health records foundation model across hospitals can improve prediction performance at less cost, underscoring the effectiveness of sharing base foundation models as modular machine learning components to streamline the development of healthcare AI.
Abstract
This multi-center study evaluated the adaptability of a recently released structured electronic health records (EHR) foundation model (FMSM) trained on data from Stanford Medicine. Experiments were conducted using EHR data from The Hospital for Sick Children (SickKids) and Medical Information Mart for Intensive Care (MIMIC-IV).
The key findings are:
Adapting the off-the-shelf FMSM matched the performance of gradient boosting machine (GBM) models locally trained on all available data while providing a 13% improvement in settings with few task-specific training labels.
With continued pretraining on local data, label efficiency substantially improved, such that FMSM required fewer than 1% of training examples to match the fully trained GBM's performance.
Continued pretraining was also 60 to 90% more sample-efficient than training local foundation models from scratch.
The external foundation model (FMSM) displayed robust performance across both a Canadian pediatric cohort (SickKids) and an American adult ICU-based cohort (MIMIC), indicating that pretraining on a larger and more diverse patient population improves the adaptability of the foundation model across healthcare settings.
The findings provided insights into when it is beneficial to adapt an existing EHR foundation model vs. pretraining from scratch, depending on data availability. Overall, the study demonstrates that adapting shared EHR foundation models across hospitals provides improved prediction performance at less cost, underscoring the utility of base foundation models as modular components to streamline the development of healthcare AI.
Stats
In-hospital mortality rate was 0.6% in SickKids and 3.6% in MIMIC.
Long length of stay (≥7 days) occurred in 16.1% of SickKids and 27.7% of MIMIC patients.
30-day readmission rate was 6.0% in SickKids and 0.6% in MIMIC.
Hypoglycemia occurred in 1.2% of SickKids and 1.6% of MIMIC patients.
Hyponatremia occurred in 0.2% of SickKids and 0.9% of MIMIC patients.
Hyperkalemia occurred in 0.9% of both SickKids and MIMIC patients.
Thrombocytopenia occurred in 1.9% of SickKids and 3.1% of MIMIC patients.
Anemia occurred in 2.9% of SickKids and 6.8% of MIMIC patients.
Quotes
"Adapting a pretrained electronic health records foundation model across hospitals can improve prediction performance at less cost, underscoring the effectiveness of sharing base foundation models as modular machine learning components to streamline the development of healthcare AI."
"With continued pretraining on local data, label efficiency substantially improved, such that FMSM required fewer than 1% of training examples to match the fully trained GBM's performance."
"The external foundation model (FMSM) displayed robust performance across both a Canadian pediatric cohort (SickKids) and an American adult ICU-based cohort (MIMIC), indicating that pretraining on a larger and more diverse patient population improves the adaptability of the foundation model across healthcare settings."

Deeper Inquiries

How can the adaptability and generalizability of EHR foundation models be further improved to address potential biases and ensure equitable performance across diverse patient populations?

To enhance the adaptability and generalizability of EHR foundation models, several strategies can be implemented:
Diverse Training Data: Incorporating a more diverse and representative dataset during the pretraining phase can help mitigate biases and improve the model's ability to generalize across different patient populations. This can involve including data from various demographics, geographic locations, and healthcare settings to ensure a more comprehensive representation.
Fairness and Bias Mitigation Techniques: Implementing fairness-aware learning techniques and bias mitigation strategies can help address potential biases in the model. This includes regular audits of the model's performance across different subgroups to identify and rectify any disparities.
Interpretability and Explainability: Enhancing the interpretability of the model by incorporating explainable AI techniques can help in understanding the model's decisions and identifying any biases that may exist. This transparency can aid in ensuring equitable performance and addressing potential biases.
Continuous Monitoring and Evaluation: Regularly monitoring the model's performance post-deployment and evaluating its outcomes across diverse patient populations can help in identifying any biases or disparities that may arise. This ongoing evaluation process is crucial for ensuring equitable performance.
Collaboration and Stakeholder Involvement: Involving diverse stakeholders, including clinicians, ethicists, and patients, in the development and evaluation of EHR foundation models can provide valuable insights into potential biases and ensure that the model is adapted to address the needs of diverse patient populations.
By implementing these strategies, the adaptability and generalizability of EHR foundation models can be enhanced to address potential biases and ensure equitable performance across diverse patient populations.
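The subgroup audits mentioned above can be made concrete with a small sketch. The code below is an illustration only, not the study's evaluation code: it computes a rank-based AUROC per subgroup and flags the model when the gap between the best and worst subgroup exceeds a threshold. The function names, the 0.05 threshold, and the toy records are all assumptions made for this example.

```python
from collections import defaultdict


def auroc(labels, scores):
    """Rank-based AUROC: fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Returns None if a class is missing.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def audit_by_subgroup(records, max_gap=0.05):
    """records: iterable of (subgroup, label, score) tuples.

    max_gap is a hypothetical tolerance chosen for this sketch.
    Returns (per-subgroup AUROC, largest gap, whether the gap exceeds max_gap).
    """
    grouped = defaultdict(lambda: ([], []))
    for group, label, score in records:
        grouped[group][0].append(label)
        grouped[group][1].append(score)
    per_group = {g: auroc(ys, ss) for g, (ys, ss) in grouped.items()}
    defined = [v for v in per_group.values() if v is not None]
    gap = max(defined) - min(defined) if defined else None
    return per_group, gap, (gap is not None and gap > max_gap)


# Toy, made-up predictions for two subgroups (not data from the study).
records = [
    ("pediatric", 1, 0.9), ("pediatric", 0, 0.2),
    ("pediatric", 1, 0.7), ("pediatric", 0, 0.4),
    ("adult", 1, 0.6), ("adult", 0, 0.7),
    ("adult", 1, 0.8), ("adult", 0, 0.3),
]
per_group, gap, flagged = audit_by_subgroup(records)
print(per_group, gap, flagged)
```

In practice the pairwise loop would be replaced by a library routine for large cohorts, and the tolerance would be set with clinical input rather than fixed in code.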

What are the potential risks and ethical considerations associated with sharing and adapting EHR foundation models, and how can these be effectively mitigated?

Sharing and adapting EHR foundation models come with several potential risks and ethical considerations that need to be addressed:
Privacy and Data Security: One of the primary concerns is the protection of patient privacy and ensuring data security when sharing EHR data. Adhering to strict data protection regulations, such as de-identification of data and encryption during transmission, can help mitigate these risks.
Bias and Fairness: Shared models may inherit biases present in the training data, leading to unfair outcomes for certain patient groups. Regular bias audits, fairness-aware training, and post-deployment monitoring can help mitigate these risks and ensure equitable performance.
Transparency and Accountability: Ensuring transparency in the model's decision-making process and establishing accountability mechanisms for any adverse outcomes are essential. Providing explanations for model predictions and establishing clear lines of responsibility can help address ethical concerns.
Informed Consent and Data Governance: Obtaining informed consent from patients for the use of their data in model training and ensuring robust data governance practices are crucial. Patients should be informed about how their data will be used and have control over its usage.
Regulatory Compliance: Adhering to regulatory requirements and standards, such as HIPAA in the United States or GDPR in the European Union, is essential when sharing and adapting EHR foundation models. Compliance with these regulations helps protect patient rights and data privacy.
Mitigating these risks involves a combination of technical, legal, and ethical measures, including robust data governance practices, transparency in model development, and ongoing monitoring for biases and fairness.

How can the minimal schema requirements for training EHR foundation models be defined to reduce the costs associated with mapping to a common data model, and what are the implications for broader adoption and interoperability?

Defining minimal schema requirements for training EHR foundation models can help reduce costs and facilitate broader adoption and interoperability:
Standardization: Establishing standardized data formats and vocabularies for EHR data can simplify the mapping process and ensure interoperability across different systems. Using common data models, such as OMOP CDM, can help streamline the training of foundation models.
Data Harmonization Tools: Developing tools and resources for data harmonization can aid in aligning disparate data sources to a common schema. Automated mapping algorithms and data transformation pipelines can reduce the manual effort required for schema mapping.
Open Data Sharing: Encouraging open data sharing initiatives within the healthcare community can promote the development of shared datasets that adhere to common schema standards. This can lower the barrier to entry for researchers and institutions looking to train EHR foundation models.
Community Collaboration: Collaborating with industry stakeholders, healthcare providers, and regulatory bodies to define and promote minimal schema requirements can foster a consensus on data standards. This collaboration can lead to increased adoption of common data models and facilitate data sharing for model training.
Cost-Effective Solutions: Investing in cost-effective solutions for data mapping and schema alignment, such as cloud-based data integration platforms or open-source tools, can help reduce the financial burden associated with training EHR foundation models. Leveraging existing resources and infrastructure can optimize the training process.
By defining minimal schema requirements, promoting standardization, and fostering collaboration within the healthcare community, the costs associated with mapping to a common data model can be minimized. This, in turn, can enhance interoperability, facilitate broader adoption of EHR foundation models, and drive advancements in healthcare AI.
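As a rough illustration of what a minimal schema could look like, the sketch below reduces each EHR record to four fields (patient, time, code, optional numeric value) and maps one site-specific row into that form. This is a hypothetical design for discussion, not the schema used by FMSM or by the study; the `Event` and `SiteMapping` names, the column names, and the example row are all invented for this example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical minimal record: who, when, what, and an optional value."""
    patient_id: str
    time: datetime
    code: str                      # vocabulary-prefixed, e.g. "LOINC/2951-2"
    value: Optional[float] = None


@dataclass(frozen=True)
class SiteMapping:
    """Per-site configuration: which local columns feed each Event field."""
    patient_col: str
    time_col: str
    code_col: str
    vocabulary: str                # prefix added to every local code
    value_col: Optional[str] = None


def to_event(row, mapping):
    """Convert one site-specific row (a dict) into a minimal Event."""
    raw = row.get(mapping.value_col) if mapping.value_col else None
    return Event(
        patient_id=str(row[mapping.patient_col]),
        time=datetime.fromisoformat(row[mapping.time_col]),
        code=f"{mapping.vocabulary}/{row[mapping.code_col]}",
        value=float(raw) if raw not in (None, "") else None,
    )


# Invented example: a lab row as it might appear in one site's export.
site = SiteMapping("mrn", "charttime", "itemid", "LOINC", "valuenum")
row = {"mrn": 123, "charttime": "2020-01-02T03:04:00",
       "itemid": "2951-2", "valuenum": "131"}
event = to_event(row, site)
print(event)
```

Under this kind of design, each institution would only need to write a small mapping configuration rather than a full conversion to a common data model, at the cost of leaving vocabulary harmonization as a separate step.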