Core Concepts
Comparison of the predictive performance of random forest models using different outcome operationalizations (binary, multinomial, survival, competing risks) to predict the 7-day risk of central line-associated bloodstream infection (CLABSI) in the presence of competing events (discharge, death).
Abstract
The study compared the performance of random forest (RF) models using different outcome types (binary, multinomial, survival, competing risks) to predict the 7-day risk of central line-associated bloodstream infection (CLABSI) in the presence of competing events (discharge, death). The models were built using data from 27,478 hospital admissions with 30,862 catheter episodes (970 CLABSI, 1,466 deaths, 28,426 discharges).
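To make the four outcome operationalizations concrete, the sketch below encodes a single catheter episode in each of them. It is only an illustration under stated assumptions: the `Episode` class, the `encode` helper, the cause codes, and the "none" class for episodes with no event by day 7 are hypothetical and are not taken from the study's code.

```python
from dataclasses import dataclass

HORIZON_DAYS = 7  # prediction horizon named in the abstract

# Hypothetical cause codes; 0 is reserved for "no event by the horizon".
CAUSE_CODES = {"clabsi": 1, "death": 2, "discharge": 3}


@dataclass
class Episode:
    days_to_event: float  # time from prediction to the first event
    event: str            # "clabsi", "death", or "discharge"


def encode(ep: Episode, horizon: float = HORIZON_DAYS) -> dict:
    """Return one episode's outcome under four illustrative encodings."""
    within = ep.days_to_event <= horizon
    clabsi = within and ep.event == "clabsi"
    time = min(ep.days_to_event, horizon)
    return {
        # Binary: CLABSI within the horizon versus everything else.
        "binary": int(clabsi),
        # Multinomial: one class per event type, plus an assumed "none" class.
        "multinomial": ep.event if within else "none",
        # Survival: (time, status), competing events treated as censored.
        "survival": (time, int(clabsi)),
        # Competing risks: (time, cause code), every event type kept.
        "competing_risks": (time, CAUSE_CODES[ep.event] if within else 0),
    }


death_day3 = encode(Episode(days_to_event=3, event="death"))
late_clabsi = encode(Episode(days_to_event=10, event="clabsi"))
```

In this sketch a death on day 3 is a negative for the binary encoding, its own class for the multinomial encoding, a censored observation for the survival encoding, and cause 2 for the competing risks encoding.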
Key highlights:
- Binary, multinomial, and competing risks models had similar predictive performance, with AUROC up to 0.78 at day 5 of the catheter episode.
- Survival models that censored competing events at their occurrence time overestimated the CLABSI risk and had slightly lower discrimination.
- Keeping competing events in the risk set until the prediction horizon, as in Fine-Gray models, improved the performance of the survival models.
- Models using all outcome levels (multinomial, competing risks) had slightly higher AUPRC but were miscalibrated, likely due to optimizing the split statistic over multiple outcome classes.
- Binary and multinomial models had the lowest computation times.
- In the absence of censoring, complex modeling choices did not considerably improve predictive performance compared to a binary model for CLABSI prediction.
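The overestimation caused by censoring competing events, and the effect of keeping them in the risk set until the horizon, can be reproduced with a plain Kaplan-Meier calculation. The sketch below is a minimal illustration at the estimator level, not the random forest models from the study. The ten toy episodes are invented for the example, and the function names are hypothetical.

```python
CLABSI, DEATH, DISCHARGE = 1, 2, 3
HORIZON = 7  # days


def one_minus_km(times, events, horizon, cause=CLABSI):
    """1 - Kaplan-Meier at the horizon; every other event type is censored."""
    surv = 1.0
    for t in sorted(set(times)):
        if t > horizon:
            break
        at_risk = sum(1 for u in times if u >= t)
        d = sum(1 for u, e in zip(times, events) if u == t and e == cause)
        surv *= 1.0 - d / at_risk
    return 1.0 - surv


def observed_proportion(times, events, horizon, cause=CLABSI):
    """Without censoring, the cumulative incidence is a simple proportion."""
    hits = sum(1 for u, e in zip(times, events) if e == cause and u <= horizon)
    return hits / len(times)


def keep_until_horizon(times, events, horizon, cause=CLABSI):
    """Keep competing events in the risk set by moving their time to the horizon."""
    return [t if e == cause else max(t, horizon) for t, e in zip(times, events)]


# Invented toy data: 2 CLABSI, 1 death, 7 discharges, all within 7 days.
times = [1, 2, 2, 3, 3, 4, 5, 6, 6, 7]
events = [3, 3, 1, 3, 2, 3, 1, 3, 3, 3]

naive = one_minus_km(times, events, HORIZON)
truth = observed_proportion(times, events, HORIZON)
adjusted = one_minus_km(keep_until_horizon(times, events, HORIZON), events, HORIZON)
```

On this toy data the naive estimate is about 0.33, while the observed 7-day CLABSI proportion is 0.20. Discharged and deceased patients leave the risk set early, so each later CLABSI is divided by a shrinking denominator. Moving competing events to the horizon keeps the denominator intact and recovers 0.20.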
Stats
"Chemotherapy, antibiotics and CRP features are selected at first splits for models using all outcome levels, while binary and survival models favor TPN at early splits."
"The models using multiple outcome levels choose at the first splits in the trees antibiotics, other infection than BSI and ICU, while the other models favor TPN at early splits."