Core Concepts
The ensemble deep learning algorithm developed through the ISLES'22 challenge can accurately detect and segment ischemic stroke lesions across diverse clinical and imaging scenarios, outperforming individual state-of-the-art algorithms and achieving performance comparable to expert neuroradiologists.
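To make the idea of an ensemble concrete, here is a minimal sketch of how several segmentation outputs could be fused. The summary does not say how the ensemble combines its member algorithms. Voxel-wise majority voting is assumed here purely for illustration, and the function name `majority_vote` and the toy masks are hypothetical.

```python
def majority_vote(masks):
    """Voxel-wise majority vote over equally sized binary masks (flat lists).

    Assumption: each member model contributes one equally weighted vote.
    """
    if not masks:
        raise ValueError("need at least one mask")
    n_models = len(masks)
    length = len(masks[0])
    if any(len(m) != length for m in masks):
        raise ValueError("all masks must have the same length")
    # A voxel is labeled lesion when more than half of the models mark it.
    return [1 if 2 * sum(votes) > n_models else 0 for votes in zip(*masks)]


# Toy flattened masks from three hypothetical member models.
model_a = [1, 1, 0, 0, 1]
model_b = [1, 0, 0, 1, 1]
model_c = [0, 1, 0, 0, 1]

print(majority_vote([model_a, model_b, model_c]))  # [1, 1, 0, 0, 1]
```

With an odd number of members, a strict majority never ties, which is one reason this fusion rule is often used with three models.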
Abstract
The study presents the development and evaluation of a robust ensemble deep learning algorithm for ischemic stroke lesion segmentation, derived from the ISLES'22 challenge.
Key highlights:
The ISLES'22 challenge provided a large, diverse dataset of 400 patient scans from multiple medical centers, enabling the development of generalizable algorithms.
The ensemble algorithm combines the strengths of top-performing individual algorithms from the challenge, achieving superior ischemic lesion detection and segmentation accuracy (median Dice score: 0.82, median lesion-wise F1 score: 0.86) compared to individual solutions.
The ensemble algorithm generalizes well across imaging centers, lesion sizes, stroke phases, lesion patterns, and affected vascular territories.
In a Turing-like test, neuroradiologists consistently preferred the algorithm's segmentations over manual expert delineations, rating them as more complete and more correct.
Validation on a large external dataset (N=1686) confirmed the ensemble algorithm's generalizability and its ability to derive clinically relevant biomarkers, such as lesion volumes, that correlate well with clinical stroke scores.
The study showcases the potential of challenge-derived algorithms to extend beyond the initial challenge objectives and demonstrate real-world clinical applicability for improved stroke diagnosis and patient care.
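The two metrics quoted in the highlights above, the Dice score and the lesion-wise F1 score, can be sketched as follows. This is a simplified 2D illustration and not the challenge's evaluation code. It assumes 4-connectivity for separating lesions and counts a lesion as detected when it overlaps the other mask by at least one voxel. The function names and toy masks are hypothetical.

```python
from collections import deque


def dice_score(pred, truth):
    """Dice = 2|P ∩ T| / (|P| + |T|) for 2D binary masks (lists of rows)."""
    inter = sum(p & t for pr, tr in zip(pred, truth) for p, t in zip(pr, tr))
    total = sum(map(sum, pred)) + sum(map(sum, truth))
    return 1.0 if total == 0 else 2.0 * inter / total


def label_lesions(mask):
    """Split a binary mask into lesions, each a set of (row, col) voxels."""
    seen, lesions = set(), []
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if not value or (r, c) in seen:
                continue
            component, queue = set(), deque([(r, c)])
            seen.add((r, c))
            while queue:  # breadth-first flood fill, 4-connectivity
                y, x = queue.popleft()
                component.add((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    inside = 0 <= ny < len(mask) and 0 <= nx < len(mask[ny])
                    if inside and mask[ny][nx] and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            lesions.append(component)
    return lesions


def lesion_f1(pred, truth):
    """Lesion-wise F1, assuming any overlap of >= 1 voxel counts as a hit."""
    pred_lesions, true_lesions = label_lesions(pred), label_lesions(truth)
    pred_voxels = set().union(*pred_lesions)
    true_voxels = set().union(*true_lesions)
    tp = sum(1 for lesion in true_lesions if lesion & pred_voxels)
    fn = len(true_lesions) - tp
    fp = sum(1 for lesion in pred_lesions if not lesion & true_voxels)
    denom = 2 * tp + fp + fn
    return 1.0 if denom == 0 else 2.0 * tp / denom


truth = [[1, 1, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 1, 1]]
pred = [[1, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 0, 0]]

print(round(dice_score(pred, truth), 3))  # 0.571
print(lesion_f1(pred, truth))             # 0.5
```

In the toy case one true lesion is found, one is missed, and one predicted lesion is spurious. The lesion-wise F1 is therefore 0.5, while the Dice score reflects voxel overlap only.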
Stats
Ischemic stroke lesions smaller than 5 ml have a median volume of 0.9 ml, those between 5 and 20 ml have a median of 24.5 ml, and those larger than 20 ml have a median of 137.9 ml.
The ensemble algorithm achieves a Pearson correlation of 0.97 between the estimated lesion volumes and the manually delineated ones in the external dataset.
The ensemble algorithm's estimated lesion volumes show slightly higher correlations with the National Institutes of Health Stroke Scale (NIHSS) at admission (r=0.55) and the modified Rankin Scale (mRS) at 90-day follow-up (r=0.41) compared to the manually delineated lesion volumes (NIHSS r=0.54, mRS r=0.39).
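The volume statistics above rest on two simple computations: converting a segmented voxel count into millilitres, and correlating two series of volumes. The sketch below shows both under stated assumptions. The voxel spacing, voxel counts, and volume lists are made-up illustration values, not data from the study, and the function names are hypothetical.

```python
import math


def lesion_volume_ml(n_voxels, spacing_mm):
    """Lesion volume in ml from a voxel count and voxel spacing in mm."""
    dx, dy, dz = spacing_mm
    return n_voxels * dx * dy * dz / 1000.0  # 1 ml = 1000 mm^3


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


# Hypothetical example: 4500 lesion voxels at 2 x 2 x 2 mm spacing.
print(lesion_volume_ml(4500, (2.0, 2.0, 2.0)))  # 36.0

# Hypothetical predicted vs. manually delineated volumes in ml.
predicted = [0.8, 12.0, 36.0, 140.0]
manual = [0.9, 10.5, 40.0, 137.9]
print(round(pearson_r(predicted, manual), 3))
```

The same `pearson_r` function could be applied to volumes paired with clinical scores such as NIHSS. Because those scales are ordinal, a rank-based correlation may be a reasonable alternative.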
Quotes
"The ensemble algorithm exhibits statistically significantly higher ratings than the experts (p-value = 0.02 when considering the segmentation completeness and p-value < 0.001 when considering the segmentation correctness, Wilcoxon signed-rank tests)."
"The performance achieved on the external Johns Hopkins dataset closely aligns with the results obtained on the ISLES'22 test set. Specifically, the median ± interquartile range Dice scores and lesion detection F1 scores are 0.82 ± 0.15 and 0.86 ± 0.33, respectively."