This paper presents a novel approach to enhance image classification of esophagogastroduodenoscopy (EGD) images, which are used to assess the degree of cleanliness and visibility of the gastric mucosa. The authors tackle the challenges posed by small and imbalanced datasets, which are common in medical imaging, by leveraging synthetic data augmentation through class-specific Variational Autoencoders (VAEs).
At the core of the methodology, a separate VAE is trained for each cleanliness class, and the synthetic images it generates are added to the real training data of that class before the classifiers are trained; a sketch of this setup is given below.
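Since the authors' code is not part of this summary, the following is only a minimal sketch of what class-specific VAE augmentation could look like in PyTorch; the network depth, image size (128x128), latent dimension, and loss weighting are illustrative assumptions, not the paper's exact configuration. One such model would be trained per class, and `generate_synthetic` would then be called to enlarge that class's training pool.

```python
# Minimal sketch of a class-specific VAE used for synthetic augmentation.
# Architecture sizes, latent dimension, and loss weights are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvVAE(nn.Module):
    def __init__(self, latent_dim=64):
        super().__init__()
        # Encoder: 3x128x128 image -> latent mean and log-variance.
        self.enc = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),    # 64x64
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),   # 32x32
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),  # 16x16
            nn.Flatten(),
        )
        self.fc_mu = nn.Linear(128 * 16 * 16, latent_dim)
        self.fc_logvar = nn.Linear(128 * 16 * 16, latent_dim)
        # Decoder: latent vector -> reconstructed 3x128x128 image.
        self.fc_dec = nn.Linear(latent_dim, 128 * 16 * 16)
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def encode(self, x):
        h = self.enc(x)
        return self.fc_mu(h), self.fc_logvar(h)

    def reparameterize(self, mu, logvar):
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def decode(self, z):
        h = self.fc_dec(z).view(-1, 128, 16, 16)
        return self.dec(h)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decode(z), mu, logvar

def vae_loss(recon, x, mu, logvar, beta=1.0):
    # Reconstruction term plus KL divergence to the standard normal prior.
    rec = F.mse_loss(recon, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + beta * kld

def generate_synthetic(vae, n_samples, latent_dim=64, device="cpu"):
    # Sample the latent prior and decode to get synthetic images for one class.
    vae.eval()
    with torch.no_grad():
        z = torch.randn(n_samples, latent_dim, device=device)
        return vae.decode(z)
```

Training one VAE per class, rather than a single conditional model, keeps each generator focused on the appearance of its own cleanliness grade, which is what allows the minority class to be oversampled directly.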
The authors evaluate the proposed approach using two prominent image classification architectures, EfficientNet-V2 and ResNet-50, on a dataset of 321 EGD images. The results show significant improvements in overall accuracy, precision, recall, and F1-score, with the most pronounced gains on the challenging, underrepresented class (class 1): its accuracy increased from 64.05% to 82.06% for EfficientNet-V2 and from 52.18% to 75.12% for ResNet-50 when the combined real and synthetic data was used.
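The evaluation setup could be reproduced along the lines of the sketch below. The `torchvision` constructors for EfficientNet-V2 and ResNet-50 are real, but the dataset wiring, number of classes, and hyperparameters are placeholders rather than the authors' settings.

```python
# Sketch: fine-tune a classifier on the pooled real + synthetic data and
# report per-class accuracy. Dataset objects and class count are assumed.
import torch
import torch.nn as nn
from torch.utils.data import ConcatDataset, DataLoader
from torchvision import models

def build_classifier(name="efficientnet_v2_s", num_classes=3):
    if name == "efficientnet_v2_s":
        net = models.efficientnet_v2_s(weights="IMAGENET1K_V1")
        net.classifier[-1] = nn.Linear(net.classifier[-1].in_features, num_classes)
    else:  # ResNet-50
        net = models.resnet50(weights="IMAGENET1K_V1")
        net.fc = nn.Linear(net.fc.in_features, num_classes)
    return net

def per_class_accuracy(model, loader, num_classes, device="cpu"):
    # Accuracy computed separately for each class, which is what exposes
    # the gain on the underrepresented class.
    correct = torch.zeros(num_classes)
    total = torch.zeros(num_classes)
    model.eval()
    with torch.no_grad():
        for x, y in loader:
            preds = model(x.to(device)).argmax(dim=1).cpu()
            for c in range(num_classes):
                mask = y == c
                total[c] += mask.sum()
                correct[c] += (preds[mask] == c).sum()
    return (correct / total.clamp(min=1)).tolist()

# Usage sketch: real_ds and synthetic_ds are assumed (image, label) Datasets.
# train_loader = DataLoader(ConcatDataset([real_ds, synthetic_ds]),
#                           batch_size=16, shuffle=True)
# model = build_classifier("efficientnet_v2_s", num_classes=3)
```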
The authors also provide a visual analysis showing the expansion of the feature space for each class after synthetic data augmentation, confirming the effectiveness of the technique in addressing data scarcity and imbalance. The proposed methodology represents a significant advance in medical image analysis, particularly for applications where annotated datasets are limited, and could potentially be transferred to other medical imaging domains.
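The summary does not state which projection the authors used for this visual analysis; the sketch below uses t-SNE on penultimate-layer features as one plausible way to produce such before/after plots, and it assumes a ResNet-style model with a `.fc` head.

```python
# Sketch: project classifier features to 2D to compare the class-wise feature
# space before and after augmentation. t-SNE is an illustrative choice; the
# paper's exact projection method is not specified in this summary.
import torch
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def extract_features(model, loader, device="cpu"):
    # Collect penultimate-layer features by temporarily swapping out the head
    # (assumes a ResNet-style .fc attribute).
    feats, labels = [], []
    head, model.fc = model.fc, torch.nn.Identity()
    model.eval()
    with torch.no_grad():
        for x, y in loader:
            feats.append(model(x.to(device)).cpu())
            labels.append(y)
    model.fc = head
    return torch.cat(feats).numpy(), torch.cat(labels).numpy()

def plot_feature_space(features, labels, title):
    emb = TSNE(n_components=2, init="pca", perplexity=30).fit_transform(features)
    for c in sorted(set(labels.tolist())):
        pts = emb[labels == c]
        plt.scatter(pts[:, 0], pts[:, 1], s=10, label=f"class {c}")
    plt.legend()
    plt.title(title)
    plt.show()
```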