Zhu, V., Ji, Z., Guo, D., Wang, P., Xia, Y., Lu, L., Ye, X., Zhu, W., & Jin, D. (2024). Low-Rank Continual Pyramid Vision Transformer: Incrementally Segment Whole-Body Organs in CT with Light-Weighted Adaptation. arXiv preprint arXiv:2410.04689.
This research addresses continual semantic segmentation (CSS) in medical imaging: enabling a pre-trained deep learning model to dynamically extend its segmentation capabilities to new organs without access to previous training data.
The researchers propose a novel architecture-based CSS method called LoCo-PVT, which utilizes a pre-trained 3D Pyramid Vision Transformer (PVT) as the backbone and incorporates Low-Rank Adaptation (LoRA) to incrementally adapt the model for new organ segmentation tasks. The PVT backbone is initially trained on a large dataset (TotalSegmentator) and then frozen. For subsequent datasets with new organs, LoRA modules are introduced in specific layers of the PVT, allowing for parameter-efficient fine-tuning without modifying the pre-trained weights. The method is evaluated on four datasets covering different body parts, with a total of 121 organs.
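The core idea of adding low-rank adapters to a frozen layer can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it uses a single 2D linear layer rather than a 3D PVT, and the class name `LoRALinear`, the task label `"new_organs"`, the rank, and the scaling factor `alpha` are all hypothetical choices made for illustration.

```python
import numpy as np


class LoRALinear:
    """A frozen linear layer with optional per-task low-rank adapters.

    Hypothetical illustration only; not the LoCo-PVT code.
    """

    def __init__(self, d_in, d_out, rank=4, alpha=8.0, seed=0):
        self.rng = np.random.default_rng(seed)
        # Stand-in for a pre-trained weight matrix; marked read-only to mimic freezing.
        self.W = self.rng.normal(size=(d_out, d_in))
        self.W.setflags(write=False)
        self.rank = rank
        self.scale = alpha / rank  # common LoRA scaling convention (assumed here)
        self.adapters = {}

    def add_task(self, name):
        """Attach a new adapter pair (A, B) for a new task."""
        d_out, d_in = self.W.shape
        self.adapters[name] = {
            # A starts small and random, B starts at zero, so the adapter
            # initially leaves the layer's output unchanged.
            "A": self.rng.normal(scale=0.01, size=(self.rank, d_in)),
            "B": np.zeros((d_out, self.rank)),
        }

    def forward(self, x, task=None):
        """Compute x W^T, plus the low-rank update x A^T B^T if a task is given."""
        y = x @ self.W.T
        if task is not None:
            ad = self.adapters[task]
            y = y + self.scale * (x @ ad["A"].T) @ ad["B"].T
        return y

    def adapter_param_count(self, task):
        """Number of trainable parameters in one task's adapter."""
        return sum(p.size for p in self.adapters[task].values())


layer = LoRALinear(d_in=64, d_out=64, rank=4)
layer.add_task("new_organs")
print(layer.adapter_param_count("new_organs"), "adapter parameters;",
      layer.W.size, "frozen parameters")
```

In this toy setting the adapter holds rank × (d_in + d_out) = 512 parameters against 4,096 frozen ones, and calling `forward` without a task name reproduces the original layer's output exactly, since the frozen weights are never written to.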
The study demonstrates the efficacy of combining a pre-trained PVT with LoRA for continual whole-body organ segmentation. The proposed LoCo-PVT method effectively addresses the challenges of catastrophic forgetting and model parameter explosion, enabling the incremental learning of new organs without compromising the segmentation accuracy of previously learned structures.
This research advances continual learning in medical image segmentation by offering a practical, parameter-efficient approach to building dynamically extensible models. The proposed LoCo-PVT framework could support more versatile and adaptable clinical tools for automated organ segmentation.