Core Concepts
One-Prompt Segmentation combines the strengths of one-shot and interactive segmentation methods to enable zero-shot generalization across diverse medical imaging tasks, requiring only a single prompted sample during inference.
Abstract
The paper introduces a novel paradigm called "One-Prompt Segmentation" for universal medical image segmentation. The key idea is to train a foundation model that can adapt to unseen tasks by leveraging a single prompted sample during inference, without the need for retraining or fine-tuning.
The authors first gather a large-scale dataset of 78 open-source medical imaging datasets, covering a wide range of organs, tissues, and anatomies. They then train the One-Prompt Model, which consists of an image encoder and a sequence of One-Prompt Former modules as the decoder. The One-Prompt Former efficiently integrates the prompted template feature with the query feature at multiple scales.
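The multi-scale fusion described above can be sketched as residual cross-attention between the query feature pyramid and the prompted template feature pyramid. This is a minimal illustrative sketch, not the paper's architecture: the function names, the plain dot-product attention, and the residual fusion rule are all simplifying assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(query, template):
    """Attend query tokens (Nq, C) over template tokens (Nt, C)."""
    scores = query @ template.T / np.sqrt(query.shape[-1])
    return softmax(scores, axis=-1) @ template  # (Nq, C)

def one_prompt_former(query_pyramid, template_pyramid):
    """Fuse the prompted template feature into the query feature
    at every scale of the feature pyramid (hypothetical fusion rule)."""
    return [q + cross_attend(q, t)
            for q, t in zip(query_pyramid, template_pyramid)]
```

The key point the sketch captures is that fusion happens per scale, so coarse template context and fine template detail both reach the decoder.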
The paper also introduces four prompt types - Click, BBox, Doodle, and SegLab (a full segmentation label supplied as the prompt) - to cater to the diverse needs of medical image segmentation tasks. These prompts are annotated by a team of clinicians across the dataset.
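The four prompt types can be thought of as one interface with four payloads. The type and field names below are illustrative assumptions, not the paper's code:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any

class PromptType(Enum):
    CLICK = "click"    # a single point on the target structure
    BBOX = "bbox"      # a bounding box around the target
    DOODLE = "doodle"  # a free-hand scribble inside the target
    SEGLAB = "seglab"  # a full segmentation label (mask) of the target

@dataclass
class Prompt:
    kind: PromptType
    data: Any  # point coords, box corners, scribble mask, or label mask

# Example: a click prompt at pixel (120, 64)
click = Prompt(PromptType.CLICK, (120, 64))
```

Treating the prompt as (type, payload) is what lets one model accept all four annotation styles with a single template-encoding path.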
The authors extensively evaluate the One-Prompt Model on 14 previously unseen medical imaging tasks, demonstrating its superior zero-shot segmentation capabilities compared to a wide range of related methods, including few-shot and interactive segmentation models. The model exhibits robust performance and stability when provided with different prompted templates during inference.
The paper highlights the significant practical benefits of the One-Prompt Segmentation approach, including its user-friendly interface, cost-effectiveness, and potential for building automatic pipelines in clinical settings.
Stats
The paper trains the One-Prompt Model on 64 open-source medical imaging datasets and evaluates it on 14 previously unseen datasets.
The authors collected over 3,000 clinician-labeled prompts across the datasets.
Quotes
"One-Prompt Segmentation combines the strengths of one-shot and interactive methods to meet the real clinical requirements."
"Our model is trained on 64 datasets, with clinicians prompting a part of the data."
"One-Prompt Segmentation learns a more general function y = f_θ(x_j^d, k^d) performing on any task d, where k^d = {x_c^d, p_c^d} comprises one fixed template image x_c^d and a paired prompt p_c^d available for task d."
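Read literally, the quoted formula says that one fixed prompted template k^d = {x_c^d, p_c^d} is reused for every query x_j^d in task d. A minimal sketch of that inference loop (f_theta and the argument shapes are placeholders, not the paper's API):

```python
def segment_task(f_theta, queries, template):
    """One-prompt inference: y = f_theta(x_j^d, k^d) for each query x_j^d,
    where k^d = (x_c^d, p_c^d) is a single fixed prompted template."""
    x_c, p_c = template
    return [f_theta(x_j, x_c, p_c) for x_j in queries]
```

This is what distinguishes the paradigm from interactive segmentation, where a clinician must prompt each image, and from few-shot methods, which need several labeled examples per task.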