The paper introduces Wills Aligner, a robust multi-subject brain representation learning approach for visual decoding tasks. The key components are:
Voxel Alignment: The method registers each subject's fMRI data to a standardized brain template, using anatomical alignment to resolve anatomical differences across subjects so that all subjects share a common voxel space.
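To make this registration step concrete, here is a minimal sketch that projects each subject's volumetric fMRI onto the shared fsaverage surface template with nilearn. The file paths, the subject IDs, and the choice of fsaverage are illustrative assumptions, not the paper's exact preprocessing pipeline.

```python
# Minimal sketch: project each subject's volumetric fMRI onto a shared
# anatomical template (fsaverage) so all subjects live in one vertex space.
# Paths, subject IDs, and the fsaverage choice are illustrative assumptions.
import numpy as np
from nilearn import datasets, surface

# Fetch the fsaverage5 surface meshes (a standardized brain template).
fsaverage = datasets.fetch_surf_fsaverage(mesh="fsaverage5")

def align_subject(beta_nifti_path: str) -> np.ndarray:
    """Resample a subject's volumetric betas onto fsaverage vertices."""
    left = surface.vol_to_surf(beta_nifti_path, fsaverage.pial_left)
    right = surface.vol_to_surf(beta_nifti_path, fsaverage.pial_right)
    # Concatenate hemispheres: every subject now has the same feature length.
    return np.concatenate([left, right], axis=0)

# Hypothetical per-subject files; after alignment all arrays share one shape.
aligned = {s: align_subject(f"subj{s:02d}_betas.nii.gz") for s in (1, 2, 5, 7)}
```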
Mixture of Brain Experts (MoBE): MoBE is a plugin network that equips the backbone model to capture the distinct cognition patterns of individual subjects. It consists of multiple brain experts and a global router that selects the appropriate expert for each subject.
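A minimal PyTorch sketch of this design is shown below. The hidden sizes, the number of experts, the residual connection, and routing on a learned subject embedding are assumptions made for illustration rather than the paper's exact configuration.

```python
# Minimal sketch of a Mixture of Brain Experts (MoBE) plugin: several expert
# MLPs plus a global router that weighs them per subject. Sizes, expert count,
# and subject-embedding routing are illustrative assumptions.
import torch
import torch.nn as nn

class MoBE(nn.Module):
    def __init__(self, dim: int, n_subjects: int, n_experts: int = 4):
        super().__init__()
        # Each brain expert is a small MLP over backbone features.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
            for _ in range(n_experts)
        )
        # Global router: maps a subject embedding to a distribution over experts.
        self.subject_emb = nn.Embedding(n_subjects, dim)
        self.router = nn.Linear(dim, n_experts)

    def forward(self, h: torch.Tensor, subject_id: torch.Tensor) -> torch.Tensor:
        # h: (batch, dim) backbone features; subject_id: (batch,) indices.
        weights = torch.softmax(self.router(self.subject_emb(subject_id)), dim=-1)
        expert_out = torch.stack([e(h) for e in self.experts], dim=1)  # (B, E, D)
        mix = (weights.unsqueeze(-1) * expert_out).sum(dim=1)          # (B, D)
        return h + mix  # plugin acts as a residual correction to the backbone

# Usage: route a batch of features for two different subjects.
mobe = MoBE(dim=512, n_subjects=8)
out = mobe(torch.randn(2, 512), torch.tensor([0, 5]))  # -> (2, 512)
```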
Learning Strategy: The multi-subject learning is decoupled into two stages. First, the backbone network learns inter-subject commonality knowledge by aligning the fMRI representations to the semantic structure of the visual stimuli; then, the MoBE plugin learns the individual cognition patterns of each subject.
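The decoupling might be implemented roughly as follows. Using a CLIP-style contrastive (InfoNCE) loss as the semantic-alignment objective, freezing the backbone in stage two, and all hyperparameters are assumptions used to make the sketch concrete; the paper's actual objectives and schedule may differ.

```python
# Minimal sketch of the two-stage learning strategy. The InfoNCE objective,
# backbone freezing in stage two, and hyperparameters are assumptions.
import torch
import torch.nn.functional as F

def info_nce(fmri_feat, image_feat, temperature=0.07):
    """Contrastive loss aligning fMRI features to paired image embeddings."""
    fmri_feat = F.normalize(fmri_feat, dim=-1)
    image_feat = F.normalize(image_feat, dim=-1)
    logits = fmri_feat @ image_feat.t() / temperature
    targets = torch.arange(len(logits), device=logits.device)
    return F.cross_entropy(logits, targets)

def train_two_stage(backbone, mobe, loader, epochs=(10, 5)):
    # Stage 1: the backbone learns inter-subject commonality by aligning
    # fMRI representations to the semantic structure of the visual stimuli.
    opt = torch.optim.AdamW(backbone.parameters(), lr=1e-4)
    for _ in range(epochs[0]):
        for fmri, image_emb, subj in loader:
            loss = info_nce(backbone(fmri), image_emb)
            opt.zero_grad(); loss.backward(); opt.step()

    # Stage 2: freeze the shared backbone (an assumed reading of "decoupled")
    # and let the MoBE plugin fit each subject's individual cognition pattern.
    for p in backbone.parameters():
        p.requires_grad_(False)
    opt = torch.optim.AdamW(mobe.parameters(), lr=1e-4)
    for _ in range(epochs[1]):
        for fmri, image_emb, subj in loader:
            loss = info_nce(mobe(backbone(fmri), subj), image_emb)
            opt.zero_grad(); loss.backward(); opt.step()
```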
The experiments on the Natural Scenes Dataset (NSD) demonstrate that Wills Aligner achieves state-of-the-art performance in both coarse-grained visual classification and fine-grained visual retrieval tasks, outperforming existing single-subject and multi-subject methods. It also shows strong few-shot learning capabilities, effectively leveraging data from other subjects to boost the performance of subjects with limited training data.
Key ideas extracted from the source content by Guangyin Bao... at arxiv.org, 04-23-2024: https://arxiv.org/pdf/2404.13282.pdf