The paper proposes DREAM, a visual decoding method grounded in the principles of the human visual system. DREAM aims to reverse the forward pathway from visual stimuli to fMRI recordings, using specialized components that decipher semantics, color, and depth cues from the fMRI data.
The key components of DREAM are:
Reverse Visual Association Cortex (R-VAC): This component replicates the inverse operations of the visual association cortex to extract semantic information from the fMRI data, represented as CLIP embeddings.
Reverse Parallel PKM (R-PKM): This component reverses the parallel parvo-, konio-, and magno-cellular (PKM) pathways, simultaneously predicting color and depth cues from the fMRI signals, represented as spatial color palettes and depth maps.
Guided Image Reconstruction: The deciphered semantics, color, and depth cues are then used to guide the image reconstruction process using a frozen Stable Diffusion model with T2I-Adapter.
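The data flow through the three components can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the learned decoders are replaced by fixed random linear maps, all names (`LinearDecoder`, `decode_cues`, `build_conditioning`) and dimensions are hypothetical, and the final diffusion step is only represented by the conditioning dictionary that a guided generator would consume.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class DecodedCues:
    semantics: np.ndarray  # CLIP-like embedding, shape (embed_dim,)
    color: np.ndarray      # spatial color palette, shape (size, size, 3)
    depth: np.ndarray      # depth map, shape (size, size)


class LinearDecoder:
    """Hypothetical stand-in for a learned decoder: a fixed random linear map."""

    def __init__(self, in_dim, out_shape, seed):
        rng = np.random.default_rng(seed)
        self.out_shape = out_shape
        out_dim = int(np.prod(out_shape))
        self.weights = rng.normal(scale=in_dim ** -0.5, size=(in_dim, out_dim))

    def __call__(self, fmri):
        return (fmri @ self.weights).reshape(self.out_shape)


def decode_cues(fmri, embed_dim=8, size=4):
    """Map one flattened fMRI vector to semantic, color, and depth cues."""
    n_voxels = fmri.shape[0]
    # R-VAC stand-in: fMRI -> semantic embedding.
    r_vac = LinearDecoder(n_voxels, (embed_dim,), seed=0)
    # R-PKM stand-in: one decoder emits color and depth together
    # (3 color channels + 1 depth channel), then the output is split.
    r_pkm = LinearDecoder(n_voxels, (size, size, 4), seed=1)
    color_depth = r_pkm(fmri)
    return DecodedCues(
        semantics=r_vac(fmri),
        color=color_depth[..., :3],
        depth=color_depth[..., 3],
    )


def build_conditioning(cues):
    """Package the cues as inputs for a guided generator (not run here)."""
    return {
        "semantic_embedding": cues.semantics,
        "color_guidance": cues.color,
        "depth_guidance": cues.depth,
    }


fmri = np.random.default_rng(42).normal(size=16)  # toy 16-voxel recording
conditioning = build_conditioning(decode_cues(fmri))
print({name: value.shape for name, value in conditioning.items()})
```

In the method described above, the semantic embedding would condition the frozen Stable Diffusion model, while the color and depth guidance would enter through T2I-Adapter; the sketch stops short of that step because it depends on pretrained weights.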
The experiments show that DREAM outperforms current state-of-the-art visual decoding methods in the consistency of appearance, structure, and semantics of the reconstructed images.