The paper introduces SPMamba, a novel speech separation architecture that integrates State Space Models (SSMs) into the TF-GridNet framework. Its key contributions are as follows.
The paper first provides background on Mamba, a recent SSM-based sequence model, and its advantages over CNN-based and Transformer-based models. It then introduces the SPMamba architecture, which replaces the Transformer component of TF-GridNet with a bidirectional Mamba module to capture long-range dependencies more effectively.
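The core idea of the bidirectional module can be illustrated with a minimal sketch. The code below is an assumption-laden simplification: it uses a plain linear state-space recurrence in place of Mamba's input-dependent selective scan, and the function names (`ssm_scan`, `bidirectional_ssm`) are illustrative, not from the paper. It only shows how scanning a sequence in both directions and concatenating the outputs gives each time step access to past and future context.

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Simple linear SSM recurrence: h_t = A h_{t-1} + B x_t, y_t = C h_t.

    This stands in for Mamba's selective scan, whose A, B, C are
    input-dependent; here they are fixed for clarity.
    """
    T, _ = x.shape
    h = np.zeros(A.shape[0])
    ys = []
    for t in range(T):
        h = A @ h + B @ x[t]
        ys.append(C @ h)
    return np.stack(ys)

def bidirectional_ssm(x, A, B, C):
    """Scan forward and backward over time, concatenate the features."""
    fwd = ssm_scan(x, A, B, C)
    bwd = ssm_scan(x[::-1], A, B, C)[::-1]  # reverse, scan, reverse back
    return np.concatenate([fwd, bwd], axis=-1)

# Toy example with made-up dimensions
rng = np.random.default_rng(0)
T, d_in, d_state, d_out = 8, 4, 16, 4
x = rng.standard_normal((T, d_in))
A = 0.9 * np.eye(d_state)                      # stable state transition
B = 0.1 * rng.standard_normal((d_state, d_in))
C = 0.1 * rng.standard_normal((d_out, d_state))
y = bidirectional_ssm(x, A, B, C)
print(y.shape)  # (8, 8): d_out features from each direction, concatenated
```

In the actual architecture, such a bidirectional block sits inside TF-GridNet's time-frequency processing path where the Transformer layer used to be.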
The authors construct a multi-speaker speech separation dataset with reverberation and noise, and conduct comprehensive experiments to evaluate SPMamba against various state-of-the-art speech separation models. The results show that SPMamba outperforms all compared models on the SDRi and SI-SNRi metrics while maintaining significantly lower computational complexity.
Key insights distilled from the paper by Kai Li, Guo C... (arxiv.org, 04-03-2024): https://arxiv.org/pdf/2404.02063.pdf