The paper introduces a novel source model for determined multichannel blind audio source separation (MBASS) based on nonnegative block-term decomposition (NBTD). The key highlights are:
The NBTD-based source model defines blocks as outer products of vectors (clusters) and matrices, providing interpretable latent vectors and enabling straightforward integration of orthogonality constraints to ensure independence among source images.
Experimental results demonstrate that the proposed method, called cILRMA, outperforms existing ILRMA-based methods such as ILRMA, tILRMA, GGDILRMA, and mILRMA in anechoic conditions and surpasses the original ILRMA in simulated reverberant environments.
The performance of cILRMA improves with increasing values of the parameter O, which controls the number of blocks in the NBTD decomposition, suggesting that a higher O leads to a more accurate source model.
Compared to ILRMA, cILRMA consistently achieves around 4 dB higher improvements in signal-to-distortion ratio (SDR) and signal-to-interference ratio (SIR), regardless of the number of bases used in the source model.
cILRMA converges within approximately 100 iterations, reaching a separation quality that exceeds that of ILRMA.
Overall, the proposed cILRMA method leverages the advantages of NBTD to effectively capture the intricate structure of multichannel audio signals, leading to superior blind source separation performance.
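The block structure mentioned in the highlights, where each block is the outer product of a vector and a matrix, can be illustrated with a small NumPy sketch. This is only a generic nonnegative block-term reconstruction under stated assumptions, not the paper's actual cILRMA update rules: the names `nbtd_reconstruct`, `Z`, `T`, `V`, and `K`, the tensor layout, and the choice to factor each block's matrix as a low-rank product are all hypothetical and chosen for illustration. Only `O` (the number of blocks) comes from the summary.

```python
import numpy as np


def nbtd_reconstruct(Z, T, V):
    """Sum of O blocks, each the outer product of a vector and a matrix.

    Assumed (hypothetical) shapes:
      Z: (N, O)    one nonnegative vector of length N per block
      T: (O, I, K) left factor of each block's matrix
      V: (O, K, J) right factor of each block's matrix
    Returns a nonnegative tensor of shape (I, J, N).
    """
    # Matrix part of each block: T[o] @ V[o], shape (O, I, J).
    block_matrices = np.einsum("oik,okj->oij", T, V)
    # Outer product of each matrix with its vector Z[:, o], summed over blocks.
    return np.einsum("oij,no->ijn", block_matrices, Z)


# Toy sizes: I x J matrices, N entries per vector, O blocks, K bases per block.
rng = np.random.default_rng(0)
I, J, N, O, K = 6, 5, 2, 3, 2
Z = rng.random((N, O))
T = rng.random((O, I, K))
V = rng.random((O, K, J))

R = nbtd_reconstruct(Z, T, V)
print(R.shape)  # (6, 5, 2)
```

Because every factor is nonnegative, the reconstructed tensor is nonnegative as well. In this sketch, increasing `O` simply adds more blocks to the sum, which gives the model more capacity.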
Source: arxiv.org