The paper addresses audio inpainting, the task of filling in missing parts of an audio signal. The authors first revisit a recent deep learning-based approach, Deep Prior Audio Inpainting (DPAI), and propose several modifications that improve its performance.
The main contribution of the paper is the adaptation of the Janssen algorithm, a state-of-the-art time-domain audio inpainting method, to the time-frequency domain. This novel method, called Janssen-TF, models the audio signal as an autoregressive process and estimates the missing time-frequency coefficients by minimizing the norm of the model error subject to the observed data.
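To make the autoregressive idea concrete, here is a minimal sketch of the classic time-domain alternation that Janssen-style methods build on: fit AR coefficients to the current signal estimate, then re-solve for the missing samples so that the prediction error is as small as possible while the observed samples stay fixed. This is not the Janssen-TF method itself, which the summary says works on time-frequency coefficients. The function names (`estimate_ar`, `fill_missing`, `janssen_inpaint`), the plain least-squares AR fit, the squared 2-norm, and the default parameters are all assumptions made for illustration.

```python
import numpy as np

def estimate_ar(x, p):
    """Least-squares fit of a = [1, a1..ap] so that x[n] + sum_k a[k] x[n-k] is small."""
    N = len(x)
    # Column k-1 holds the delayed signal x[n-k] for n = p..N-1.
    X = np.column_stack([x[p - k : N - k] for k in range(1, p + 1)])
    coef = np.linalg.lstsq(X, -x[p:], rcond=None)[0]
    return np.concatenate(([1.0], coef))

def prediction_matrix(a, N):
    """Matrix A with (A @ x)[i] = sum_k a[k] * x[i + p - k], the AR prediction error."""
    p = len(a) - 1
    A = np.zeros((N - p, N))
    for i in range(N - p):
        A[i, i : i + p + 1] = a[::-1]
    return A

def fill_missing(x, missing, a):
    """Minimize ||A x||_2 over the missing samples, keeping observed samples fixed."""
    x = np.array(x, dtype=float)
    A = prediction_matrix(a, len(x))
    rhs = -A[:, ~missing] @ x[~missing]
    x[missing] = np.linalg.lstsq(A[:, missing], rhs, rcond=None)[0]
    return x

def janssen_inpaint(x, missing, p=4, n_iter=10):
    """Alternate AR estimation and gap filling (illustrative time-domain sketch)."""
    missing = np.asarray(missing, dtype=bool)
    x = np.array(x, dtype=float)
    x[missing] = 0.0  # simple zero initialization of the gap (an assumption)
    for _ in range(n_iter):
        a = estimate_ar(x, p)
        x = fill_missing(x, missing, a)
    return x
```

A call such as `janssen_inpaint(signal, mask, p=32)` takes a boolean `mask` that is `True` on the missing samples. When the AR coefficients are known exactly, a single call to `fill_missing` already recovers a signal that obeys the model, which is why the quality of the coefficient estimate drives the result.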
The authors compare the performance of Janssen-TF, the DPAI variants, and the original Janssen time-domain method using both objective metrics (signal-to-noise ratio and objective difference grade) and a subjective listening test. The results show that Janssen-TF significantly outperforms the competing methods in all the considered measures, except for the longest gaps.
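For reference, the signal-to-noise ratio mentioned above is commonly computed as the ratio of reference energy to reconstruction-error energy, in decibels. The sketch below assumes that definition. The optional `mask` argument, which restricts the computation to the inpainted samples, is an assumption: this summary does not say whether the paper evaluates SNR over the whole signal or only over the gaps.

```python
import numpy as np

def snr_db(reference, estimate, mask=None):
    """SNR in dB: 10 * log10(||reference||^2 / ||reference - estimate||^2)."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    if mask is not None:
        # Restrict the comparison to selected samples, e.g. the gap (assumed option).
        mask = np.asarray(mask, dtype=bool)
        reference, estimate = reference[mask], estimate[mask]
    noise_energy = np.sum((reference - estimate) ** 2)
    if noise_energy == 0.0:
        return float("inf")  # perfect reconstruction
    return float(10.0 * np.log10(np.sum(reference ** 2) / noise_energy))
```

Higher values mean the reconstruction is closer to the original. The objective difference grade is a perceptual measure and cannot be reduced to a short formula like this one.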
The paper also discusses the computational complexity of the proposed methods and reports that Janssen-TF is more efficient than the deep learning-based DPAI approach.