The paper focuses on the task of audio inpainting, which aims to fill in missing parts of an audio signal. The authors first revisit a recent deep learning-based approach called Deep Prior Audio Inpainting (DPAI) and propose several modifications to improve its performance.
The main contribution of the paper is the adaptation of the Janssen algorithm, a state-of-the-art time-domain audio inpainting method, to the time-frequency domain. This novel method, called Janssen-TF, models the audio signal as an autoregressive process and estimates the missing time-frequency coefficients by minimizing the norm of the model error while keeping the solution consistent with the observed data.
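To make the autoregressive idea concrete, here is a minimal sketch of the classical time-domain Janssen-style iteration, not the authors' Janssen-TF method. It alternates between fitting AR coefficients to the current signal estimate and re-solving for the missing samples so that the prediction error is as small as possible while the observed samples stay fixed. The function names, the model order, the iteration count, and the test signal are all illustrative assumptions.

```python
import numpy as np

def estimate_ar(x, p):
    """Least-squares AR(p) fit so that x[n] + sum_k a[k] * x[n-k] is close to 0."""
    n_samples = len(x)
    # Columns hold the delayed signals x[n-1], ..., x[n-p] for n = p..N-1.
    delayed = np.column_stack([x[p - k : n_samples - k] for k in range(1, p + 1)])
    coef, *_ = np.linalg.lstsq(delayed, -x[p:], rcond=None)
    return np.concatenate(([1.0], coef))  # a[0] is fixed to 1

def prediction_matrix(a, n_samples):
    """Banded matrix A such that (A @ x)[i] is the AR prediction error at n = i + p."""
    p = len(a) - 1
    A = np.zeros((n_samples - p, n_samples))
    for i in range(n_samples - p):
        A[i, i : i + p + 1] = a[::-1]
    return A

def janssen_inpaint(x_obs, missing, p=2, n_iter=10):
    """Alternate AR estimation and missing-sample estimation (hypothetical helper)."""
    x = np.array(x_obs, dtype=float)
    x[missing] = 0.0          # initial guess: zero-filled gap
    known = ~missing
    for _ in range(n_iter):
        a = estimate_ar(x, p)
        A = prediction_matrix(a, len(x))
        # Minimize ||A_m x_m + A_k x_k||^2 over the missing samples x_m only.
        rhs = -A[:, known] @ x[known]
        x_m, *_ = np.linalg.lstsq(A[:, missing], rhs, rcond=None)
        x[missing] = x_m
    return x

# Illustrative data: a pure sinusoid with an 8-sample gap (assumed, not from the paper).
n = np.arange(400)
x_true = np.sin(2 * np.pi * 0.12 * n)
missing = np.zeros(400, dtype=bool)
missing[200:208] = True
x_obs = np.where(missing, 0.0, x_true)

x_rec = janssen_inpaint(x_obs, missing, p=2, n_iter=10)
err_zero_fill = np.sum(x_true[missing] ** 2)
err_rec = np.sum((x_rec[missing] - x_true[missing]) ** 2)
```

In this sketch the constraint "subject to the observed data" is enforced simply by never modifying the known samples. In the time-frequency setting described above, the observed quantities are time-frequency coefficients rather than samples, so the consistency constraint takes a different form that this example does not attempt to reproduce.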
The authors compare Janssen-TF, the DPAI variants, and the original time-domain Janssen method using objective metrics (signal-to-noise ratio and objective difference grade) and a subjective listening test. The results show that Janssen-TF significantly outperforms the competing methods on all considered measures, except for the longest gaps.
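For reference, the signal-to-noise ratio mentioned above is commonly computed as the ratio of reference energy to error energy, in decibels. The snippet below is a minimal sketch of that standard definition; whether the paper evaluates it over the whole signal or only over the gaps is not stated in this summary, so the choice of evaluation region is left to the caller. The function name and the toy data are assumptions.

```python
import numpy as np

def snr_db(reference, estimate):
    """SNR in dB: 10 * log10(||reference||^2 / ||reference - estimate||^2)."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    noise = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(noise ** 2))

# Toy check: an estimate that is 90% of the reference has an error of 10%
# of the reference amplitude, i.e. an energy ratio of 100.
reference = np.array([1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0])
estimate = 0.9 * reference
snr_value = snr_db(reference, estimate)
```

To score only the restored region, one would pass the gap samples of both signals, for example `snr_db(x_true[mask], x_rec[mask])` with a hypothetical boolean `mask` marking the gap.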
The paper also discusses the computational complexity of the proposed methods and finds Janssen-TF to be more efficient than the deep learning-based DPAI approach.
Key insights distilled from
by Ondř... at arxiv.org, 09-11-2024
https://arxiv.org/pdf/2409.06392.pdf