Only the Per-Channel Energy Normalisation (PCEN) layer of the LEArnable Front-end (LEAF) model learns during training, while the Gabor filterbank and Gaussian low-pass filters remain unchanged. Adapting the PCEN layer using a small amount of noisy data can improve the performance of a LEAF model trained on clean speech when deployed in noisy environments.
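To make the role of the PCEN layer concrete, the following is a minimal NumPy sketch of a standard PCEN forward pass, not the LEAF implementation itself. The function name `pcen`, the argument names, and the default values are assumptions chosen for illustration. In a LEAF-style setup, `alpha`, `delta`, `r` and `s` would be the per-channel quantities adjusted when adapting to noisy data.

```python
import numpy as np

def pcen(energy, s=0.04, alpha=0.96, delta=2.0, r=0.5, eps=1e-6):
    """Per-channel energy normalisation of a (time, channels) energy array.

    s, alpha, delta and r may be scalars or per-channel arrays of shape
    (channels,); the defaults are illustrative, not taken from the source.
    """
    energy = np.asarray(energy, dtype=float)

    # First-order IIR smoother along time: M[t] = (1 - s) * M[t-1] + s * E[t]
    smoothed = np.empty_like(energy)
    smoothed[0] = energy[0]
    for t in range(1, len(energy)):
        smoothed[t] = (1 - s) * smoothed[t - 1] + s * energy[t]

    # Adaptive gain control (division by M**alpha), then root compression
    # with offset delta; subtracting delta**r maps zero energy to zero.
    return (energy / (eps + smoothed) ** alpha + delta) ** r - delta ** r
```

Adapting only this layer would then amount to updating these four parameters per channel while keeping the filterbank and the low-pass filters fixed.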
The networks learn to denoise and dereverberate the microphone signals so that the signals correlate more strongly with one another, which in turn makes the source position easier to estimate.
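For intuition on why cleaner signals help, the sketch below shows a classical, non-learned way to relate two microphone signals: generalised cross-correlation with phase transform (GCC-PHAT), which estimates the time delay between them. This is a swapped-in textbook technique for illustration, not the networks described above; the function name `gcc_phat_delay` and its arguments are hypothetical.

```python
import numpy as np

def gcc_phat_delay(x, y, fs):
    """Estimate the delay of x relative to y in seconds (positive: x lags y)."""
    n = len(x) + len(y)              # zero-pad to limit circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)

    # Cross-power spectrum with PHAT weighting: keep the phase only
    cross = X * np.conj(Y)
    cross /= np.abs(cross) + 1e-12

    # Back to the lag domain, reordered so that lags run from -n//2 to +n//2
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))

    shift = np.argmax(cc) - max_shift  # lag of the correlation peak, in samples
    return shift / fs
```

Noise and reverberation flatten or smear the correlation peak, so a stage that cleans the signals first makes the peak, and hence the delay used for localisation, easier to find.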