A Two-stage Framework for Robust Speech Emotion Recognition by Extracting Target Speaker from Human Speech Noise
This paper proposes a two-stage framework that cascades target speaker extraction and speech emotion recognition to mitigate the impact of human speech noise on emotion recognition performance.
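The cascade can be pictured as two functions applied in sequence: the first isolates the target speaker from the noisy mixture, and the second classifies emotion from the extracted speech only. The following is a minimal sketch of that control flow, not the authors' implementation. The names `TwoStagePipeline`, `toy_extractor`, and `toy_classifier` are hypothetical, the enrollment (reference) utterance passed to the extractor is an assumption about how the target speaker is specified, and both stages are toy stand-ins where a real system would use trained models.

```python
from dataclasses import dataclass
from typing import Callable, List

Waveform = List[float]  # assumption: mono audio as a plain list of samples


@dataclass
class TwoStagePipeline:
    """Hypothetical cascade: target speaker extraction, then emotion recognition."""
    # Stage 1: (mixture, enrollment) -> estimated target speech
    extract_target: Callable[[Waveform, Waveform], Waveform]
    # Stage 2: target speech -> emotion label
    recognize_emotion: Callable[[Waveform], str]

    def __call__(self, mixture: Waveform, enrollment: Waveform) -> str:
        # The emotion recognizer never sees the raw mixture,
        # only the output of the extraction stage.
        target = self.extract_target(mixture, enrollment)
        return self.recognize_emotion(target)


def toy_extractor(mixture: Waveform, enrollment: Waveform) -> Waveform:
    # Stand-in for a trained extraction model: simply attenuates the mixture.
    return [0.5 * x for x in mixture]


def toy_classifier(speech: Waveform) -> str:
    # Stand-in for a trained emotion model: thresholds mean signal energy.
    energy = sum(x * x for x in speech) / max(len(speech), 1)
    return "angry" if energy > 0.1 else "neutral"


pipeline = TwoStagePipeline(toy_extractor, toy_classifier)
label = pipeline([1.0, -1.0], [0.0])
```

Keeping the stages as separate callables means either one can be swapped or retrained independently, which is the main practical appeal of a cascaded design.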