Core Concepts
Adding auxiliary loss functions to guide attention mechanisms improves speaker diarization accuracy in the EEND-EDA model.
Summary
Abstract:
- EEND-EDA model uses attractors to handle a flexible number of speakers.
- Proposed auxiliary loss function enhances self-attention in Transformer encoders.
- Results show a decrease in Diarization Error Rate from 30.95% to 28.17%.
Introduction:
- Speaker diarization importance and applications discussed.
- Traditional clustering algorithms limitations highlighted.
End-to-end Neural Diarization:
- EEND architecture overview with Bi-LSTM and PIT training.
- Challenges faced by traditional systems in overlapping speech scenarios.
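The permutation-invariant training (PIT) mentioned above resolves the label ambiguity in diarization: the order of speakers in the reference is arbitrary, so the loss is taken over the best speaker permutation. A minimal sketch (function and variable names are illustrative, not the paper's implementation):

```python
# Sketch of a permutation-invariant training (PIT) loss for diarization,
# where the reference speaker order is arbitrary. Illustrative only.
from itertools import permutations
import numpy as np

def bce(p, y, eps=1e-7):
    """Binary cross-entropy between posteriors p and binary labels y."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def pit_loss(pred, label):
    """pred, label: (frames, speakers). Evaluate the loss under every
    permutation of the reference speaker columns and keep the minimum."""
    n_spk = label.shape[1]
    return min(bce(pred, label[:, list(perm)])
               for perm in permutations(range(n_spk)))

# Two speakers whose reference columns are swapped relative to the predictions:
pred = np.array([[0.9, 0.1], [0.9, 0.1], [0.1, 0.9]])
label = np.array([[0.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
```

Here `pit_loss(pred, label)` is low because the swapped column order matches the predictions, whereas a fixed-order BCE would heavily penalize the same (correct) output.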
Encoder-Decoder Attractor Model:
- EDA framework explained for speaker activity estimation.
- Modifications like RX-EEND for improved performance discussed.
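In the EDA framework, each attractor acts as a speaker representative: per-frame activity posteriors come from comparing frame embeddings against the attractors. A minimal sketch of that final step (the LSTM encoder-decoder that produces the attractors is omitted; shapes and names are assumptions, not the paper's code):

```python
# Sketch of EDA-style speaker activity estimation: frame embeddings are
# scored against each attractor with a dot product followed by a sigmoid.
# Illustrative shapes/names; the attractor-generating LSTM is omitted.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def speaker_posteriors(embeddings, attractors):
    """embeddings: (frames, dim); attractors: (speakers, dim).
    Returns per-frame, per-speaker activity probabilities, (frames, speakers)."""
    return sigmoid(embeddings @ attractors.T)

rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 8))   # 100 frames of 8-dim encoder output
att = rng.normal(size=(2, 8))     # 2 attractors, one per detected speaker
post = speaker_posteriors(emb, att)
```

Because each speaker's posterior is an independent sigmoid (not a softmax over speakers), overlapping speech can be represented by multiple speakers being active in the same frame.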
Auxiliary Loss Function:
- Introduction of a guiding loss function to encourage diversity in the attention weights.
Data Extraction:
- E-mail: {60947089s, 40947006s, berlin}@ntnu.edu.tw
- DER reduced from 30.95% to 28.17%.
Quotes
"Proposed auxiliary loss function aims to guide Transformer encoders at lower layers."
"EEND-EDA model shows effectiveness in reducing Diarization Error Rate."