Multilingual DistilWhisper: Efficient Distillation for Speech Models
The authors propose DistilWhisper, an approach that bridges the ASR performance gap for under-represented languages by combining lightweight, modular ASR fine-tuning (language-specific expert modules added to a smaller Whisper model) with knowledge distillation from a larger Whisper model. This dual approach boosts ASR performance on those languages while preserving the robustness inherited from Whisper's multitask and multilingual pre-training.
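To make the dual objective concrete, below is a minimal, illustrative sketch of a joint loss that combines supervised cross-entropy fine-tuning with a KL-based distillation term from the larger teacher. The function name `distil_finetune_loss`, the hyperparameters (`alpha`, `temperature`), and the tensor shapes are assumptions for illustration only; they do not reproduce the paper's exact formulation or its gating of the language-specific modules.

```python
import torch
import torch.nn.functional as F


def distil_finetune_loss(student_logits, teacher_logits, labels,
                         alpha=0.5, temperature=2.0, pad_id=-100):
    """Joint objective: supervised CE on the transcript plus KD from a larger teacher.

    student_logits / teacher_logits: (batch, seq_len, vocab) decoder outputs.
    labels: (batch, seq_len) target token ids; positions equal to pad_id are ignored.
    alpha and temperature are illustrative values, not the paper's settings.
    """
    # Flatten to (batch * seq_len, vocab) so losses are averaged per token.
    s = student_logits.flatten(0, 1)
    t = teacher_logits.flatten(0, 1)

    # Standard ASR cross-entropy against the ground-truth transcript.
    ce = F.cross_entropy(s, labels.flatten(), ignore_index=pad_id)

    # Distillation: the student matches the teacher's temperature-softened distribution.
    kd = F.kl_div(
        F.log_softmax(s / temperature, dim=-1),
        F.softmax(t / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    return alpha * ce + (1.0 - alpha) * kd


# Dummy usage: shapes only, vocabulary size chosen arbitrarily for the demo.
if __name__ == "__main__":
    B, T, V = 2, 12, 1000
    student_logits = torch.randn(B, T, V, requires_grad=True)
    teacher_logits = torch.randn(B, T, V)  # produced by the frozen larger model
    labels = torch.randint(0, V, (B, T))
    loss = distil_finetune_loss(student_logits, teacher_logits, labels)
    loss.backward()
    print(float(loss))
```

In the method itself, only the lightweight language-specific modules receive gradient updates while the shared multilingual backbone stays frozen, which is what lets the fine-tuned model retain the robustness of the original pre-training.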