Effective Internal Language Model Training and Fusion for Factorized Transducer Models Improves Automatic Speech Recognition Performance
The authors propose a novel internal language model (ILM) training and decoding strategy for factorized transducer models that combines the blank, acoustic, and ILM scores, yielding substantial improvements in automatic speech recognition performance.
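
The summary does not give the exact fusion formula, so the following is only a minimal sketch of how blank, acoustic, and ILM scores might be combined for a single decoding step. It assumes a log-linear combination with a sigmoid blank gate; the function name `fuse_scores`, the `ilm_weight` parameter, and the gating choice are all hypothetical illustrations and should not be read as the authors' formulation.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - lse for v in logits]


def fuse_scores(blank_logit, acoustic_logits, ilm_logits, ilm_weight=1.0):
    """Combine blank, acoustic, and ILM scores for one decoding step.

    ASSUMPTION: this log-linear form is illustrative, not taken from the paper.
      blank score = log sigmoid(blank_logit)
      token score = log(1 - sigmoid(blank_logit))
                    + log_softmax(acoustic_logits)[v]
                    + ilm_weight * log_softmax(ilm_logits)[v]
    Returns (blank_score, list_of_token_scores).
    """
    if len(acoustic_logits) != len(ilm_logits):
        raise ValueError("acoustic and ILM logits must cover the same vocabulary")
    blank_score = log_sigmoid(blank_logit)
    non_blank_score = log_sigmoid(-blank_logit)  # log(1 - sigmoid(x))
    acoustic_lp = log_softmax(acoustic_logits)
    ilm_lp = log_softmax(ilm_logits)
    token_scores = [
        non_blank_score + a + ilm_weight * l
        for a, l in zip(acoustic_lp, ilm_lp)
    ]
    return blank_score, token_scores


if __name__ == "__main__":
    # Toy vocabulary of three tokens; the numbers are made up for illustration.
    blank, tokens = fuse_scores(
        blank_logit=-1.0,
        acoustic_logits=[2.0, 0.5, -1.0],
        ilm_logits=[0.1, 1.2, -0.3],
        ilm_weight=0.5,
    )
    print("blank score:", blank)
    print("token scores:", tokens)
```

In this sketch the fused token scores are not normalized probabilities once `ilm_weight` is non-zero. That is common for log-linear fusion during beam search, where only the relative ranking of hypotheses matters.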