Whispy is a system designed to bring live transcription capabilities to the state-of-the-art Whisper speech recognition models, enabling real-time processing of audio streams while maintaining high accuracy.
This work evaluates the performance of different end-to-end automatic speech recognition models and audio splitting algorithms for generating real-time transcriptions, measuring their impact on both transcription quality and end-to-end delay.
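To make the trade-off between audio splitting and delay concrete, the following is a minimal sketch of one possible splitting strategy: fixed-length windows with overlap. It is not Whispy's actual algorithm, which is not described here. The function names `split_stream` and `buffering_delay_seconds`, and the parameters `window` and `hop`, are hypothetical and introduced only for illustration.

```python
from typing import Iterator, List, Sequence


def split_stream(samples: Sequence[float], window: int, hop: int) -> Iterator[List[float]]:
    """Yield windows of `window` samples, advancing by `hop` samples each time.

    Assumption: fixed-length windowing, chosen for illustration only.
    Trailing samples that do not fill a complete window are dropped,
    except when the whole input is shorter than one window, in which
    case a single partial window is yielded.
    """
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    last_start = max(len(samples) - window, 0)
    for start in range(0, last_start + 1, hop):
        yield list(samples[start:start + window])


def buffering_delay_seconds(window: int, sample_rate: int) -> float:
    """Minimum delay added by waiting for one full window of audio.

    This excludes model inference time, so the real end-to-end delay
    is at least this value.
    """
    return window / sample_rate


if __name__ == "__main__":
    audio = [float(i) for i in range(10)]  # stand-in for PCM samples
    for chunk in split_stream(audio, window=4, hop=2):
        print(chunk)
    print(buffering_delay_seconds(window=32000, sample_rate=16000))
```

Under these assumptions, a longer window gives the recognizer more context per chunk but raises the minimum buffering delay, while a smaller hop increases overlap and therefore the number of chunks to transcribe.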