Automatically extracting Interlinear Glossed Text (IGT) annotations from speech is crucial for preserving endangered languages and facilitating language documentation.