A single multi-task learning model, "UniverSLU", can perform a variety of speech classification and sequence generation tasks, often matching or outperforming state-of-the-art task-specific models. UniverSLU uses natural language instructions as prompts to improve user-friendliness and generalization.
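A minimal sketch of what instruction-as-prompt conditioning could look like; the function name and prompt template below are illustrative assumptions, not UniverSLU's actual interface.

```python
# Hypothetical sketch: conditioning one multi-task SLU model on natural
# language instructions. The template is an assumption for illustration,
# not the format used by UniverSLU.

def build_prompt(task_instruction: str, transcript: str) -> str:
    """Prepend a natural language task instruction to the utterance text."""
    return f"Instruction: {task_instruction}\nUtterance: {transcript}\nAnswer:"

# The same model weights serve different tasks by swapping the instruction.
intent_prompt = build_prompt(
    "Classify the intent of the utterance.",
    "play some jazz in the living room",
)
slot_prompt = build_prompt(
    "List the slot values mentioned in the utterance.",
    "play some jazz in the living room",
)
print(intent_prompt)
print(slot_prompt)
```

The point of the single template is that adding a new task requires only a new instruction string, not a new task-specific head.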
A pipeline that uses Large Language Models (LLMs) to machine-translate slot-annotated spoken language understanding (SLU) training data can effectively extend SLU systems to new languages, outperforming existing state-of-the-art methods.
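The delicate part of translating slot-annotated data is keeping slot spans aligned after translation. One common trick is to wrap each slot value in numbered markers before translation and parse them back afterwards; the sketch below assumes this marker scheme (the paper's actual pipeline may differ), and `fake_llm_translate` is a stub standing in for a real LLM call.

```python
import re

# Hypothetical sketch of slot-preserving translation: slot values are wrapped
# in numbered markers so they can be recovered from the translated sentence.
# fake_llm_translate is a toy EN->DE lookup standing in for an LLM.

def wrap_slots(tokens, labels):
    """Wrap each BIO slot span as '[i] value [/i]' so it survives translation."""
    out, i, slot_id = [], 0, 0
    while i < len(tokens):
        if labels[i].startswith("B-"):
            j = i + 1
            while j < len(tokens) and labels[j].startswith("I-"):
                j += 1
            out.append(f"[{slot_id}] " + " ".join(tokens[i:j]) + f" [/{slot_id}]")
            slot_id += 1
            i = j
        else:
            out.append(tokens[i])
            i += 1
    return " ".join(out)

def unwrap_slots(text):
    """Recover (slot_id, translated value) pairs from the marked-up output."""
    return re.findall(r"\[(\d+)\]\s*(.*?)\s*\[/\1\]", text)

def fake_llm_translate(text):
    # Stand-in for the LLM; a word-level lookup just for this demo.
    table = {"book": "buche", "a": "einen", "table": "Tisch", "for": "für",
             "two": "zwei", "tonight": "heute Abend"}
    return " ".join(table.get(w, w) for w in text.split())

marked = wrap_slots(["book", "a", "table", "for", "two", "tonight"],
                    ["O", "O", "O", "O", "B-people", "B-time"])
translated = fake_llm_translate(marked)
print(unwrap_slots(translated))  # [('0', 'zwei'), ('1', 'heute Abend')]
```

Note that a slot value may change length under translation ("tonight" becomes two words here), which is exactly why span markers are easier to re-align than token indices.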
This paper presents an enhanced version of the MEDIA benchmark for French Spoken Language Understanding (SLU), augmented with newly added intent annotations. It also reports baseline results for joint intent classification and slot-filling models on the enhanced dataset, using both manual transcriptions and automatic speech recognition (ASR) outputs.
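Joint intent/slot baselines are typically scored with intent accuracy plus span-level slot F1 over BIO labels. The sketch below shows these standard metrics on made-up data; the labels and intents are illustrative, not drawn from MEDIA.

```python
# Hypothetical evaluation sketch for joint intent classification and slot
# filling: intent accuracy plus span-level slot F1. Data below is made up
# for illustration, not taken from the MEDIA benchmark.

def spans(labels):
    """Extract (start, end, type) slot spans from BIO labels."""
    out, start = [], None
    for i, lab in enumerate(labels + ["O"]):
        if lab.startswith("B-") or lab == "O":
            if start is not None:
                out.append((start, i, labels[start][2:]))
                start = None
        if lab.startswith("B-"):
            start = i
    return out

def slot_f1(gold, pred):
    """Span-level F1: a slot counts only if boundaries and type both match."""
    g, p = set(spans(gold)), set(spans(pred))
    tp = len(g & p)
    prec = tp / len(p) if p else 0.0
    rec = tp / len(g) if g else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

gold = ["O", "B-time", "I-time", "O", "B-room"]
pred = ["O", "B-time", "I-time", "O", "O"]
print(round(slot_f1(gold, pred), 2))  # 0.67

gold_intents = ["book-restaurant", "cancel-booking"]
pred_intents = ["book-restaurant", "book-restaurant"]
intent_acc = sum(g == p for g, p in zip(gold_intents, pred_intents)) / len(gold_intents)
print(intent_acc)  # 0.5
```

Scoring span-level rather than token-level penalizes partial slot matches, which matters when comparing manual transcriptions against noisier ASR outputs.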