Key Concepts
The paper addresses the challenges that noisy audio poses for biomedical NER by creating a new dataset and using GPT4 to clean transcripts.
Statistics
Automatic Speech Recognition (ASR) technology is pivotal in converting spoken language into written text.
Named Entity Recognition (NER) is vital for extracting biomedical entities from noisy audio transcripts.
The BioASR-NER dataset offers clean and noisy recordings for a better understanding of the ASR-NLP gap.
GPT4 is used to clean transcripts and improve NER performance.
Models show significant improvement in NER performance with GPT4 interventions.
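As a rough illustration of how an improvement in NER performance might be measured, the sketch below computes entity-level F1 for entities extracted from a noisy transcript and from a cleaned one. The `entity_f1` helper, the exact-match scoring rule, the entity labels, and the toy entities are all assumptions made for illustration; they are not taken from the paper or from the BioASR-NER dataset.

```python
from typing import Set, Tuple

# An entity is represented as (surface text, entity type). Hypothetical schema.
Entity = Tuple[str, str]


def entity_f1(gold: Set[Entity], predicted: Set[Entity]) -> float:
    """Entity-level F1 where a prediction counts only on exact (text, type) match."""
    if not gold and not predicted:
        return 1.0  # nothing to find and nothing predicted
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy, made-up entities (not from the dataset): the noisy transcript
# garbles a drug name, so the NER model's output no longer matches the gold entity.
gold = {("metformin", "CHEMICAL"), ("type 2 diabetes", "DISEASE")}
from_noisy = {("met forman", "CHEMICAL"), ("type 2 diabetes", "DISEASE")}
from_cleaned = {("metformin", "CHEMICAL"), ("type 2 diabetes", "DISEASE")}

noisy_f1 = entity_f1(gold, from_noisy)
cleaned_f1 = entity_f1(gold, from_cleaned)
print(f"noisy F1:   {noisy_f1:.2f}")
print(f"cleaned F1: {cleaned_f1:.2f}")
```

Exact matching is the strictest choice; a partial-overlap or span-based criterion would score the garbled drug name more leniently.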
Quotes
"Automatic Speech Recognition (ASR) technology is fundamental in transcribing spoken language into text."
"This paper introduces a novel dataset, BioASR-NER, designed to bridge the ASR-NLP gap in the biomedical domain."
"Our study further delves into an error analysis, shedding light on the types of errors in transcription software."