Core Concepts
Pre-trained speaker verification models can generalize across multiple languages and detect duplicate participants in cognitive and mental health clinical trials.
Abstract
The paper proposes using pre-trained speaker verification (SV) models to enroll and verify patients in cognitive and mental health clinical trials in zero-shot settings, across multiple languages.
The key highlights are:
The authors evaluate three state-of-the-art SV models (SpeakerNet, TitaNet, ECAPA-TDNN) on speech data from patients with Alzheimer's disease, mild cognitive impairment, and schizophrenia, speaking in English, German, Danish, Spanish, and Arabic.
The results demonstrate that the tested models can effectively generalize to clinical speakers, achieving less than 2.7% Equal Error Rate (EER) for European languages and 8.26% EER for Arabic.
This represents a significant step in developing versatile and efficient SV systems for cognitive and mental health clinical trials that can be used across a wide range of languages and dialects, substantially reducing the effort required to deploy such systems.
The authors also evaluate how the speech task and the number of speakers in a trial influence SV performance, and find that the type of speech task affects model performance.
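To make the duplicate-detection idea concrete, here is a minimal sketch. It assumes each enrolled participant's recording has already been converted into a fixed-length embedding by an SV model, and that pairs are scored with cosine similarity, a common choice that this summary does not confirm as the authors' method. The function names, the toy 3-dimensional vectors, and the 0.8 threshold are all illustrative assumptions, not values from the paper.

```python
import numpy as np

def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarity between the rows of `embeddings`."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / norms
    return unit @ unit.T

def find_duplicate_pairs(embeddings, threshold):
    """Return (i, j, score) for every pair i < j scoring at or above threshold."""
    sims = cosine_similarity_matrix(np.asarray(embeddings, dtype=float))
    rows, cols = np.triu_indices(len(sims), k=1)  # upper triangle, no diagonal
    return [(int(i), int(j), float(sims[i, j]))
            for i, j in zip(rows, cols) if sims[i, j] >= threshold]

# Toy stand-ins for speaker embeddings (hypothetical values; real SV
# embeddings typically have hundreds of dimensions).
enrolled = np.array([
    [1.0, 0.0, 0.0],  # participant 0
    [0.0, 1.0, 0.0],  # participant 1
    [0.9, 0.1, 0.0],  # participant 2: points almost the same way as 0
])

# The 0.8 threshold is an arbitrary illustrative value.
pairs = find_duplicate_pairs(enrolled, threshold=0.8)
for i, j, score in pairs:
    print(f"possible duplicate: participants {i} and {j} (similarity {score:.3f})")
```

In a zero-shot setting the model itself is not retrained; only the decision threshold would need to be chosen, for example at the operating point where the two error rates are equal.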
Stats
7.78% of patients participating in large clinical trials were duplicated across different sites. [37]
The evaluated pre-trained models achieve less than 2.7% EER for European languages and 8.26% EER for Arabic in zero-shot settings.
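For readers unfamiliar with the metric, the Equal Error Rate is the operating point at which the false acceptance rate (impostor pairs accepted) equals the false rejection rate (genuine pairs rejected). The sketch below shows one simple way to estimate it from two lists of similarity scores. The function name and the toy scores are assumptions made for illustration; they are not data from the paper, and the authors' evaluation tooling may compute EER differently (for example, by interpolating along the ROC curve).

```python
import numpy as np

def equal_error_rate(genuine_scores, impostor_scores):
    """Estimate the EER by sweeping every observed score as a threshold.

    Returns (eer, threshold) at the threshold where the false acceptance
    rate and false rejection rate are closest to each other.
    """
    genuine = np.asarray(genuine_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    thresholds = np.unique(np.concatenate([genuine, impostor]))

    # FRR: fraction of same-speaker trials scored below the threshold.
    frr = np.array([(genuine < t).mean() for t in thresholds])
    # FAR: fraction of different-speaker trials scored at or above it.
    far = np.array([(impostor >= t).mean() for t in thresholds])

    idx = int(np.argmin(np.abs(far - frr)))
    return float((far[idx] + frr[idx]) / 2), float(thresholds[idx])

# Made-up similarity scores, for illustration only.
genuine = [0.9, 0.8, 0.7, 0.4]   # same-speaker trials
impostor = [0.6, 0.3, 0.2, 0.1]  # different-speaker trials

eer, threshold = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

On these toy scores one genuine trial falls below 0.6 and one impostor trial reaches it, so both error rates are 25% at that threshold. An EER below 2.7%, as reported for the European languages, means fewer than roughly 3 in 100 trials of each kind are misclassified at the balanced threshold.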
Quotes
"Due to the substantial number of clinicians, patients, and data collection environments involved in clinical trials, gathering data of superior quality poses a significant challenge."
"We propose using these speech recordings to verify the identities of enrolled patients and identify and exclude the individuals who try to enroll multiple times in the same trial."
"Our results demonstrate that tested models can effectively generalize to clinical speakers, with less than 2.7% EER for European Languages and 8.26% EER for Arabic."