Core Concepts
This study proposes a novel stochastic transformer-based deep learning approach for automated detection of post-traumatic stress disorder (PTSD) from audio recordings of clinical interviews, achieving state-of-the-art performance.
Abstract
The key highlights and insights from the content are:
Post-traumatic stress disorder (PTSD) is a mental disorder that can develop after exposure to traumatic events. Current diagnostic practice relies on self-report questionnaires, which suffer from several limitations, including limited introspective ability, rating-scale bias, memory bias, and response bias.
The authors propose a deep learning-based approach for automated PTSD detection from audio recordings of clinical interviews. Mel-Frequency Cepstral Coefficient (MFCC) features are extracted from the audio and then processed by a novel stochastic transformer model.
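MFCC extraction maps raw audio onto a compact, perceptually motivated feature space: frame the signal, take the power spectrum, pool it through a triangular mel filterbank, and decorrelate the log energies with a DCT. The study does not publish its extraction code, so the following is a minimal NumPy/SciPy sketch of the standard pipeline; the function name and all parameter defaults are illustrative, not taken from the paper.

```python
import numpy as np
from scipy.fftpack import dct

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=16000, n_fft=512, hop=160, n_mels=26, n_mfcc=13):
    """Sketch of standard MFCC extraction (parameters are illustrative)."""
    # Pre-emphasis to boost high frequencies
    sig = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # Frame the signal and apply a Hamming window
    n_frames = 1 + (len(sig) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = sig[idx] * np.hamming(n_fft)
    # Power spectrum of each frame
    power = (np.abs(np.fft.rfft(frames, n_fft)) ** 2) / n_fft
    # Triangular mel filterbank
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fbank[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    # Log mel energies, then DCT to decorrelate -> cepstral coefficients
    feats = np.log(power @ fbank.T + 1e-10)
    return dct(feats, type=2, axis=1, norm='ortho')[:, :n_mfcc]
```

The resulting matrix has one row of `n_mfcc` coefficients per frame, giving the transformer a temporal sequence to attend over. In practice a library such as librosa or torchaudio would typically be used instead of a hand-rolled implementation.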
The stochastic transformer model incorporates several stochastic components, including stochastic depth (residual blocks that are randomly skipped during training), stochastically applied deep learning layers, and the GELU activation function, which can be interpreted as the expectation of a stochastic input-gating regularizer. These elements help improve the model's robustness and performance.
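To make the two ingredients concrete, here is a minimal NumPy sketch of stochastic depth over a residual stack, together with the common tanh approximation of GELU. This is a generic illustration of the techniques, not the paper's architecture; the function names, the single shared survival probability, and the inference-time expectation scaling are assumptions of the sketch.

```python
import numpy as np

def gelu(x):
    # Tanh approximation of GELU: x * Phi(x), Phi = standard normal CDF
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi)
                                    * (x + 0.044715 * x ** 3)))

def stochastic_depth_forward(x, blocks, survival_prob=0.8,
                             training=True, rng=None):
    """Residual stack in which each block is randomly skipped in training."""
    rng = rng if rng is not None else np.random.default_rng(0)
    for block in blocks:
        if training:
            if rng.random() < survival_prob:
                x = x + block(x)  # block survives this pass
            # else: identity shortcut only; the block is dropped entirely
        else:
            # Inference: scale by the survival probability so the output
            # matches the expected training-time behavior
            x = x + survival_prob * block(x)
    return x
```

Because dropped blocks contribute nothing to the forward or backward pass, stochastic depth both regularizes the network and shortens the effective depth seen during training.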
The proposed approach is evaluated on the Extended DAIC (eDAIC) dataset, which contains audio recordings of clinical interviews. The model achieves state-of-the-art performance, with an RMSE of 2.92 and a CCC of 0.533 in predicting the PTSD severity score on the PCL-C (PTSD Checklist - Civilian Version).
The stochastic transformer outperforms both traditional machine learning methods and deep learning models without stochastic components. The authors attribute the improved performance to the transformer's ability to capture temporal information in the audio data and to the regularizing effect of the stochastic components.
The authors suggest that the proposed approach can help clinicians by providing a more accurate and automated tool for PTSD detection, overcoming the limitations of self-report questionnaires.
Stats
The study reports the following key figures:
RMSE of 2.92 on the eDAIC dataset
CCC of 0.533 on the eDAIC dataset
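For reference, the two reported metrics can be computed as follows. RMSE measures average prediction error in the units of the PCL-C score, while the concordance correlation coefficient (CCC) measures agreement between predictions and ground truth, penalizing both scale and location shifts. This is a generic sketch (population variances, illustrative function names), not code from the study.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def ccc(y_true, y_pred):
    """Lin's concordance correlation coefficient:
    2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)."""
    mx, my = y_true.mean(), y_pred.mean()
    vx, vy = y_true.var(), y_pred.var()
    cov = np.mean((y_true - mx) * (y_pred - my))
    return float(2.0 * cov / (vx + vy + (mx - my) ** 2))
```

A perfect predictor gives RMSE 0 and CCC 1; unlike Pearson correlation, CCC drops below 1 whenever predictions are systematically shifted or rescaled relative to the true scores.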