Core Concepts
A simple yet effective masked Transformer method significantly outperforms recent state-of-the-art algorithms in electrocardiogram classification tasks.
Abstract
The paper presents a novel masked Transformer method, called MTECG, for efficient electrocardiogram (ECG) classification. The key highlights are:
Adaptation of the image-based masked autoencoder approach to self-supervised representation learning from ECG time series data. The ECG signal is split into a sequence of non-overlapping segments, and learnable positional embeddings are used to preserve the sequential information.
Construction of a comprehensive ECG dataset, Fuwai, comprising 220,251 recordings with 28 common ECG diagnoses annotated by medical experts, which significantly surpasses the sample size of publicly available ECG datasets.
Exploration of a strong pre-training and fine-tuning recipe, including the masking ratio, training schedule length, fluctuated reconstruction target, layer-wise learning rate decay, and DropPath rate, to effectively train the model.
Extensive evaluation on both private and public ECG datasets, demonstrating that the proposed MTECG method outperforms recent state-of-the-art algorithms by a significant margin, increasing the macro F1 scores by 3.4%-27.5% on the Fuwai dataset, 9.9%-32.0% on the PTB-XL dataset, and 9.4%-39.1% on a multicenter dataset.
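The segmentation-plus-masking idea in the highlights above can be illustrated with a minimal NumPy sketch. The shapes (12 leads, 5,000 samples, 50-sample segments) and the 0.75 masking ratio are illustrative assumptions, not the paper's exact tokenization or recipe.

```python
import numpy as np

def segment_ecg(signal, seg_len):
    """Split a (leads, T) ECG array into non-overlapping segments.

    Returns shape (num_segments, leads * seg_len); this flattened
    segment layout is an assumption for illustration.
    """
    leads, T = signal.shape
    num_seg = T // seg_len                       # drop any trailing remainder
    x = signal[:, :num_seg * seg_len]
    # (leads, num_seg, seg_len) -> (num_seg, leads, seg_len) -> flatten
    segments = x.reshape(leads, num_seg, seg_len).transpose(1, 0, 2)
    return segments.reshape(num_seg, leads * seg_len)

def random_mask(num_tokens, mask_ratio, rng):
    """Pick which segment indices to hide, masked-autoencoder style."""
    num_masked = int(num_tokens * mask_ratio)
    perm = rng.permutation(num_tokens)
    return np.sort(perm[:num_masked]), np.sort(perm[num_masked:])

# Toy example: a 12-lead, 5000-sample recording (assumed dimensions).
rng = np.random.default_rng(0)
ecg = rng.standard_normal((12, 5000))
tokens = segment_ecg(ecg, seg_len=50)            # -> (100, 600)
masked, visible = random_mask(len(tokens), mask_ratio=0.75, rng=rng)
```

In the full model, the visible segments (plus learnable positional embeddings to preserve sequential order) would be fed to the Transformer encoder, and the masked ones reconstructed during pre-training.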
The findings suggest that the proposed masked Transformer approach can effectively leverage the unique temporal and spatial structure of ECG data, and that the resulting lightweight model offers deployment-friendly properties attractive for clinical applications.
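One ingredient of the fine-tuning recipe, layer-wise learning rate decay, can be sketched in a few lines. The scheme below (scaling earlier layers down by successive powers of a decay factor, as popularized by BEiT/MAE fine-tuning) and the specific values are assumptions, not the paper's reported hyperparameters.

```python
def layerwise_lrs(base_lr, decay, num_layers):
    """Per-layer learning rates for fine-tuning a pre-trained encoder.

    Index 0 is the embedding layer, index num_layers is the head;
    deeper layers stay close to base_lr, earlier layers are scaled
    down by successive powers of `decay`.
    """
    return [base_lr * decay ** (num_layers - i) for i in range(num_layers + 1)]

# Example: decay 0.65 over a 12-block encoder (hypothetical values).
lrs = layerwise_lrs(base_lr=1e-3, decay=0.65, num_layers=12)
```

The intuition is that early layers hold generic features learned during masked pre-training and should change slowly, while the head adapts quickly to the downstream diagnosis labels.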
Stats
The Fuwai dataset consists of 220,251 ECG recordings from 173,951 adult patients, encompassing 28 different diagnoses annotated by medical experts.
The PTB-XL dataset includes 21,836 ECG recordings with 22 labels.
The PCinC dataset contains 79,574 ECG recordings with 25 labels, collected from 8 public datasets.
Quotes
"The experiments demonstrate that the proposed method increases the macro F1 scores by 3.4%-27.5% on the Fuwai dataset, 9.9%-32.0% on the PTB-XL dataset, and 9.4%-39.1% on a multicenter dataset, compared to the alternative methods."
"We hope that this study could direct future research on the application of Transformer to more ECG tasks."