The article introduces a method for clustering timed sequences, that is, sequences of timestamped events. The authors adapt two existing techniques, the drop-DTW metric and the DBA (DTW Barycenter Averaging) algorithm for averaging time series, to address the challenges such data pose: timed sequences lack a natural vector representation, and any comparison must account for both the order of events and their timing.
The drop-DTW metric is extended to measure the distance between timed sequences: it allows events to be removed from the alignment and incorporates temporal constraints. The DBA algorithm is then adapted to compute an average timed sequence under this metric, which makes classical clustering algorithms such as hierarchical clustering and K-means applicable.
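To make the idea of a drop-aware alignment concrete, here is a minimal sketch in Python. It is not the authors' formulation: the single-table recurrence, the function name `timed_drop_dtw`, and the parameters `drop_cost` and `time_weight` are all assumptions chosen for illustration. Events are represented as `(label, timestamp)` pairs; two events can be matched only if their labels agree, at a cost proportional to their time gap, and any event can be dropped at a fixed cost.

```python
from math import inf


def timed_drop_dtw(x, y, drop_cost=1.0, time_weight=0.1):
    """Illustrative drop-aware DTW-style cost between two timed sequences.

    x, y: lists of (label, timestamp) pairs, ordered by timestamp.
    drop_cost and time_weight are hypothetical parameters, not the
    paper's; the recurrence is a simplified single-table variant.
    """

    def match_cost(a, b):
        # Events with different labels cannot be matched at all.
        if a[0] != b[0]:
            return inf
        # Matching identical labels is penalised by the time gap.
        return time_weight * abs(a[1] - b[1])

    n, m = len(x), len(y)
    # D[i][j]: best cost of aligning the first i events of x
    # with the first j events of y.
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            best = inf
            if i > 0:
                best = min(best, D[i - 1][j] + drop_cost)  # drop x[i-1]
            if j > 0:
                best = min(best, D[i][j - 1] + drop_cost)  # drop y[j-1]
            if i > 0 and j > 0:
                # DTW-style step: match x[i-1] with y[j-1], allowing
                # either event to have been matched before as well.
                c = match_cost(x[i - 1], y[j - 1])
                prev = min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
                best = min(best, c + prev)
            D[i][j] = best
    return D[n][m]


# Two toy pathways: the second lacks event "B" and starts one time unit later.
x = [("A", 0), ("B", 5), ("C", 9)]
y = [("A", 1), ("C", 9)]
d = timed_drop_dtw(x, y)
```

In this toy example the cheapest alignment matches the two `A` events and the two `C` events and drops `B`. A pairwise matrix of such costs is what a hierarchical clustering would consume, whereas K-means additionally needs an averaging step, which is the role the adapted DBA plays in the article.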
The proposed methods are evaluated on synthetic data and applied to a real-world use case: the analysis of care pathways extracted from the electronic health records of patients who underwent pulmonary resection surgery. The results show that drop-DTW-based clustering identifies more specific and clinically meaningful clusters of care pathways than a traditional sequence analysis approach.
Source: Thomas Guyet... at arxiv.org, 04-25-2024
https://arxiv.org/pdf/2404.15379.pdf