Core Concepts
The authors present algorithms for constructing ε-coresets for k-median clustering under dynamic time warping (DTW), using sensitivity sampling and approximations of the DTW distance. The resulting approximation algorithms achieve approximation factors comparable to those of state-of-the-art methods.
Abstract
The paper introduces algorithms that construct ε-coresets for k-median clustering under dynamic time warping (DTW), combining sensitivity sampling with approximations of the DTW distance. A coreset condenses a massive input set into a small, problem-specific summary, so that clustering can be run on the summary instead of the full data. DTW is a non-metric distance measure widely used in data mining, so the authors adapt existing coreset frameworks by working with approximations of DTW, which reduces the complexity of the resulting clustering algorithms. The paper constructs coresets for the (k, l)-median problem under DTW and derives the sensitivity bounds and approximation factors this construction needs. The analysis uses the VC dimension to approximate range spaces defined by balls under the p-DTW distance. Together, these results yield approximation algorithms for clustering time series under DTW.
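The sampling step described above can be illustrated with a generic sensitivity-sampling sketch. This is not the paper's construction: the sensitivity upper bound used here (a point's share of the total cost plus the inverse of its cluster size) is a common textbook choice for k-median, and the function name sensitivity_sample, the rough solution `centers`, and the stand-in distance `dist` are all assumptions made for illustration. In the paper's setting, `dist` would be an approximation of DTW.

```python
import random

def sensitivity_sample(points, centers, dist, sample_size, seed=0):
    """Return a weighted sample [(point, weight), ...] drawn from `points`.

    Hypothetical sketch: `centers` is any rough k-median solution and
    `dist` is any distance function (for example an approximate DTW).
    """
    rng = random.Random(seed)
    # Assign every point to its nearest center and record the cost.
    assign, costs = [], []
    for x in points:
        d = [dist(x, c) for c in centers]
        j = min(range(len(centers)), key=d.__getitem__)
        assign.append(j)
        costs.append(d[j])
    total_cost = sum(costs)
    cluster_sizes = [0] * len(centers)
    for j in assign:
        cluster_sizes[j] += 1
    # Assumed sensitivity upper bound: cost share + 1 / (cluster size).
    scores = [
        (c / total_cost if total_cost > 0 else 0.0) + 1.0 / cluster_sizes[j]
        for c, j in zip(costs, assign)
    ]
    total_score = sum(scores)
    probs = [s / total_score for s in scores]
    # Sample with replacement proportionally to the scores.
    idx = rng.choices(range(len(points)), weights=probs, k=sample_size)
    # Weight 1 / (sample_size * p_i) keeps the cost estimate unbiased.
    return [(points[i], 1.0 / (sample_size * probs[i])) for i in idx]

# Usage with 1-D points and |a - b| as a stand-in distance.
data = [0.0, 0.1, 0.2, 5.0, 5.1, 9.0]
coreset = sensitivity_sample(data, centers=[0.1, 5.0],
                             dist=lambda a, b: abs(a - b), sample_size=4)
```

Points far from every center, or in small clusters, receive higher sampling probability and correspondingly smaller weights, which is what lets a small sample preserve the clustering cost.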
Stats
We achieve our results by investigating approximations of DTW that provide a trade-off between accuracy and amenability to known techniques.
The resulting approximations are the first with polynomial running time and achieve an approximation factor very similar to that of state-of-the-art techniques.
Our main ingredient is a new insight into the notion of relaxed triangle inequalities for p-DTW.
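To see why a relaxed triangle inequality is needed at all, the following sketch computes p-DTW with the standard quadratic-time dynamic program and exhibits three short sequences that violate the ordinary triangle inequality. It assumes the usual definition of p-DTW (the minimum over warping paths of the p-th root of the summed p-th powers of pointwise distances) and restricts to real-valued sequences for simplicity. The function name p_dtw is illustrative, and the relaxed inequality proved in the paper is not reproduced here.

```python
import math

def p_dtw(s, t, p=1.0):
    """p-DTW between two real-valued sequences, O(len(s) * len(t)) time."""
    m, n = len(s), len(t)
    # cost[i][j]: minimum sum of |s_a - t_b|^p over warping paths
    # that match the first i entries of s to the first j entries of t.
    cost = [[math.inf] * (n + 1) for _ in range(m + 1)]
    cost[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            step = abs(s[i - 1] - t[j - 1]) ** p
            cost[i][j] = step + min(cost[i - 1][j],      # advance in s only
                                    cost[i][j - 1],      # advance in t only
                                    cost[i - 1][j - 1])  # advance in both
    return cost[m][n] ** (1.0 / p)

a = [0.0]
b = [0.0, 1.0]
c = [1.0, 1.0, 1.0]

d_ab = p_dtw(a, b)  # 1.0
d_bc = p_dtw(b, c)  # 1.0
d_ac = p_dtw(a, c)  # 3.0
print(d_ac > d_ab + d_bc)  # True: the triangle inequality fails
```

Here the single point of a must be matched to all three entries of c, so the detour through b is cheaper than the direct distance. A relaxed triangle inequality bounds how large this gap can become.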