Core Concepts
We devise coresets for kernel k-Means and the more general kernel (k,z)-Clustering problems, which significantly improve upon previous results in terms of coreset size and construction time. Our coresets have size poly(kϵ^-1) and can be constructed in near-linear time, enabling efficient algorithms for kernel clustering.
Abstract
The paper addresses the computational challenges of kernel k-Means, which has superior clustering capability compared to classical k-Means but introduces significant computational overhead. To tackle this, the authors adapt the notion of coresets to kernel clustering.
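For reference, the objective and the coreset guarantee can be written out as follows. This is the standard formulation of kernel (k,z)-Clustering, not a quotation from the paper. The symbols are assumptions introduced here and the paper's notation may differ: φ is the feature map induced by the kernel K, C is a set of k centers in the feature space, and S is a weighted subset of X with weights w. Setting z = 2 gives kernel k-Means.

```latex
\[
\mathrm{cost}_z(X, C) \;=\; \sum_{x \in X} \min_{c \in C} \lVert \varphi(x) - c \rVert^{z},
\qquad
\lVert \varphi(x) - \varphi(y) \rVert^{2} \;=\; K(x,x) - 2K(x,y) + K(y,y).
\]
\[
\Bigl|\, \sum_{s \in S} w(s) \min_{c \in C} \lVert \varphi(s) - c \rVert^{z} \;-\; \mathrm{cost}_z(X, C) \Bigr|
\;\le\; \epsilon \cdot \mathrm{cost}_z(X, C)
\quad \text{for every } C \text{ with } |C| = k.
\]
```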
Key highlights:
- The authors devise a coreset for kernel (k,z)-Clustering that works for a general kernel function and has size poly(kϵ^-1), vastly improving upon previous results.
- The coreset can be constructed in near-linear time, Õ(nk), using a black-box application of recent coreset constructions for Euclidean spaces.
- The authors show that their coreset implies new efficient algorithms for kernel k-Means, including a (1+ϵ)-approximation in time near-linear in n, and a streaming algorithm using space and update time poly(kϵ^-1 log n).
- Experimental results validate the efficiency and accuracy of the coresets, and demonstrate significant speedups in applications like kernel k-Means++ and spectral clustering.
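To make the role of a coreset concrete, here is a minimal Python sketch of how the kernel k-Means cost of a weighted point set can be evaluated using kernel values only. It is not the authors' coreset construction. The function names `rbf_kernel` and `kernel_kmeans_cost` and the choice of an RBF kernel are assumptions made for illustration. Each cluster's center is taken to be its weighted mean in feature space, so the cost of a cluster with weights w and total weight W is Σ_i w_i K_ii − (1/W) Σ_{i,j} w_i w_j K_ij.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq = (X ** 2).sum(axis=1)[:, None] + (Y ** 2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kernel_kmeans_cost(K, labels, weights, k):
    """Weighted kernel k-Means cost of a partition.

    K       : (n, n) kernel matrix of the points
    labels  : (n,) cluster index in [0, k) for each point
    weights : (n,) nonnegative point weights (all ones for an unweighted set)
    Each cluster's center is its weighted mean in feature space.
    """
    cost = 0.0
    for c in range(k):
        idx = np.flatnonzero(labels == c)
        if idx.size == 0:
            continue  # an empty cluster contributes nothing
        w = weights[idx]
        Kc = K[np.ix_(idx, idx)]
        # sum_i w_i K_ii - (1/W) sum_{i,j} w_i w_j K_ij
        cost += w @ np.diag(Kc) - (w @ Kc @ w) / w.sum()
    return float(cost)

# Usage: four 1-D points in two well-separated clusters.
X = np.array([[0.0], [2.0], [10.0], [12.0]])
labels = np.array([0, 0, 1, 1])
weights = np.ones(len(X))
cost_rbf = kernel_kmeans_cost(rbf_kernel(X, X, gamma=0.1), labels, weights, k=2)
```

Because the function accepts weights, the same routine can evaluate the cost on a weighted subset such as a coreset. Its running time depends on the number of points passed in rather than on n, which is where the speedup comes from.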
Stats
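This summary gives no specific numerical results. The key quantitative claims are asymptotic guarantees: coreset size poly(kϵ^-1), construction time Õ(nk), and a streaming algorithm with space and update time poly(kϵ^-1 log n). The experiments are described only qualitatively, as validating the coresets' accuracy and showing speedups.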