Core Concepts
The authors introduce a method that integrates both kernel correlation and kernel dissimilarity into clustering, emphasizing the coherence between the two measures to improve clustering accuracy.
Abstract
The content discusses the importance of combining kernel correlation and dissimilarity in Multiple Kernel k-Means (MKKM) clustering. It highlights the limitations of relying on either measure alone and proposes a new approach that integrates both to improve clustering accuracy. The study evaluates the proposed method on several benchmark datasets, demonstrating its advantage over existing techniques.
The introduction emphasizes the significance of clustering in machine learning and data mining, focusing on k-means clustering as a widely adopted algorithm. Various extensions to k-means are discussed, highlighting challenges with linearly non-separable data.
Deep clustering strategies are explored as effective solutions for capturing nonlinear structures within unsupervised data. The limitations of deep clustering, such as interpretability issues and high computational complexity, are acknowledged.
Kernel k-means (KKM) clustering is introduced as an alternative strategy known for capturing intricate structures by mapping data points to high-dimensional feature spaces. The construction of kernel matrices from inner products is explained.
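To make the kernel-matrix construction concrete, here is a minimal sketch in NumPy using the RBF (Gaussian) kernel as one common choice; the specific kernel and the `gamma` parameter are illustrative assumptions, not taken from the paper. Each entry K[i, j] is the inner product of x_i and x_j in an implicit high-dimensional feature space.

```python
import numpy as np

def rbf_kernel_matrix(X, gamma=1.0):
    """K[i, j] = exp(-gamma * ||x_i - x_j||^2): pairwise inner products
    in the implicit feature space induced by the RBF kernel."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))  # clip tiny negatives

X = np.array([[0.0, 0.0], [0.0, 0.1], [5.0, 5.0]])
K = rbf_kernel_matrix(X, gamma=0.5)
# K is symmetric PSD; nearby points (rows 0 and 1) get entries close to 1,
# while distant points (row 2) get entries close to 0
```

Kernel k-means then runs the k-means objective on K directly, so clusters that are not linearly separable in the input space can become separable in the feature space.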
Multiple Kernel Clustering (MKC) methods are discussed as innovative approaches that generate multiple kernels to extract comprehensive information from diverse datasets. Challenges faced by KKM-based methods due to diverse real-world datasets are addressed.
Various MKC techniques are presented to enhance clustering algorithm performance through kernel fusion methods like MKKM and multiple kernel spectral clustering.
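Kernel fusion in MKKM is typically a weighted sum of the base kernel matrices. The sketch below assumes the standard MKKM convention of squared weights that sum to one; the two base kernels (linear and RBF) are illustrative choices, not the paper's kernel set.

```python
import numpy as np

def combine_kernels(kernels, weights):
    """Fused kernel K_w = sum_p w_p^2 * K_p, with sum(weights) == 1
    (the squared-weight form used in the standard MKKM objective)."""
    return sum((wp ** 2) * Kp for wp, Kp in zip(weights, kernels))

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 3))
K_lin = X @ X.T                          # linear kernel
sq = np.sum(X ** 2, axis=1)
K_rbf = np.exp(-0.5 * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))  # RBF kernel
Kw = combine_kernels([K_lin, K_rbf], [0.5, 0.5])
# Kw is symmetric and PSD whenever every base kernel is
```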
The paper proposes a novel approach that integrates both kernel correlation and dissimilarity into the MKKM model to improve clustering accuracy. The optimization process involving H and Y is detailed along with convergence considerations.
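As a rough illustration of the alternating scheme, the sketch below implements a generic two-block MKKM-style loop: fix the weights and take H from the top-k eigenvectors of the fused kernel, then fix H and update the weights in closed form. This is an assumed baseline scheme; the paper's actual updates (including its correlation and dissimilarity terms, and its Y variable) are not reproduced here.

```python
import numpy as np

def mkkm_sketch(kernels, k, n_iter=20):
    """Generic alternating optimization for MKKM-style objectives
    (baseline sketch, not the paper's exact method)."""
    m, n = len(kernels), kernels[0].shape[0]
    w = np.full(m, 1.0 / m)                       # uniform initial weights
    for _ in range(n_iter):
        # H-step: top-k eigenvectors of the fused kernel
        Kw = sum((wp ** 2) * Kp for wp, Kp in zip(w, kernels))
        _, vecs = np.linalg.eigh(Kw)              # eigenvalues ascending
        H = vecs[:, -k:]
        # weight-step: closed-form w_p proportional to 1 / tr(K_p (I - H H^T))
        P = np.eye(n) - H @ H.T
        costs = np.maximum([np.trace(Kp @ P) for Kp in kernels], 1e-12)
        w = (1.0 / costs) / np.sum(1.0 / costs)
    return H, w

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (5, 2)), rng.normal(3, 0.1, (5, 2))])
K1 = X @ X.T
sq = np.sum(X ** 2, axis=1)
K2 = np.exp(-(sq[:, None] + sq[None, :] - 2.0 * X @ X.T))
H, w = mkkm_sketch([K1, K2], k=2)
```

Each block update does not increase the objective, which is why such alternating schemes converge to a local optimum.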
Experimental results on 13 benchmark datasets demonstrate the effectiveness of the proposed method compared to existing algorithms in terms of accuracy, normalized mutual information, purity, and adjusted Rand index metrics.
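Of the four reported metrics, purity has the simplest definition and is easy to compute by hand: each predicted cluster is credited with its majority true label. A minimal implementation (NMI and ARI have standard implementations in scikit-learn, for example):

```python
import numpy as np

def purity(y_true, y_pred):
    """Fraction of points matching the majority true label of their cluster."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    total = 0
    for c in np.unique(y_pred):
        _, counts = np.unique(y_true[y_pred == c], return_counts=True)
        total += counts.max()                 # majority vote in cluster c
    return total / len(y_true)

print(purity([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0 — cluster IDs are permutation-free
```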
1: Introduction — Clustering is common practice in machine learning and data mining.
2: Extensions such as deep k-means address linearly non-separable data.
3: Kernel k-means maps data points to high-dimensional feature spaces to improve separability.
4: MKC frees clustering from a single fixed kernel, extracting richer information.
5: Various MKC techniques have been developed for improved performance.
6: The proposed method integrates kernel correlation and dissimilarity.
7: Optimization alternates between H and Y, converging to a local optimum.
8: Experimental results show superior performance over existing methods.