Core Concepts
UMAP's fundamental assumptions and techniques have a natural interpretation via Information Geometry.
Abstract
In this comprehensive analysis, the connection between UMAP and Information Geometry is explored. UMAP, initially rooted in Category Theory, is revealed to have a geometric interpretation. The algorithm aims to embed high-dimensional data into a lower-dimensional space while preserving proximity. Key steps include conformal rescaling, defining edge probabilities based on distance metrics, symmetrization of weights, and cross-entropy minimization. The implementation may differ from theoretical claims due to sampling strategies. Uniform distribution assumptions on Riemannian manifolds are crucial for accurate embeddings. Different probability kernels impact clustering results across datasets like Iris, MNIST, and Fashion MNIST. The equivalence of cross-entropy and KL-divergence in learning dynamics is highlighted. Future research directions involve exploring Vietoris-Rips complexes for capturing hidden structures in data.
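To make the summary concrete, here is a minimal usage sketch. It assumes the umap-learn package and the scikit-learn copy of the Iris dataset mentioned above; the parameter values are illustrative, not the settings used in the analysis.

```python
# Minimal sketch (assumes umap-learn and scikit-learn are installed).
import umap
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

# n_neighbors bounds the local neighborhood used for the conformal rescaling;
# min_dist shapes the low-dimensional kernel. Both values here are illustrative.
reducer = umap.UMAP(n_neighbors=15, min_dist=0.1, metric="euclidean", random_state=42)
embedding = reducer.fit_transform(X)   # (150, 2) low-dimensional coordinates
```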
Stats
p_{i|j} = exp(-(d(X_i, X_j) - ρ_i) / σ_i)
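A small NumPy sketch of this high-dimensional edge probability follows. The max(0, ·) clamp and the per-point ρ_i, σ_i vectors are assumptions of the sketch rather than details stated in the formula itself.

```python
import numpy as np

def high_dim_probabilities(dist, rho, sigma):
    """p_{i|j} = exp(-(d(X_i, X_j) - rho_i) / sigma_i).

    dist  : (n, n) pairwise distance matrix in the ambient space
    rho   : (n,) distance from each X_i to its nearest neighbor
    sigma : (n,) per-point bandwidth from the conformal rescaling
    """
    # Clamping keeps the exponent non-positive, so each probability lies in (0, 1];
    # the clamp is an implementation convenience assumed here.
    shifted = np.maximum(dist - rho[:, None], 0.0)
    return np.exp(-shifted / sigma[:, None])
```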
w_l(e) = (1 + a ‖y_i - y_j‖_2^{2b})^{-1}
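The low-dimensional kernel is a simple scalar function of the embedding distance; a sketch is below. The values of a and b are placeholders (umap-learn fits them from min_dist), not values taken from the text.

```python
import numpy as np

def low_dim_weight(y_i, y_j, a=1.0, b=1.0):
    """w_l(e) = (1 + a * ||y_i - y_j||_2^(2b))^(-1).

    With a = b = 1 this reduces to the Cauchy (Student-t-like) kernel 1 / (1 + d^2).
    """
    sq_dist = np.sum((np.asarray(y_i) - np.asarray(y_j)) ** 2)
    return 1.0 / (1.0 + a * sq_dist ** b)
```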
H(X, Y) = -Σ_e [ w_h(e) log w_l(e) + (1 - w_h(e)) log(1 - w_l(e)) ]
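Reading the sum as running over graph edges, the loss can be written directly; the eps clipping below is an assumption added to avoid log(0).

```python
import numpy as np

def fuzzy_cross_entropy(w_h, w_l, eps=1e-12):
    """H = -sum_e [ w_h(e) log w_l(e) + (1 - w_h(e)) log(1 - w_l(e)) ].

    w_h, w_l : arrays of high- and low-dimensional edge weights in [0, 1].
    """
    w_l = np.clip(w_l, eps, 1.0 - eps)   # guard against log(0)
    return -np.sum(w_h * np.log(w_l) + (1.0 - w_h) * np.log(1.0 - w_l))
```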
Quotes
"In essence, by creating a custom distance in the neighborhood of each Xi we can ensure the validity of the assumption of uniform distribution on the manifold."
"Symmetrisation is necessary since UMAP needs to adjust the rescaled metrics on Bi’s: the degree of belief of the edge i ∼ j may not be equal to the degree of belief of j ∼ i."
"The Kullback–Leibner divergence and the cross–entropy loss functions induce the same learning dynamics for lower–dimensional similarities."