Efficient clustering of research publications requires normalization of direct citation relations. This study evaluates six normalization approaches: unnormalized, fractional, geometric mean, geometric mean-limitN, directional-fractional, and directional-geometric. The fractional approach, although commonly used, leads to inaccurate assignments because publications with few relations receive high normalized relatedness values. The geometric approach performs better at reducing such inaccurate assignments. Four datasets were clustered using the Leiden algorithm with different resolution parameters. Evaluation measures included the Adjusted Rand Index (ARI), silhouette width, and a new measure called probably inaccurate assignments (PIA). The study highlights the importance of proper normalization for clustering quality.
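The effect described for publications with few relations can be illustrated with a small sketch. The summary does not give the formulas, so the two definitions below are assumptions chosen for illustration: fractional relatedness divides a relation by the number of relations of the source publication, and geometric relatedness divides it by the geometric mean of both publications' relation counts. The toy network and the names `relations`, `fractional`, and `geometric` are hypothetical.

```python
from math import sqrt

# Hypothetical toy network: publication -> set of directly related publications.
# "A" has a single relation, while "H" is a hub with four.
relations = {
    "A": {"H"},
    "H": {"A", "B", "C", "D"},
    "B": {"H", "C"},
    "C": {"H", "B"},
    "D": {"H"},
}


def fractional(i, j):
    """Assumed definition: relation weight divided by i's number of relations."""
    if j not in relations[i]:
        return 0.0
    return 1.0 / len(relations[i])


def geometric(i, j):
    """Assumed definition: relation weight divided by the geometric mean
    of the relation counts of i and j."""
    if j not in relations[i]:
        return 0.0
    return 1.0 / sqrt(len(relations[i]) * len(relations[j]))


# "A" has one relation, so fractional normalization gives it the maximum value.
print(fractional("A", "H"))  # 1.0
# The geometric variant also accounts for H's four relations.
print(geometric("A", "H"))   # 0.5
# For two publications with equal relation counts, both variants agree.
print(fractional("B", "C"), geometric("B", "C"))  # 0.5 0.5
```

Under these assumed definitions, the single relation of "A" is weighted as strongly as possible by the fractional variant, which is the kind of high relatedness for sparsely connected publications that the study associates with inaccurate assignments.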
Source: arxiv.org