Core Concepts
OpenAlex can serve as a reliable alternative to traditional bibliographic databases like Scopus for certain types of bibliometric analyses, particularly at the country level, but additional research is needed to fully understand and address its limitations in metadata accuracy and completeness.
Abstract
This study, conducted in collaboration with the OpenAlex team, compares the coverage and suitability of OpenAlex and Scopus for bibliometric analyses. The key findings are:
Coverage:
OpenAlex contains 168.2M works published between 2000 and 2022, with the majority (82%) classified as articles.
A substantial share of works in OpenAlex (67%) have no references, and 66% have received no citations.
OpenAlex indexes 83% of the unique journals in Scopus, with 94% of the "active" Scopus journals matched.
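A journal-coverage figure like the one above can be thought of as a set-matching exercise. The following is a minimal sketch only: it assumes journals are matched on normalized ISSNs, which this summary does not confirm as the study's method, and the function names and toy data are hypothetical.

```python
def normalize_issn(issn: str) -> str:
    """Uppercase and drop hyphens/whitespace so '1234-567x' equals '1234567X'."""
    return issn.replace("-", "").strip().upper()


def journal_coverage(reference_journals, candidate_issns):
    """Share of reference journals with at least one ISSN found in candidate_issns.

    reference_journals: dict mapping a journal id to an iterable of its ISSNs.
    candidate_issns: iterable of ISSNs known to the other database.
    """
    known = {normalize_issn(i) for i in candidate_issns}
    if not reference_journals:
        return 0.0
    matched = sum(
        1
        for issns in reference_journals.values()
        if any(normalize_issn(i) in known for i in issns)
    )
    return matched / len(reference_journals)


# Hypothetical toy data, not taken from the study.
scopus_journals = {
    "J1": ["1111-1111", "2222-2222"],  # print and electronic ISSN
    "J2": ["3333-333X"],
    "J3": ["4444-4444"],
    "J4": ["5555-5555"],
}
openalex_issns = ["2222-2222", "3333-333x", "5555-5555"]

coverage = journal_coverage(scopus_journals, openalex_issns)
print(f"{coverage:.0%} of reference journals matched")
```

Matching on any one of a journal's ISSNs, rather than all of them, keeps a journal from being counted as missing when only its print or only its electronic ISSN is recorded on one side.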
Comparable Analyses:
The proportion of works by Open Access status is largely comparable between OpenAlex and Scopus, except for the Hybrid category, where OpenAlex identifies 81% more works than Scopus.
The number of works classified as "articles" is very similar between the two databases, but differences are observed for other document types.
Country-level analyses show a high degree of correlation (Spearman ρ > 0.95) between OpenAlex and Scopus.
Citation counts per country are lower in OpenAlex than in Scopus, likely due to the high proportion of works without indexed references.
Language-level analyses show the lowest correlations, potentially due to differences in language detection approaches.
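The country-level comparison relies on Spearman's rank correlation, which compares how two databases order the same countries rather than the raw counts themselves. Below is a small, dependency-free sketch of that calculation. The country codes and counts are invented for illustration, and the helper names are hypothetical rather than the study's code.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical works-per-country counts, not taken from the study.
openalex_counts = {"A": 1200, "B": 950, "C": 400, "D": 90, "E": 30}
scopus_counts = {"A": 1000, "B": 800, "C": 420, "D": 20, "E": 25}

countries = sorted(openalex_counts.keys() & scopus_counts.keys())
rho = spearman(
    [openalex_counts[c] for c in countries],
    [scopus_counts[c] for c in countries],
)
print(f"Spearman rho = {rho:.2f}")
```

Because only ranks matter, a database can report systematically lower counts for every country and still correlate highly, which is consistent with the finding that citation counts are lower in OpenAlex while country-level correlations remain strong.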
Limitations and Future Work:
Additional research is needed to fully understand and address issues related to metadata accuracy and completeness, particularly for fields such as author affiliations, references, citations, and language detection.
Expanding coverage and improving metadata quality for non-article document types would further enhance the usefulness of OpenAlex.
The community is encouraged to systematically assess the recall and precision of key metadata fields in OpenAlex, as well as explore the characteristics of the scholarship found only in OpenAlex.
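As an illustration of what assessing the precision and recall of a metadata field could look like, the sketch below scores one field (language) against a manually verified sample. This is an assumption-laden example: the record ids, labels, and the `precision_recall` helper are hypothetical, and the study summarized here does not prescribe this procedure.

```python
def precision_recall(predicted, gold, label):
    """Precision and recall for one label over records present in both mappings.

    predicted: dict of record id -> value stored in the database.
    gold: dict of record id -> manually verified value.
    """
    shared = predicted.keys() & gold.keys()
    tp = sum(1 for k in shared if predicted[k] == label and gold[k] == label)
    fp = sum(1 for k in shared if predicted[k] == label and gold[k] != label)
    fn = sum(1 for k in shared if predicted[k] != label and gold[k] == label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical language labels for five works, not taken from the study.
database_language = {"W1": "en", "W2": "en", "W3": "es", "W4": "en", "W5": "pt"}
verified_language = {"W1": "en", "W2": "es", "W3": "es", "W4": "en", "W5": "en"}

for lang in ("en", "es"):
    p, r = precision_recall(database_language, verified_language, lang)
    print(f"{lang}: precision={p:.2f} recall={r:.2f}")
```

Reporting the two measures per label separates two failure modes: low precision means the database assigns a label too freely, while low recall means works that truly carry the label are being assigned something else.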
Overall, this study provides evidence that OpenAlex can be a reliable alternative to traditional databases for certain types of bibliometric analyses, while also highlighting areas that require further investigation and improvement.
Stats
168.2M works in OpenAlex published between 2000 and 2022, with a peak of 10.2M in 2020.
67% of works in OpenAlex have no references, and 66% have received no citations.
83% of the unique journals in Scopus can be matched to a source in OpenAlex, and 94% of the "active" Scopus journals are matched.
Scopus records more citations per country than the subset of OpenAlex works that are also indexed in Scopus.
Quotes
"OpenAlex is a superset of Scopus and can be a reliable alternative for some analyses, particularly at the country level."
"Additional research is needed to fully comprehend and address OpenAlex's limitations in metadata accuracy and completeness."