
Using OpenAlex to Create Meaningful Global Overlay Maps of Science at the Individual and Institutional Levels

Core Concepts
This study uses OpenAlex data to create global base maps of science onto which data from individual researchers or research institutions can be overlaid, showing where they sit within the broader scientific landscape.
This study proposes a procedure for creating global overlay maps of science using the freely available OpenAlex database. The authors provide six different global base maps covering different time periods that can be used as a foundation for overlaying specific data. The authors demonstrate the overlay technique by creating example maps for an individual researcher (the first author) and a research institution (the Max Planck Institute for Solid State Research). The raw overlay maps highlight general research concepts, while the normalized overlay maps emphasize more specialized concepts. The authors discuss the advantages of their approach, such as the free availability of the underlying data and the ability to explore the research activities of various units. They also acknowledge limitations, including potential issues with the concept assignments in OpenAlex and the normalization method used. The study concludes that the visualization approach based on OpenAlex data can be a useful tool for exploring the research activities of various science units. The authors suggest potential extensions, such as incorporating additional overlay data like publication quality and citation impact.
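The contrast between raw and normalized overlay maps can be illustrated with a small calculation. The following is a minimal sketch, assuming that normalization divides the focal unit's share of papers on a concept by the world's share for the same concept; the study's exact formula is not stated in this summary, and the function name and the toy counts are hypothetical.

```python
def normalized_overlay(unit_counts, world_counts):
    """Ratio of the unit's concept share to the world's concept share.

    Assumption: a share ratio is one plausible normalization; the
    study's actual method may differ.
    """
    unit_total = sum(unit_counts.values())
    world_total = sum(world_counts.values())
    scores = {}
    for concept, n_unit in unit_counts.items():
        n_world = world_counts.get(concept, 0)
        if n_world == 0 or unit_total == 0:
            continue  # no world baseline to compare against
        unit_share = n_unit / unit_total
        world_share = n_world / world_total
        scores[concept] = unit_share / world_share
    return scores


# Toy paper counts per concept (invented for illustration, not OpenAlex data).
world = {"Physics": 500, "Superconductivity": 20, "Biology": 480}
unit = {"Physics": 8, "Superconductivity": 2}

scores = normalized_overlay(unit, world)
for concept in unit:
    print(concept, "raw:", unit[concept], "normalized:", round(scores[concept], 2))
```

In this toy case the general concept has the larger raw count, while the specialized concept has the larger normalized score, which mirrors the pattern the summary describes.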
The OpenAlex database used in this study contains 243,053,925 documents. The authors used the following time periods for their analyses:
1800-2022: 237,876,541 documents
2008-2022: 134,092,007 documents
2013-2022: 95,438,459 documents
2018-2022: 47,665,990 documents
2022: 8,496,167 documents

"The visualization approach based on OpenAlex data that we introduced in this study can be used for any science unit."
"Normalization of the overlay maps compares the focal unit's activity to the world's activity. Thereby, the higher level concepts lose prominence compared to lower level concepts."

Deeper Inquiries

How could the normalization method be further improved to better account for differences in scientific fields and hierarchical levels of concepts?

To make the normalization method account better for differences between scientific fields and hierarchical levels of concepts, several adjustments can be considered:
Field-specific Normalization: Instead of comparing all concepts regardless of their scientific field, the overlay data could be normalized within the field each concept belongs to. Concepts would then be compared only against other concepts in the same field, giving a more accurate picture of the research focus.
Hierarchical Level Consideration: Weights or scaling factors based on the hierarchical level of each concept could be used to adjust how prominently concepts appear in the overlay maps.
Thresholds for Concept Visibility: A minimum number of papers per concept could prevent concepts with minimal representation from dominating the visualization, so that only well-supported concepts are displayed prominently.
Field-specific Weighting: Concepts could be weighted differently depending on their scientific field. Concepts from highly specialized fields may need different normalization factors than more general concepts.
With these refinements, the normalization method could capture differences between fields and hierarchical levels more closely and yield more informative global overlay maps.
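Two of these adjustments, a visibility threshold and level-based weights, can be sketched as a post-processing step on normalized scores. This is an illustrative sketch only: the function name, the threshold of three papers, the weight values, and the toy data are all assumptions, not parameters taken from the study.

```python
def adjust_scores(scores, unit_counts, levels, min_papers=3, level_weights=None):
    """Drop sparsely supported concepts and rescale scores by concept level.

    scores:        normalized score per concept
    unit_counts:   number of the unit's papers per concept
    levels:        hierarchical level per concept (0 = most general)
    min_papers:    hypothetical visibility threshold
    level_weights: hypothetical weight per level; levels not listed get 1.0
    """
    level_weights = level_weights or {}
    adjusted = {}
    for concept, score in scores.items():
        if unit_counts.get(concept, 0) < min_papers:
            continue  # too few papers to display this concept prominently
        weight = level_weights.get(levels.get(concept), 1.0)
        adjusted[concept] = score * weight
    return adjusted


# Toy inputs (invented for illustration).
scores = {"Physics": 1.6, "Superconductivity": 10.0, "Muon spin spectroscopy": 40.0}
unit_counts = {"Physics": 8, "Superconductivity": 5, "Muon spin spectroscopy": 1}
levels = {"Physics": 0, "Superconductivity": 2, "Muon spin spectroscopy": 3}

# Boost level-0 concepts to offset their loss of prominence after normalization.
adjusted = adjust_scores(scores, unit_counts, levels, min_papers=3,
                         level_weights={0: 1.5})
print(adjusted)
```

Here the concept backed by a single paper is removed despite its high normalized score, and the most general concept is scaled up. Suitable thresholds and weights would have to be chosen and validated per field.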

How could the reliability and validity of the concept assignment process in OpenAlex be further enhanced to improve the accuracy of the overlay maps?

To improve the reliability and validity of the concept assignment process in OpenAlex, and with it the accuracy of the overlay maps, the following strategies could be applied:
Refinement of Concept Assignment Algorithms: Continuously refining the assignment algorithms, with regular updates based on feedback and validation studies, can reduce errors and make assignments more reliable.
Validation through Expert Review: Having domain experts review concept assignments can identify and correct misclassifications, so that concepts reflect the actual content and research focus of each publication.
Feedback Mechanism for Users: Letting users report inaccurate or inconsistent concept assignments can reveal problematic areas and provide input for refining the assignment process.
Integration of Machine Learning Techniques: Training machine learning models on a diverse set of publications and concepts can improve the precision of automated concept assignment.
Regular Quality Checks: Regular audits built into the concept assignment workflow can detect and fix issues promptly.
Together, these strategies could make concept assignments in OpenAlex more reliable and valid, and thereby improve the accuracy of overlay maps built on them.
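An expert review or a regular quality check needs some measure of agreement between assigned concepts and expert judgment. The sketch below assumes a small audit sample in which experts have labeled each paper with the concepts they consider correct, and computes micro-averaged precision and recall. The function name, the paper identifiers, and the labels are hypothetical; this is not a procedure described in the study or offered by OpenAlex.

```python
def audit_assignments(assigned, expert):
    """Micro-averaged precision and recall of assigned concepts vs. expert labels.

    assigned: paper id -> set of concepts assigned automatically
    expert:   paper id -> set of concepts chosen by expert reviewers
    """
    tp = fp = fn = 0
    for paper_id, gold in expert.items():
        predicted = assigned.get(paper_id, set())
        tp += len(predicted & gold)   # concepts both sides agree on
        fp += len(predicted - gold)   # assigned but rejected by experts
        fn += len(gold - predicted)   # expected by experts but missing
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy audit sample (invented for illustration).
assigned = {
    "W1": {"Physics", "Superconductivity"},
    "W2": {"Biology", "Chemistry"},
}
expert = {
    "W1": {"Physics", "Superconductivity"},
    "W2": {"Chemistry", "Catalysis"},
}

precision, recall = audit_assignments(assigned, expert)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Tracking such figures across successive audits would show whether changes to the assignment process actually improve agreement with expert judgment.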