Key Concepts
Predicting the most relevant United Nations Sustainable Development Goals (SDGs) for university courses using large language models (LLMs) and fine-tuned smaller foundation models.
Abstract
The researchers present a novel approach to predicting the United Nations Sustainable Development Goals (SDGs) relevant to university courses. They use the large language model PaLM 2 to generate training data by providing course descriptions as input and obtaining the top SDGs predicted by the LLM. This generated data is then used to fine-tune several smaller foundation models, including BERT, mBERT, RoBERTa, XLM-RoBERTa, and BART, for the multi-label SDG prediction task.
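As a rough illustration of the fine-tuning step, the sketch below uses the Hugging Face transformers library to configure a BART checkpoint for multi-label classification over the 17 SDGs. The checkpoint name, example inputs, and label assignments are assumptions for illustration, not the authors' exact setup.

```python
# Hypothetical sketch: fine-tuning BART for multi-label SDG classification.
# The checkpoint and the `texts`/`labels` examples are illustrative assumptions,
# not the paper's exact configuration.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

NUM_SDGS = 17  # one output per UN Sustainable Development Goal

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "facebook/bart-base",
    num_labels=NUM_SDGS,
    problem_type="multi_label_classification",  # uses BCEWithLogitsLoss
)

texts = ["Introduction to renewable energy systems."]  # a stand-in course description
labels = torch.zeros(len(texts), NUM_SDGS)             # multi-hot targets (float)
labels[0, 6] = 1.0  # e.g. SDG 7 (Affordable and Clean Energy), 0-indexed column

inputs = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")
outputs = model(**inputs, labels=labels)
outputs.loss.backward()  # one illustrative training step
```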
The dataset consists of 2,125 English course descriptions from Metropolia University of Applied Sciences, spanning the years 2021-2023. After preprocessing and cleaning the data, the researchers use the LLM to generate the SDG labels for each course, which are then encoded into a binary format for training the smaller models.
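The binary encoding amounts to turning each course's list of LLM-assigned SDGs into a 17-dimensional multi-hot vector, which could be done with scikit-learn as sketched below (the example label lists are invented for illustration):

```python
# Hypothetical sketch of the multi-hot encoding step: each course's
# LLM-assigned SDG list becomes a 17-dimensional 0/1 vector.
from sklearn.preprocessing import MultiLabelBinarizer

llm_labels = [
    [4, 8],       # e.g. a business course tagged with SDG 4 and SDG 8
    [7, 11, 13],  # e.g. an energy course tagged with SDG 7, 11, and 13
]

mlb = MultiLabelBinarizer(classes=list(range(1, 18)))  # SDGs are numbered 1..17
y = mlb.fit_transform(llm_labels)
print(y.shape)  # (2, 17); row 0 marks SDG 4 and SDG 8
```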
The performance of the models is evaluated using precision, recall, and F1-score. The results show that the BART model achieves the highest F1-score of 0.786, outperforming the other fine-tuned models. The researchers also analyze the performance of the models across individual SDGs, identifying areas where certain models excel and highlighting the need to address data imbalances to improve generalization.
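For a multi-label task, these metrics are computed over the per-SDG binary predictions; a sketch of such an evaluation with scikit-learn follows (the 0.5 threshold, micro averaging, and the toy arrays are assumptions, as the summary does not specify them):

```python
# Hypothetical sketch of the evaluation step: model probabilities are
# thresholded to binary predictions and scored against the reference labels.
import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = np.array([[1, 0, 1], [0, 1, 0]])              # reference multi-hot labels
probs = np.array([[0.9, 0.2, 0.7], [0.1, 0.8, 0.4]])   # model output probabilities
y_pred = (probs >= 0.5).astype(int)                    # 0.5 threshold is an assumption

print("precision:", precision_score(y_true, y_pred, average="micro"))
print("recall:   ", recall_score(y_true, y_pred, average="micro"))
print("F1:       ", f1_score(y_true, y_pred, average="micro"))
```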
This research contributes to the integration of SDGs in higher education by providing a practical methodology for universities to assess the alignment of their course offerings with the UN's Sustainable Development Goals. The findings open up avenues for further research and implementation of similar approaches to foster sustainable practices in academic institutions worldwide.
Statistics
The dataset consists of 2,125 English course descriptions from Metropolia University of Applied Sciences, spanning the years 2021-2023.
The dataset was split 70:15:15 into training, validation, and testing subsets.
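Such a 70:15:15 split can be reproduced with two successive calls to scikit-learn's train_test_split; the sketch below is an assumption about how it might be done, with stand-in data and an arbitrary random seed:

```python
# Hypothetical sketch of the 70:15:15 split; `texts` and `labels` stand in
# for the course descriptions and their multi-hot SDG vectors.
from sklearn.model_selection import train_test_split

texts = [f"course description {i}" for i in range(2125)]
labels = [[0] * 17 for _ in range(2125)]

# First carve off 30% for validation+test, then split that part half-and-half.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    texts, labels, test_size=0.30, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.50, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # roughly 1487 / 319 / 319
```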
Quotes
"The best performing model in our experiments was BART with an F1-score of 0.786."
"The nuanced performance variations across the models underscore the significance of model selection tailored to specific NLP tasks' requirements."