Evaluating the Instructional Quality of Code Comments Generated by Large Language Models for Novice Programmers
Large Language Models (LLMs) show promise in generating code comments that can support novice programmers' learning, but the educational effectiveness of these comments remains under-evaluated. This study assesses the instructional quality of code comments produced by GPT-4, GPT-3.5-Turbo, and Llama2 against expert-developed comments, focusing on their suitability for novice programmers.