Core Concepts
Integrating large language models such as GPT-3.5-Turbo as AI tutors within automated programming assessment systems can offer timely feedback and scalability, but it also faces challenges, including generic responses and student concerns that over-reliance could inhibit learning progress.
Abstract
This study explores the integration of a large language model, specifically OpenAI's GPT-3.5-Turbo, as an AI tutor within the Artemis automated programming assessment system (APAS). Through a combination of empirical data collection and an exploratory survey, the researchers identified two main user personas:
Continuous Feedback - Iterative Ivy: Students who relied heavily on the AI tutor's feedback before submitting their final solutions to the APAS. This group used the AI tutor to guide their understanding and iteratively refine their code.
Alternating Feedback - Hybrid Harry: Students who alternated between seeking AI tutor feedback and submitting their solutions directly to the APAS. This group adopted a trial-and-error approach to problem-solving.
The findings highlight both advantages and challenges of the AI tutor integration. Advantages include timely feedback and scalability; challenges include generic responses, a lack of interactivity, operational dependencies, and student concerns about over-reliance and inhibited learning progress. The researchers also identified instances where the AI tutor revealed the solution or provided inaccurate feedback.
Overall, the study demonstrates the potential of large language models as AI tutors in programming education, but also underscores the need for further refinement to address the identified limitations and ensure an optimal learning experience for students.
Stats
The AI tutor was able to recognize and provide feedback on logical and semantic issues in student code, such as incorrect loop termination conditions.
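As a hypothetical illustration (not taken from the study's data), the kind of logical issue the AI tutor flagged might look like the following off-by-one loop termination condition; the function names and values here are invented for the example:

```python
def sum_first_n_buggy(values, n):
    # Bug the tutor might flag: range(n - 1) terminates one iteration
    # early, so the n-th value is never added to the total.
    total = 0
    for i in range(n - 1):
        total += values[i]
    return total


def sum_first_n_fixed(values, n):
    # Corrected termination condition: range(n) iterates exactly n times.
    total = 0
    for i in range(n):
        total += values[i]
    return total


data = [2, 4, 6, 8]
print(sum_first_n_buggy(data, 3))  # 6 — silently skips the third element
print(sum_first_n_fixed(data, 3))  # 12
```

Because such code runs without raising an error, a compiler or unit-test harness may miss it unless a test covers the boundary, which is why feedback on logical and semantic issues is valuable here.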
66.6% of the AI tutor's feedback was categorized as useful and 26.6% as not useful; the not-useful cases included 3 instances where the solution was revealed and 4 hallucinations.
Quotes
"The AI-Tutor's responses were perceived as too generic. Students preferred more context-specific feedback pointing directly to improvement areas in the code."
"Students expressed the wish for enhanced interactive capabilities with the AI-Tutor, such as the ability to ask follow-up questions after initial feedback."
"Some students feared that using the AI-Tutor might lead to over-reliance which would slow down their learning progress."