Core Concept
Graders can improve accuracy by grading similar submissions consecutively.
Summary
1. Introduction
Programming problems are common in exams.
Grading is challenging because student submissions are diverse.
Human grading remains important for free-response questions.
2. Natural Grading Error
Quantifying inconsistencies in historical grading sessions.
Troubling inconsistencies in grades assigned by graders.
Linear relationship between submission similarity and grading error.
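A linear relationship of this kind can be checked with an ordinary least-squares fit. The sketch below is only illustrative: the function name `fit_error_vs_similarity` and its inputs are assumptions, not the authors' analysis code, and it presumes one similarity score and one grading-error value per graded submission.

```python
import numpy as np

def fit_error_vs_similarity(similarity, grading_error):
    """Fit grading_error ~ slope * similarity + intercept by least squares.

    Both arguments are hypothetical 1-D sequences of equal length.
    """
    x = np.asarray(similarity, dtype=float)
    y = np.asarray(grading_error, dtype=float)
    # np.polyfit with deg=1 returns the coefficients [slope, intercept].
    slope, intercept = np.polyfit(x, y, deg=1)
    return float(slope), float(intercept)
```

A negative fitted slope would mean that grading error falls as similarity rises; the sign and size of the slope depend entirely on the data supplied.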
3. Methods
Generating program embeddings for all student submissions.
Hypothesizing that similarity influences grader accuracy.
Introducing algorithms to assist human grading.
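To make the idea of grading similar submissions consecutively concrete, here is a minimal sketch that orders submissions by embedding similarity. It is a generic greedy nearest-neighbour ordering over cosine similarity, not the cluster, snake, or petal algorithm from the paper, whose details are not given in this summary. The function names and the assumption that embeddings arrive as rows of a NumPy array are hypothetical.

```python
import numpy as np

def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of `embeddings`."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    # Guard against division by zero for all-zero embeddings.
    unit = embeddings / np.clip(norms, 1e-12, None)
    return unit @ unit.T

def greedy_similarity_order(embeddings: np.ndarray, start: int = 0) -> list[int]:
    """Return a grading order in which each submission is followed by
    the most similar submission that has not been graded yet.

    Illustrative stand-in only; not one of the paper's algorithms.
    """
    n = len(embeddings)
    if n == 0:
        return []
    sim = cosine_similarity_matrix(embeddings)
    visited = np.zeros(n, dtype=bool)
    visited[start] = True
    order = [start]
    for _ in range(n - 1):
        # Mask already-graded submissions so they cannot be picked again.
        candidates = np.where(visited, -np.inf, sim[order[-1]])
        nxt = int(np.argmax(candidates))
        visited[nxt] = True
        order.append(nxt)
    return order
```

A random shuffle of the same indices would serve as the baseline ordering against which such a scheme could be compared.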
4. Experimental Results
Embeddings show meaningful similarity scores.
Graders score more accurately when they have previously seen similar submissions.
Algorithms improve accuracy over random baseline.
5. Discussion
Proposed algorithms enhance grading accuracy.
The cluster algorithm yields the lowest grading error, while the snake algorithm has the lowest validation distance.
The petal algorithm offers a balanced trade-off between the two.
Statistics
The authors tested the hypothesis that graders may score a student submission more accurately if they have previously seen similar submissions.