Using LLMs for Knowledge Component-level Correctness Labeling in Open-ended Coding Problems
arXiv:2602.17542v1 Announce Type: new Abstract: Fine-grained skill representations, commonly referred to as knowledge components (KCs), are fundamental to many approaches in student modeling and learning analytics. However, KC-level correctness labels are rarely available in real-world datasets, especially for open-ended programming tasks where solutions typically involve multiple KCs simultaneously. Simply propagating problem-level correctness to all associated KCs obscures partial mastery and often leads to poorly fitted learning curves. To address this challenge, we propose an automated framework that leverages large language models (LLMs) to label KC-level correctness directly from student-written code. Our method assesses whether each KC is correctly applied and further introduces a temporal context-aware Code-KC mapping mechanism to better align KCs with individual student code. We evaluate the resulting KC-level correctness labels in terms of learning curve fit and predictive performance using the power law of practice and the Additive Factors Model. Experimental results show that our framework leads to learning curves that are more consistent with cognitive theory and improves predictive performance, compared to baselines. Human evaluation further demonstrates substantial agreement between LLM and expert annotations.
Executive Summary
This article proposes a framework that uses large language models (LLMs) to automate knowledge component (KC)-level correctness labeling for open-ended coding problems. The framework assesses whether each KC is correctly applied in student-written code and introduces a temporal context-aware Code-KC mapping mechanism to better align KCs with individual submissions. Experiments show improved learning curve fit and predictive performance over baselines, and human evaluation finds substantial agreement between LLM and expert annotations. The work has clear implications for student modeling and learning analytics, enabling more accurate and nuanced assessment of partial mastery.
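The learning-curve evaluation the paper mentions rests on the power law of practice: error rates should decay as a power function of practice opportunities, error(t) = a · t^(-b). A minimal sketch of how such a curve can be fit (via log-log linear regression on synthetic data; the paper's actual fitting procedure and data are not specified here):

```python
import numpy as np

def fit_power_law(opportunities, error_rates):
    """Fit error = a * t^(-b) by linear regression in log-log space.

    Returns (a, b); a larger b means faster learning.
    """
    log_t = np.log(opportunities)
    log_e = np.log(error_rates)
    slope, intercept = np.polyfit(log_t, log_e, 1)
    return float(np.exp(intercept)), float(-slope)

# Synthetic error rates that follow the power law exactly (a=0.6, b=0.4),
# so the fit should recover the generating parameters.
t = np.arange(1, 11)
errors = 0.6 * t ** (-0.4)
a, b = fit_power_law(t, errors)
```

A KC whose labels produce a cleanly decaying curve of this form is evidence that the labeling isolates a coherent skill; flat or noisy curves suggest the KC (or its correctness labels) is poorly specified.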
Key Points
- ▸ Proposes an automated framework for KC-level correctness labeling using LLMs
- ▸ Introduces a temporal context-aware Code-KC mapping mechanism
- ▸ Demonstrates improved learning curve fit and predictive performance
- ▸ Achieves substantial agreement between LLM and expert annotations
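The first two points can be sketched as a prompt-and-parse pipeline. Everything below is a hypothetical illustration: the function names, prompt wording, and JSON response format are assumptions, not the paper's actual implementation, and the LLM call itself is stubbed out with a canned response.

```python
import json

def build_prompt(code, kcs):
    """Assemble a labeling prompt asking for per-KC correctness verdicts."""
    kc_list = "\n".join(f"- {kc}" for kc in kcs)
    return (
        "For each knowledge component below, state whether the student's "
        "code applies it correctly. Reply as JSON mapping KC name to "
        "true/false.\n\nKnowledge components:\n"
        f"{kc_list}\n\nStudent code:\n{code}"
    )

def parse_labels(response_text, kcs):
    """Parse the model's JSON reply into a per-KC correctness dict."""
    verdicts = json.loads(response_text)
    # Conservatively treat any KC the model omitted as not demonstrated.
    return {kc: bool(verdicts.get(kc, False)) for kc in kcs}

code = "total = 0\nfor x in nums:\n    total += x"
kcs = ["for-loop", "accumulator-variable", "list-indexing"]
prompt = build_prompt(code, kcs)
# Stand-in for an actual LLM response:
labels = parse_labels('{"for-loop": true, "accumulator-variable": true}', kcs)
```

The temporal context-aware mapping would additionally condition on the student's earlier submissions when deciding which KCs a given piece of code exercises; that step is omitted from this sketch.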
Merits
Strength in Methodology
Using LLMs to automate KC-level correctness labeling is a novel and effective approach: it recovers the partial mastery that problem-level correctness labels obscure, without requiring manual expert annotation at scale.
Demerits
Limitation in Generalizability
The framework's effectiveness may be limited to specific domains or tasks, and further research is needed to explore its generalizability to other areas of education and learning analytics.
Expert Commentary
The article makes a significant contribution to student modeling and learning analytics by automating KC-level correctness labeling, a task for which labeled data is rarely available in real-world datasets. That said, the evaluation leaves open questions about generalizability and scalability: results are reported for open-ended programming tasks, and further work is needed to test the approach in other domains. The potential applications in education policy and practice are substantial. As the field evolves, it will be essential to combine human expertise with automated methods, ensuring both are leveraged to advance our understanding of student learning and behavior.
Recommendations
- ✓ Future research should aim to explore the framework's generalizability and scalability to other domains and tasks, and to develop more robust and widely applicable methods for KC-level correctness labeling.
- ✓ The use of LLMs for KC-level correctness labeling should be further investigated in various educational settings, with a focus on the practical applications and policy implications of this approach.