Grading Code in Higher Education: Challenges and Opportunities ✏️
The Current Landscape
In today's digital-first educational environment, professors and teaching assistants face unique challenges in grading coding assignments. Despite the widespread adoption of Learning Management Systems (LMS), these platforms often fall short at evaluating programming work effectively.
The Manual Burden
One of the most striking findings from our research is the sheer time investment required for grading. Professors report spending up to two hours per assignment, meticulously downloading files, running tests, and reviewing code line by line. This process becomes particularly daunting when dealing with large class sizes.
A College Boreal instructor noted that reviewing a single assignment with 1,000 lines of code can take up to an hour, and that’s just one student out of 60+.
Automated Solutions: Promise and Limitations
While some institutions have developed in-house automated grading tools, these solutions have significant limitations. They can handle basic functionality testing but struggle with evaluating:
Code structure and organization
Design patterns
Student effort and understanding
Partial credit scenarios
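To make the gap concrete, here is a minimal sketch of the kind of functionality testing such tools perform. It is not any institution's actual grader: the function name `grade_functionality`, the two sample submissions, and the test cases are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Tuple

# Each test case pairs a tuple of arguments with the expected result.
# The cases used below are made-up examples, not a real assignment.
TestCase = Tuple[tuple, object]


def grade_functionality(func: Callable, cases: List[TestCase]) -> float:
    """Return the fraction of test cases that the submission passes."""
    passed = 0
    for args, expected in cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            # A crash counts as one failed case, not a failed assignment.
            pass
    return passed / len(cases) if cases else 0.0


# Two hypothetical submissions for "return the largest item in a list".
def clean_submission(items):
    return max(items)


def messy_submission(items):
    # Same behaviour, but harder to read and maintain.
    x = items[0]
    for i in range(len(items)):
        if items[i] > x:
            x = items[i]
    return x


cases = [(([3, 1, 2],), 3), (([-5, -2],), -2), (([7],), 7)]
print(grade_functionality(clean_submission, cases))  # 1.0
print(grade_functionality(messy_submission, cases))  # 1.0
```

Both submissions receive the same score, because a check like this only observes behaviour. Structure, design choices, and how well the student understood the problem never enter the result, which is why those items still need a human reader.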
The AI Challenge
The emergence of AI tools has added another layer of complexity to the grading process. Instructors are now tasked with not only evaluating code quality but also determining whether students used AI inappropriately. Current detection tools are limited, making this process largely manual and time-intensive.
“Sometimes we can spot AI use during demos - we ask the students questions and if there is any plagiarism we catch it at that time.”
Time Constraints and Quality Feedback
Teaching Assistants face particularly strict time constraints, often limited to spending no more than 5 minutes per assignment. This limitation makes it challenging to provide the kind of detailed, constructive feedback that helps students learn and improve.
Looking Forward
The educational community is clearly calling for more sophisticated tools that can:
Automate routine grading tasks while preserving human oversight
Provide meaningful feedback at scale
Better detect and handle AI-generated code
Support partial credit scenarios
Evaluate coding style and effort
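One way to picture "automation with human oversight" is a triage step that finalizes only the clear-cut cases and routes everything else to a person. The sketch below is an assumption about how such a step might look, not a description of an existing tool: the `Result` class, the `triage` function, the flag wording, and the 0.9 threshold are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Result:
    """Outcome of an automated test run for one submission (hypothetical shape)."""
    student: str
    tests_passed: int
    tests_total: int
    crashed: bool = False
    flags: List[str] = field(default_factory=list)


def triage(result: Result, auto_threshold: float = 0.9) -> str:
    """Return 'auto' for clear passes and 'review' for anything needing a human.

    The threshold is an illustrative assumption, not a recommended value.
    """
    if result.tests_total == 0:
        result.flags.append("no tests ran")
        return "review"
    score = result.tests_passed / result.tests_total
    if result.crashed:
        result.flags.append("submission crashed")
    if score < auto_threshold:
        result.flags.append("partial credit needs a human decision")
    return "review" if result.flags else "auto"


results = [
    Result("student_a", 10, 10),
    Result("student_b", 6, 10),
    Result("student_c", 0, 10, crashed=True),
]
for r in results:
    print(r.student, triage(r), r.flags)
```

In this sketch only the fully passing submission is finalized automatically. The partial and crashed submissions arrive in the grader's queue with the reason attached, so limited grading time goes to the cases where judgment matters.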
As we move forward, the goal isn't to completely automate the grading process but to find the right balance between efficiency and educational value. The ideal solution would free up instructors' time while ensuring students receive the guidance they need to develop strong programming skills.
One professor summed it up perfectly: “There’s lots of room with what we’re seeing in the generative AI space to make tooling that could do some preliminary analysis of a coding assignment.”