Grading Code in Higher Education: Challenges and Opportunities ✏️

The Current Landscape

In today's digital-first educational environment, professors and teaching assistants face unique challenges when grading coding assignments. Despite the widespread adoption of Learning Management Systems (LMS), these platforms often fall short at evaluating programming work effectively.

The Manual Burden

One of the most striking findings from our research is the sheer time investment required for grading. Professors report spending up to two hours per assignment, meticulously downloading files, running tests, and reviewing code line by line. This process becomes particularly daunting when dealing with large class sizes.

A College Boreal instructor noted that reviewing a single assignment with 1,000 lines of code can take up to an hour, and that’s just one student out of 60+.

Automated Solutions: Promise and Limitations

While some institutions have developed in-house automated grading tools, these solutions have significant limitations. They can handle basic functionality testing but struggle with evaluating:

  • Code structure and organization

  • Design patterns

  • Student effort and understanding

  • Partial credit scenarios
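To make the gap concrete, here is a minimal sketch of what basic functionality testing with simple weighted partial credit might look like. It is not taken from any institution's tool; the names `WeightedCheck`, `grade`, and `student_abs` are hypothetical and chosen only for illustration. Note what it cannot see: structure, design, and understanding are invisible to a checker like this, which is exactly the limitation listed above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class WeightedCheck:
    """One functional check with a weight (hypothetical structure)."""
    name: str
    weight: float
    check: Callable[[Callable], bool]


def grade(submission: Callable, checks: List[WeightedCheck]) -> Tuple[float, Dict[str, bool]]:
    """Run every check and return (percentage score, per-check results).

    A check that raises an exception counts as failed, so one crash
    does not wipe out credit earned on the other checks.
    """
    earned = 0.0
    results: Dict[str, bool] = {}
    for c in checks:
        try:
            passed = bool(c.check(submission))
        except Exception:
            passed = False
        results[c.name] = passed
        if passed:
            earned += c.weight
    total = sum(c.weight for c in checks)
    score = 100.0 * earned / total if total else 0.0
    return score, results


# Hypothetical student submission: an absolute-value function
# that forgets to handle negative input.
def student_abs(x):
    return x


checks = [
    WeightedCheck("positive input", 1.0, lambda f: f(3) == 3),
    WeightedCheck("zero input", 1.0, lambda f: f(0) == 0),
    WeightedCheck("negative input", 2.0, lambda f: f(-3) == 3),
]

score, results = grade(student_abs, checks)
print(score)    # percentage of weighted checks passed
print(results)  # which checks passed, for feedback to the student
```

The weights here are an assumption about how an instructor might value each case. Even with weights, the score only reflects which outputs were correct, so a human reviewer would still need to judge the code itself.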

The AI Challenge

The emergence of AI tools has added another layer of complexity to the grading process. Instructors are now tasked with not only evaluating code quality but also determining whether students used AI inappropriately. Current detection tools are limited, making this process largely manual and time-intensive.

“Sometimes we can spot AI use during demos - we ask the students questions and if there is any plagiarism we catch it at that time.”

Time Constraints and Quality Feedback

Teaching assistants face particularly strict time constraints and are often limited to no more than 5 minutes per assignment. This makes it challenging to provide the kind of detailed, constructive feedback that helps students learn and improve.

Looking Forward

The educational community is clearly calling for more sophisticated tools that can:

  • Automate routine grading tasks while preserving human oversight

  • Provide meaningful feedback at scale

  • Better detect and handle AI-generated code

  • Support partial credit scenarios

  • Evaluate coding style and effort

As we move forward, the goal isn't to completely automate the grading process but to find the right balance between efficiency and educational value. The ideal solution would free up instructors' time while ensuring students receive the guidance they need to develop strong programming skills.

One professor summed it up perfectly: “There’s lots of room with what we’re seeing in the generative AI space to make tooling that could do some preliminary analysis of a coding assignment.”