Breaking Point: The End of Unsupervised Programming Assessments
Dr. Ghulam Mubashar Hassan from the University of Western Australia discusses how short-term assessment security options for programming assessments have changed over the past year.
05 June 2025
I have been teaching fundamental programming courses for over a decade. The release of ChatGPT, a Generative Artificial Intelligence (GenAI) system, on 30 November 2022 marked a significant milestone in AI development. As a computer scientist and AI researcher, I was excited by this advancement. From an academic perspective, however, it raised serious concerns about the integrity of assessments in programming education.
These concerns led to an award-winning research study examining the impact of GenAI on assessment integrity, a collaborative effort involving nine researchers from various engineering disciplines across multiple Australian universities. The study found that some unsecured/unsupervised programming assessments were still resistant to GenAI. A follow-up study a year later evaluated the influence of improved GenAI tools on assessment validity. It again found unsecured programming assessments that resisted GenAI, although these were clearly identified as short-term options.
From a software engineering perspective, both studies revealed that GenAI tools could easily solve low- to medium-difficulty programming tasks. However, their effectiveness diminished significantly when assessments required processing external data files (CSVs, images, etc.) and demanded computational thinking beyond basic coding. These findings affirmed the continued relevance and robustness of large-scale projects (e.g., over 150 lines of code analysing real datasets) in upholding academic integrity in programming assessments.
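To make the contrast concrete, here is a minimal sketch, not drawn from the study itself, of the kind of external-data task that resisted GenAI at the time. The file name, column layout, and "dry" threshold are hypothetical:

```python
import csv
from statistics import mean

# Hypothetical assessment data: per-station monthly rainfall readings.
# In a real assessment, students would receive a large CSV they have not seen;
# here we write a tiny sample file so the sketch is self-contained.
SAMPLE = (
    "station,month,rainfall_mm\n"
    "Perth,Jan,12.4\nPerth,Feb,8.1\n"
    "Albany,Jan,30.2\nAlbany,Feb,25.7\n"
)
with open("rainfall.csv", "w") as f:
    f.write(SAMPLE)

# Group the readings by station: this is where students must reason about
# the structure of the data rather than reproduce boilerplate code.
readings = {}
with open("rainfall.csv", newline="") as f:
    for row in csv.DictReader(f):
        readings.setdefault(row["station"], []).append(float(row["rainfall_mm"]))

# Summarise each station and flag unusually dry ones (threshold is illustrative).
for station, values in sorted(readings.items()):
    avg = mean(values)
    note = " (dry)" if avg < 15 else ""
    print(f"{station}: mean {avg:.1f} mm over {len(values)} months{note}")
```

A full project of this kind would run to well over 150 lines against a real dataset; the point is that the reasoning lives in handling the data, not in any single language construct.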
"Any student with enough prompting skill or access to the right GenAI tools can now complete any unsecured/unsupervised programming-based assessment"
Over the past year, competition among GenAI platforms has intensified. Specialized coding-focused tools, such as ChatGPT with Python support, IBM's WatsonX, and Amazon Q Developer, have emerged. These platforms interpret external data better and generate more generalized, adaptable code. Prompt engineering has also become more accessible, further enhancing the user experience. Unfortunately, these advancements have dealt a significant blow to the integrity of programming assessments, as it is now virtually impossible to verify whether submitted code was produced independently by a student. My team has explored many tactics, but any student with enough prompting skill or access to the right GenAI tools can now complete any unsecured/unsupervised programming-based assessment. The only remaining way to check understanding and capability is via secure/supervised assessments.
This has created a profound dilemma in computing education. Under current university resource constraints, it is increasingly difficult to assess students’ programming and computational thinking skills through take-home assignments in introductory courses. Consequently, there is growing reliance on invigilated assessments to gauge genuine learning, effectively a regression to traditional exam-centric evaluation models.
In my view, there is an urgent need to reimagine the education system to integrate GenAI into teaching and learning processes. Doing so will require a redefinition of educational goals, methods, and structures, potentially rendering existing paradigms obsolete. My team is currently working on this, and I look forward to sharing new ideas in the future.
With the rapid evolution of computing technologies, this transformation must occur swiftly. Otherwise, the gap between industry expectations and academic preparation will continue to widen, risking the obsolescence of current educational frameworks. This disconnect is already visible, as major tech companies like Google shift away from conventional hiring models, opting instead for skills-based recruitment that bypasses formal degrees altogether.
Table 10 in our 2024 study provides short- and long-term assessment security and integration opportunities. While many of the short-term solutions have since become obsolete, readers should take note of the long-term options that remain viable. The supplementary materials also provide procedures for conducting your own risk assessment.
Note: This article was improved with GenAI tools. "If you know the difference between a correct and an incorrect outcome, then GenAI tools are highly productive and efficient."
Dr. Ghulam Mubashar Hassan
The University of Western Australia
