The documents distributed by this server have been provided by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Fooling it - Student Attacks on Automatic Short Answer Grading

Author: Anna Filighera, Tim Steuer, Christoph Rensing
Date: September 2020
Kind: In proceedings
Publisher: Springer International Publishing
Book title: Addressing Global Challenges and Quality Education
Editor: Carlos Alario-Hoyos, María Jesús Rodríguez-Triana, Maren Scheffel, Inmaculada Arnedillo-Sánchez, Sebastian Maximilian Dennerlein
Keywords: Automatic Short Answer Grading, Adversarial Attacks, Automatic Assessment
Abstract: Modern machine learning approaches have been shown to be vulnerable to adversarial attacks in many fields. This is a critical weakness, especially for models that are expected to function in an adversarial environment, such as automatic grading models in exams. However, as most of these attacks are limited in their success rate or in their applicability to diverse scenarios, or require mathematical expertise from the attacker, the question arises to what extent students themselves are even capable of fooling state-of-the-art grading models. This work aims to investigate this question for the short answer question format. For this purpose, we tasked students of an educational technologies university course with probing a state-of-the-art automatic short answer grading model for weaknesses. Of the fourteen active participants, only one reported the model to be sufficiently free of deficits. The following weaknesses were identified by the students: a disregard for negation, no plagiarism detection, correct answers not being predicted as such and oversensitivity to small linguistic changes in answers, triggers, and keywords.
Full paper (pdf)
