
Computer-assisted assessment of free-text answers

Published online by Cambridge University Press: 01 December 2009

Diana Pérez-Marín*
Affiliation:
Language and Computer Systems I Department, Computer Science Faculty, Office 2025, Ampliación del Rectorado Building, Tulipán Street, 28933 Móstoles, Universidad Rey Juan Carlos, Madrid, Spain; e-mail: diana.perez@urjc.es
Ismael Pascual-Nieto*
Affiliation:
Computer Science Department of the Universidad Autónoma of Madrid, Calle Francisco Tomás y Valiente, 11, Cantoblanco 28049, Madrid, Spain
Pilar Rodríguez*
Affiliation:
Computer Science Department of the Universidad Autónoma of Madrid, Calle Francisco Tomás y Valiente, 11, Cantoblanco 28049, Madrid, Spain

Abstract

The automatic assessment of students’ free-text answers has recently received much attention, driven by the need to explore and exploit new and more complex computer-based assessment methods. This paper reviews the state of the art in the field, focusing on the techniques that underpin these systems and the metrics used to evaluate them. Although the ideal system is still a long way off, existing systems are already used commercially and as a second opinion in exams such as the GMAT, which demonstrates the uptake of the field.

Type
Articles
Copyright
Copyright © Cambridge University Press 2009

