Reading comprehension (RC) tests involve reading a short passage of text and answering a series of questions about that text. We present a methodology for evaluating the application of modern natural language technologies to the task of responding to RC tests. Our work is based on ABCs (Abduction Based Comprehension system), an automated system for taking tests that require short answer phrases as responses. A central goal of ABCs is to serve as a testbed for understanding the roles that various linguistic components play in answering reading comprehension questions. The heart of ABCs is an abductive inference engine that provides three key capabilities: (1) first-order logical representation of relations between entities and events in the text, together with rules to perform inference over such relations; (2) graceful degradation, because the inclusion of abduction in the reasoning engine avoids the brittleness that can be problematic in knowledge representation and reasoning systems; and (3) system transparency, in that the types of abductive inferences made over an entire corpus provide cues as to where the system is performing poorly and indicate where existing knowledge is inaccurate or new knowledge is required. ABCs, with certain sub-components not yet automated, finds the correct answer phrase nearly 35 percent of the time under a strict evaluation metric and 45 percent of the time under a looser, inexact metric on held-out evaluation data. Performance varies across question types, from over 50 percent on who questions down to just over 10 percent on what questions. We present an analysis of the roles of individual components and of the impact of various characteristics of the abductive proof procedure on overall system performance.
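To make the abductive machinery concrete, the following is a minimal, self-contained Python sketch of cost-based abduction over ground first-order relations. It is illustrative only and is not the ABCs implementation: the relation names (event, agent, answer_who), the single rule, the uniform assumption cost, and the tiny fact base are all hypothetical.

# A minimal sketch of cost-based abductive inference over ground
# first-order relations, in the spirit of the engine described above.
# NOT the ABCs implementation: all relation names, the rule, the fact
# base, and the uniform assumption cost are hypothetical.

import itertools

FACTS = {
    ("event", "e1", "q1"),         # event e1 is what question q1 asks about
    ("agent", "e1", "the_mayor"),  # the agent of event e1 is "the_mayor"
}

RULES = [
    # head <- body: a who-question q is answered by x when x is the
    # agent of the event that the question asks about.
    (("answer_who", "?q", "?x"),
     [("event", "?e", "?q"), ("agent", "?e", "?x")]),
]

ASSUMPTION_COST = 1.0  # price of abducing a literal instead of proving it
_fresh = itertools.count()

def walk(term, bindings):
    while term.startswith("?") and term in bindings:
        term = bindings[term]
    return term

def unify(a, b, bindings):
    """Return bindings extended so literals a and b match, else None."""
    if len(a) != len(b):
        return None
    bindings = dict(bindings)
    for x, y in zip(a, b):
        x, y = walk(x, bindings), walk(y, bindings)
        if x == y:
            continue
        if x.startswith("?"):
            bindings[x] = y
        elif y.startswith("?"):
            bindings[y] = x
        else:
            return None
    return bindings

def rename(head, body):
    """Give a rule fresh variable names so repeated uses cannot clash."""
    tag = "_%d" % next(_fresh)
    rn = lambda lit: tuple(t + tag if t.startswith("?") else t for t in lit)
    return rn(head), [rn(lit) for lit in body]

def prove(goal, bindings, depth=3):
    """Yield (bindings, assumed_literals) pairs that establish goal."""
    for fact in FACTS:                        # 1. match a known fact
        b = unify(goal, fact, bindings)
        if b is not None:
            yield b, []
    if depth:                                 # 2. backward-chain on a rule
        for head, body in RULES:
            head, body = rename(head, body)
            b = unify(goal, head, bindings)
            if b is not None:
                yield from prove_all(body, b, depth - 1)
    # 3. abduce: assume the goal itself at a cost; this is what keeps a
    # proof alive (graceful degradation) when knowledge is missing.
    yield dict(bindings), [tuple(walk(t, bindings) for t in goal)]

def prove_all(goals, bindings, depth):
    if not goals:
        yield bindings, []
        return
    for b, a1 in prove(goals[0], bindings, depth):
        for b2, a2 in prove_all(goals[1:], b, depth):
            yield b2, a1 + a2

if __name__ == "__main__":
    # The lowest-cost proof wins; the assumed literals of each proof
    # show exactly where knowledge was missing (the transparency property).
    proofs = prove(("answer_who", "q1", "?x"), {})
    bindings, assumed = min(proofs, key=lambda p: ASSUMPTION_COST * len(p[1]))
    print("answer:", walk("?x", bindings))    # -> the_mayor
    print("assumed:", assumed)                # -> [] (fully proved)

Running the sketch prints a zero-assumption answer. Deleting the agent fact instead yields a cost-1 proof whose single assumed literal records the missing agent relation, illustrating both the graceful-degradation and transparency properties claimed above.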