
A multifaceted approach to evaluating expert systems

Published online by Cambridge University Press:  27 February 2009

Leonard Adelman
Department of Operations Research and Engineering, George Mason University, Fairfax, VA 22030

James Gualtieri
Department of Psychology, George Mason University, Fairfax, VA 22030

Sharon L. Riedel
U.S. Army Research Institute, ATTN: PERI-RK, P.O. Box 3407, Ft. Leavenworth, KS 66027

Abstract

A multifaceted approach to evaluating expert systems is presented. The approach has three facets: a technical facet, for “looking inside the black box”; an empirical facet, for assessing the system’s impact on performance; and a subjective facet, for obtaining users’ judgments about the system. Such an approach is needed to test a system against the different types of criteria of interest to sponsors and users, and it is consistent with evolving life-cycle paradigms. Moreover, it leads to the application of different evaluation methods to answer different types of evaluation questions; the methods appropriate to each facet are reviewed.
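One common way to combine facet-level results into an overall assessment is a weighted, multiattribute-style aggregation across criteria. The Python sketch below is a minimal, hypothetical illustration: the facet names follow the abstract, but the weights, scores, and the simple weighted-sum rule are assumptions for illustration only, not the authors' procedure.

# Minimal, hypothetical sketch: aggregating normalized scores from the three
# evaluation facets with a simple weighted sum. Facet names follow the
# abstract; all weights and scores below are illustrative placeholders.

FACET_WEIGHTS = {          # hypothetical importance weights
    "technical": 0.40,     # "looking inside the black box"
    "empirical": 0.35,     # impact on performance
    "subjective": 0.25,    # users' judgments about the system
}

facet_scores = {           # hypothetical normalized scores in [0, 1]
    "technical": 0.80,
    "empirical": 0.65,
    "subjective": 0.90,
}

def overall_score(scores: dict, weights: dict) -> float:
    """Weighted aggregate across facets; weights are renormalized to sum to 1."""
    total = sum(weights.values())
    return sum(scores[f] * weights[f] for f in weights) / total

print(f"Overall evaluation score: {overall_score(facet_scores, FACET_WEIGHTS):.2f}")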

Type: Articles
Copyright © Cambridge University Press 1994

