
A multifaceted approach to evaluating expert systems

Published online by Cambridge University Press:  27 February 2009

Leonard Adelman
Department of Operations Research and Engineering, George Mason University, Fairfax, VA 22030

James Gualtieri
Department of Psychology, George Mason University, Fairfax, VA 22030

Sharon L. Riedel
U.S. Army Research Institute, ATTN: PERI-RK, P.O. Box 3407, Ft. Leavenworth, KS 66027

Abstract

A multifaceted approach to evaluating expert systems is presented. The approach has three facets: a technical facet, for “looking inside the black box”; an empirical facet, for assessing the system’s impact on performance; and a subjective facet, for obtaining users’ judgments about the system. Such an approach is needed to test a system against the different types of criteria of interest to sponsors and users, and it is consistent with evolving life-cycle paradigms. Moreover, it leads to the application of different evaluation methods to answer different types of evaluation questions; the methods appropriate to each facet are reviewed.
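One common way to combine facet-level results into an overall assessment is a weighted, multiattribute-style aggregation across criteria. The Python sketch below is a minimal, hypothetical illustration: the facet names follow the abstract, but the weights, scores, and the simple weighted-sum rule are assumptions for illustration only, not the authors' procedure.

# Minimal, hypothetical sketch: aggregating normalized scores from the three
# evaluation facets with a simple weighted sum. Facet names follow the
# abstract; all weights and scores below are illustrative placeholders.

FACET_WEIGHTS = {          # hypothetical importance weights
    "technical": 0.40,     # "looking inside the black box"
    "empirical": 0.35,     # impact on performance
    "subjective": 0.25,    # users' judgments about the system
}

facet_scores = {           # hypothetical normalized scores in [0, 1]
    "technical": 0.80,
    "empirical": 0.65,
    "subjective": 0.90,
}

def overall_score(scores: dict, weights: dict) -> float:
    """Weighted aggregate across facets; weights are renormalized to sum to 1."""
    total = sum(weights.values())
    return sum(scores[f] * weights[f] for f in weights) / total

print(f"Overall evaluation score: {overall_score(facet_scores, FACET_WEIGHTS):.2f}")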

Type: Articles
Copyright © Cambridge University Press 1994

