Principles of induction and approaches to attribute based induction

G. Kalkanis; G. V. Conroy

doi:10.1017/S0269888900005920

Principles of induction and approaches to attribute based induction

Published online by Cambridge University Press: 07 July 2009

G. Kalkanis and

G. V. Conroy

Show author details

G. Kalkanis: Affiliation:
Department of Computation, University of Manchester Institute of Science and Technology, Manchester M60 1QD, UK
G. V. Conroy: Affiliation:
Department of Computation, University of Manchester Institute of Science and Technology, Manchester M60 1QD, UK

Article contents

Abstract
References

Get access

Rights & Permissions

Abstract

This paper presents a survey of machine induction, studied mainly from the field of artificial intelligence, but also from the fields of pattern recognition and cognitive psychology. The paper consists of two parts: Part I discusses the basic principles and features of the machine induction process; Part II uses these principles and features to review and criticize the major supervised attribute-based induction methods. Attribute-based induction has been chosen because it is the most commonly used inductive approach in the development of expert systems and pattern recognition models.

Information

Type: Research Article
Information: The Knowledge Engineering Review , Volume 6 , Issue 4 , December 1991 , pp. 307 - 333

DOI: https://doi.org/10.1017/S0269888900005920 [Opens in a new window]
Copyright: Copyright © Cambridge University Press 1991

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Article purchase

Temporarily unavailable

References

Amsterdam, J, 1988a. “Extending the Valiant learning model” In: Proceedings of 5th International Workshop on Machine Learning. Morgan-Kaufmann.Google Scholar

Amsterdam, J, 1988b. “Some philosophical problems with formal learning theory” In: Proceedings AAAl'88. Morgan-Kaufmann.Google Scholar

Anderberg, MR, 1973. Cluster Analysis for Applications. Academic Press, New York, NY.Google Scholar

Angluin, D and Laird, PD, 1988. “Learning from noisy examples” In: Machine Learning 2(4), 343–370.CrossRef Google Scholar

Angluin, D and Smith, CH, 1983. “Inductive inference: theory and methods” ACM Computing Surveys 15(3), 237–269.Google Scholar

Arbab, B and Michie, D, 1985. “Generating rules from examples” In: Proceedings 9th IJCAI'85. Morgan-Kaufmann.Google Scholar

Baim, PW, 1988. “A method for attribute selection in inductive learning systems” IEEE Transactions on Pattern Analysis and Machine Intelligence 10(6), 888–896.Google Scholar

Bierman, AW and Feldman, JA, 1972. “A survey of results in grammatical inference” In: Watanabe, S (ed.), Frontiers of Pattern Recognition. Academic Press, New York, NY.Google Scholar

Blumer, A, Ehrenfeucht, A, Haussler, D and Warmuth, M, 1986. “Classifying learnable geometric concepts with the Vapnik–Chervonenkis dimension” In: Proceedings 18th ACM Symposium. ACM.Google Scholar

Blumer, A, Ehrenfeucht, A, Haussler, D and Warmuth, M, 1987. “Occam's razor” Information Processing Letters 24(6), 377–380.Google Scholar

Blythe, J, 1988. “Constraining search in a hierarchical discriminative learning system” In: Proceedings ECAI'88. Pitman.Google Scholar

Boucheron, S and Sallantin, J, 1988. “Learnability in the presence of noise” In: Proceedings EWSL'88. Pitman.Google Scholar

Breiman, L, Friedman, JH, Olshen, RA and Stone, CJ, 1984. “Classification and regression trees” Wadsworth Int. Group, Belmont, CA.Google Scholar

Buchanan, BG and Mitchell, TM, 1978. “Model directed learning of production rules” In: Waterman, DA and Hayes-Roth, F (eds.), Pattern Directed Inference Systems. Academic Press, New York, NY.Google Scholar

Bundy, A, Silver, D and Plummer, D, 1985. “An analytical comparison of some rule learning programs” Artificial Intelligence 27(2), 137–181.Google Scholar

Buntine, R, 1989. “A Critique of the Valiant model” In Proceedings 11th IJCAI-89. Morgan-Kaufmann.Google Scholar

Buntine, W and Stirling, KA, 1989. “Interactive induction” In: Hayes, J and Michie, D (eds.), Machine Intelligence 12. Oxford University Press, Oxford.Google Scholar

Carnap, R, 1950. The Logical Foundations of Probability. Chicago.Google Scholar

Carter, C and Catlett, J, 1987. “Assessing credit card applications using machine learning” IEEE Expert 2(3), 71–79.Google Scholar

Cestnik, B, Kononenko, I and Bratko, I, 1987. “ASSISTANT'86: a knowledge elicitation tool for sophisticated users” In: Bratko, I and Lavrac, N (eds.), Progress in Machine Learning. Sigma Press, Wilmslow.Google Scholar

Chan, PK, 1989. “Inductive learning with BCT” In: Proceedings 6th International Workshop on Machine Learning. Morgan-Kaufmann.Google Scholar

Chendrowska, J, 1987. “PRISM: an algorithm for inducing modular rules” International Journal of Man-Machine Studies 27(4), 349–370.CrossRef Google Scholar

Chow, CK, 1957. “An optimum character recognition system using decision functions” IRE Transactions on Electronic Computers EC-6 (12 1957), 247–254.CrossRef Google Scholar

Chrisman, L, 1989. “Extending the Valiant framework to detect incorrect bias” Technical Report, School of Computer Science, Carnegie Mellon University, CMU-CS-89–137.Google Scholar

Clark, P and Niblett, T, 1987. “Induction in noisy domains” In: Progress in Machine Learning. Sigma Press, Wilmslow.Google Scholar

Devijver, PA and Kittler, J, 1982. Pattern Recognition: A Statistical Approach. Prentice-Hall, London.Google Scholar

Dietterich, TG and Michalski, RS, 1985. “Discovering patterns in sequences of events” Artificial Intelligence 25(2), 187–232.CrossRef Google Scholar

Drastal, G, Meunier, R and Raatz, S, 1989. “Error correction in constructive induction” In: Proceedings 6th International Workshop on Machine Learning. Morgan-Kaufmann.Google Scholar

Duda, RO and Hart, PE, 1973. Pattern Classification and Scene Analysis. Wiley, Chichester.Google Scholar

Efron, B, 1983. “Estimating the error rate of a prediction rule” Journal of the American Statistical Association 78(382), 316–331.Google Scholar

Feigenbaum, EA, 1981. “Expert systems in the 1980s” In: Bond, A (ed.), State of the Art Report on Machine Intelligence. Pergamon-Infotech, Maidenhead.Google Scholar

Fisher, DH, 1987. “Conceptual clustering, learning from examples and inference” In: Proceedings 4th International Conference on Machine Learning.Morgan-Kaufmann.CrossRef Google Scholar

Fu, KS, 1974. Syntactic methods in Pattern Recognition. Academic Press, New York. NY.Google Scholar

Gams, M and Lavrac, N, 1983. “Review of five empirical learning systems within a proposed schemata” In: Bratko, I and Lavrac, N (eds.), Progress in Machine Learning. Sigma Press, Wilmslow.Google Scholar

Ganascia, JG, 1987. “Learning with Hubert cubes” In: Bratko, I and Lavrac, N (eds.), Progress in Machine Learning. Sigma Press, Wilmslow.Google Scholar

Glick, N, 1978. “Additive estimators for probabilities of correct classification” Pattern Recognition 10(3), 211–222.Google Scholar

Gross, KP, 1988. “Incremental multiple concept learning using experiments” In: Proceedings 5th International Workshop on Machine Learning. Morgan-Kaufmann.Google Scholar

Haussler, D, 1987. “Bias, version spaces and the Valiant learning Framework” In: Proceedings 4th International Workshop on Machine Learning. Morgan-Kaufmann.Google Scholar

Haussler, D, 1988a. “Quantifying inductive bias: AI learning algorithms and the Valiant learning framework” Artificial Intelligence 36(2), 177–221.Google Scholar

Haussler, D, 1988b. “Space efficient learning algorithms” University of California, Santa Cruz, UCSC–CRL–88–2.Google Scholar

Hayes-Roth, F, 1974. “Schematic classification problems and their solution” Pattern Recognition 6(2), 105–113.CrossRef Google Scholar

Highleyman, WH, 1962. “The design and analysis of pattern recognition experiments” Bell Systems Technical Journal. 03.CrossRef Google Scholar

Iba, W, Wogulis, J and Langley, P, 1988. “Trading off simplicity and coverage in incremental concept learning systems” In: Proceedings 5th International Workshop on Machine Learning. Morgan-Kaufmann.Google Scholar

Jain, AK and Dubes, R, 1978. “Feature definition in pattern recognition with small sample size” Pattern Recognition 10(2), 85–97.CrossRef Google Scholar

Kalkanis, G, 1989. “A proposal to enhance decision tree based inductive systems” MSc Dissertation, University of Manchester Institute of Science and Technology.Google Scholar

Kalkanis, G, 1991. “The application of confidence interval error analysis to the design of decision tree classifiers” (To appear in Pattern Recognition Letters.)Google Scholar

Kanal, LN and Chandrasekaran, B, 1971. “On dimensionality and sample size in statistic pattern recognition” Pattern Recognition 3(3), 225–234.Google Scholar

Kass, GV, 1980. “An exploratory technique for investigating large quantities of categorical data” Applied Statistics 29(2) 119–127.CrossRef Google Scholar

Kearns, M, Li, M, Pitt, , Land Valiant, L, 1987. “Recent results on Boolean concept learning” In: Proceedings 4th International Workshop on Machine Learning. Morgan-Kaufmann.Google Scholar

Keynes, JM, 1921. A Treatise on Probability. Macmillan.Google Scholar

Kocabas, S, 1991. “Knowledge representation and learning” Knowledge Engineering Review 6(3).Google Scholar

Kodratoff, Y, 1988. Introduction to Machine Learning. Pitman.Google Scholar

Langley, P, Simon, HA and Bradshaw, GL, 1984. “Heuristics for empirical discovery” Technical Report, Carnegie Mellon University Robotics Institute.Google Scholar

Lee, WD and Ray, SR, 1986. “Rule refinement using the probabilistic rule generator” In: Proceedings AAAI'86. Morgan-Kaufmann.Google Scholar

Leith, P, 1984. “Hierarchically structured production rules” The Computer Journal 26(1), 1–5.CrossRef Google Scholar

Lenat, DB, 1983. “The role of heuristics in learning by discovery: three case studies” In: Michalski, RS, Carbonell, JG and Mitchell, TM (eds.), Machine Learning: An Artificial Intelligence Approach (vol. I) Tioga.Google Scholar

Lewis, PW, 1962. “The characteristic selection problem in recognition systems” IRE Transactions on Information Theory, IT–8 (02 1962), 171–178.Google Scholar

Matheus, CJ and Rendell, LA, 1989. “Constructive induction on decision trees” In: Proceedings 11th IJCAI'89. Morgan-Kaufmann.Google Scholar

Medin, DI and Wattenmaker, WD, 1986. “Categories, cohesiveness, theories and cognitive archaeology” In: Concepts Reconsidered: The Ecological and Intellectual Bases of Categories. Neisser, U (ed.). Cambridge University Press.Google Scholar

Mervis, CB and Rosch, E, 1981. “Categorisation and natural objects” Annual Review of Psychology, 32(2), 89–115.CrossRef Google Scholar

Michalski, RS, 1973. “AQVAL/1—computer implementation of a variable-valued logic system and the application to pattern recognition” In: Proceedings 1st International Joint Conference on Pattern Recognition.Washington.Google Scholar

Michalski, RS, 1983. “A theory and methodology of inductive learning” In: Michalski, RS, Carbonell, JG and Mitchell, TM (eds.), Machine Learning: An Artificial Intelligence Approach (Vol I) Tioga.Google Scholar

Michalski, RS and Chilauski, RL, 1980. “Learning by being told and learning from examples: an experimental comparison of the two methods of knowledge acquisition in the context of developing an expert system for soyabean disease diagnosis” International Journal of Policy Analysis and Information Systems 4(2), 125–161.Google Scholar

Michalski, RS and Stepp, RE, 1983. “Learning from observation: conceptual clustering” In: Michalski, RS, Carbonell, JG and Mitchell, TM (eds.), Machine Learning: An Artificial Intelligence Approach (vol. I). Tioga.CrossRef Google Scholar

Michalski, RS, Carbonell, JG and Mitchell, TM, 1983. Machine Learning: An Artificial Intelligence Approach (vol. I) Tioga.CrossRef Google Scholar

Michalski, RS, Carbonell, JG and Mitchell, TM, 1986a. Machine Learning: An Artificial Intelligence Approach (vol. II) Morgan-Kaufman.Google Scholar

Michalski, RS, Mozetic, I, Hong, N and Lavrac, N, 1986b. “The multi-purpose incremental learning system AQ15 and its testing application to three medical domains” In: Proceedings AAAI'86. Morgan-Kaufmann.Google Scholar

Michie, D, 1988. “Machine learning in the next five years” In: Proceedings EWSL'88. Pitman.Google Scholar

Milosavljevic, A, 1988. “Learning in the presence of background knowledge” Technical Report, Univ. Santa Cruz, California, UCSR-CRL–87–27.Google Scholar

Mingers, J, 1989a. “An empirical comparison of pruning methods for decision tree induction” Machine Learning 3(3), 319–342.CrossRef Google Scholar

Mingers, J, 1989b. “An empirical comparison of pruning methods for decision tree induction” Machine Learning 4(2), 227–248.Google Scholar

Mitchell, TM, 1980. “The need for biases in learning generalisations” Technical Report CBM-TR–117, Department of Computer Science, Rutgers University.Google Scholar

Mitchell, TM, 1982. “Generalisation as search” In: Webber, BL and Nilsson, NJ (eds.), Readings in Artificial Intelligence. Troga Publishing Company.Google Scholar

Mortimer, H, 1988. The Logic of Induction Ellis Horwood.Google Scholar

Muggleton, S, 1987. “Structuring knowledge by asking questions” In: Bratko, I and Lavrac, N (eds.), Progress in Machine Learning. Sigma Press, Wilmslow.Google Scholar

Muggleton, S, 1988. “A strategy for constructing new predicates in first order logic” In: Proceedings EWSL'88. Pitman.Google Scholar

Muggleton, S, 1991. “Inductive logic programming” New Generation Computing 8(4), 295–318.CrossRef Google Scholar

Murray, KS, 1987. “Multiple convergence: an approach to disjunctive concept acquisition” In: Proceedings 10th IJCAI'87. Morgan-Kaufmann.Google Scholar

Neyman, J, 1957. “Inductive behaviour as a basic concept of philosophy and science” Review of the International Statistical Institute 25.Google Scholar

Nilsson, N, 1965. Learning Machines: Foundations of Trainable Pattern Classifying Systems. McGraw-Hill.Google Scholar

Pagallo, G, 1989. “Learning DNF by decision trees” In: Proceedings 11th IJCAI-89. Morgan-Kaufmann.Google Scholar

Pagallo, G and Haussler, D, 1988. “Feature discovery in empirical learning” University of California at Santa Cruz, Technical Report no. UCSC-CRL–88–08.Google Scholar

Payne, HJ and Meisel, WS, 1977. “An algorithm for constructing optimal binary decision trees”. IEEE Transactions on Computers C–26 (a) 09, 905–916.CrossRef Google Scholar

Pearl, J, 1985. “Learning hidden causes from empirical data” In: Proceedings 9th IJCAI'85. Morgan-Kaufmann.Google Scholar

Plotkin, GD, 1971. “A further note on inductive generalisation” In: Machine Intelligence 6. Meltzer, B and Michie, D (eds.). Elsevier, New York.Google Scholar

Quinlan, JR, 1986. “Induction of decision trees” In: Machine Learning 7(1), 81–106.Google Scholar

Quinlan, JR, 1987a. “Generating production rules from decision trees” In: Proceedings 10th IJCAI'87. Morgan-Kaufmann.Google Scholar

Quinlan, JR, 1987b. “Simplifying decision trees” International journal of Man-Machine Studies 27(3), 221–234.CrossRef Google Scholar

Quinlan, JR, 1989a. “Unknown attribute values in induction” In: Proceedings 6th International Workshop on Machine Learning. Morgan-Kaufmann.Google Scholar

Quinlan, JR, 1989b. “Learning relations: comparison of a symbolic and a connectionist approach” Technical Report 346, University of Sydney.Google Scholar

Quinlan, JR, Compton, PJ, Horn, KA and Lazarus, L, 1986. “Inductive knowledge acquisition: a case study” In: Proceedings Second Australian Conference on Applications of Expert Systems.Sydney.Google Scholar

Rappaport, AT and Gaines, BR, 1990. “Integrated knowledge base building environments” Knowledge Acquisition 2(1), 51–72.Google Scholar

Reinke, RE and Michalski, RS, 1988. Incremental learning of concept descriptions: a method and experimental results” In: Hay, JE and Michie, D (eds.), Machine Intelligence 11. Oxford Press.Google Scholar

Rendell, L, 1986. “A general framework for induction and a study of selective induction” In: Machine Learning 1(2), 177–226.Google Scholar

Rendell, L, 1988. “Learning hard concepts through constructive induction: framework and rationale” Technical Report, Univ. of Illinois at Urbana Champaign, UIUCDS-R- 88–1426.Google Scholar

Rendell, L, Seshu, R and Cheng, D, 1987. “More robust concept learning using dynamically variable bias management” In: Proceedings 10th IJCAI'87. Morgan-Kaufmann.Google Scholar

Schlimmer, JC and Fisher, D, 1986. “A case study of incremental concept induction” In: Proceedings AAAI'86. Morgan-Kaufmann.Google Scholar

Schlimmer, JC, 1987. “Incremental adjustment of representations for learning” In: Proceedings 4th International Workshop on Machine Learning. Morgan-Kaufmann.Google Scholar

Sebestyen, GS, 1962. “Pattern recognition by an adaptive process of sample set construction” IRE Transactions on Information Theory IT–8 09, 82–91.Google Scholar

Shapiro, AD, 1987. Structured Induction For Expert Systems Turing Institute Press, Addison-Wesley.Google Scholar

Thornton, C, 1987. “Hypercuboid-formation behaviour of two learning algorithms” In: Proceedings 10th IJCAl'87. Morgan-Kaufmann.Google Scholar

Toussaint, GT, 1974. “Bibliography on estimation of misclassification” IEEE Transactions on Information Theory IT–20 (4) 07, 472–479.Google Scholar

Utgoff, PE, 1986. Machine Learning of Inductive Bias Kluwer.CrossRef Google Scholar

Utgoff, PE, 1988. “ID5: an incremental IDS” In: Proceedings 5th International Workshop on Machine Learning. Morgan-Kaufmann.Google Scholar

Valiant, L, 1984. “A theory of the learnable” Communications of the ACM 27(11), 1134–1142.CrossRef Google Scholar

Vapnik, VN and Chervonenkis, AY, 1971. “On the uniform convergence of realative frequencies of events to their probabilities” Theory of Probability and its Applications 16(2), 264–280.Google Scholar

Vapnik, VN, 1989. “Inductive principles for the search for empirical dependencies (methods based on weak convergence of probability measures)” In: Proceedings Second Annual Workshop on Computational Learning Theory. Santa Cruz. CA.Morgan-Kaufmann.Google Scholar

Watanabe, S, 1972. “Pattern recognition as information compression” In: Watanabe, S (ed.), Frontiers of Pattern Recognition. Academic Press.Google Scholar

Watanabe, S, 1985. Pattern Recognition: Human and Mechanical Wiley.Google Scholar

Wilkins, DC and Buchanan, BG, 1986. “On debugging rule-sets when reasoning under uncertainty” In: Proceedings AAAI'86. Morgan-Kaufmann.Google Scholar

Winston, PH, 1975. The Psychology of Computer Vision McGraw-Hill.Google Scholar

Wirth, J and Catlett, J, 1988. “Experiments on the costs and benefits of windowing in ID3” In: Proceedings 5th International Workshop on Machine Learning. Morgan-Kaufmann.Google Scholar

Article contents

Principles of induction and approaches to attribute based induction

Abstract

Information

Access options

Article purchase

Temporarily unavailable

References

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests