Hostname: page-component-78c5997874-t5tsf Total loading time: 0 Render date: 2024-11-15T07:06:01.524Z Has data issue: false hasContentIssue false

Principles of induction and approaches to attribute based induction

Published online by Cambridge University Press:  07 July 2009

G. Kalkanis
Affiliation:
Department of Computation, University of Manchester Institute of Science and Technology, Manchester M60 1QD, UK
G. V. Conroy
Affiliation:
Department of Computation, University of Manchester Institute of Science and Technology, Manchester M60 1QD, UK

Abstract

This paper presents a survey of machine induction, studied mainly from the field of artificial intelligence, but also from the fields of pattern recognition and cognitive psychology. The paper consists of two parts: Part I discusses the basic principles and features of the machine induction process; Part II uses these principles and features to review and criticize the major supervised attribute-based induction methods. Attribute-based induction has been chosen because it is the most commonly used inductive approach in the development of expert systems and pattern recognition models.

Type
Research Article
Copyright
Copyright © Cambridge University Press 1991

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

References

Amsterdam, J, 1988a. “Extending the Valiant learning model” In: Proceedings of 5th International Workshop on Machine Learning. Morgan-Kaufmann.Google Scholar
Amsterdam, J, 1988b. “Some philosophical problems with formal learning theory” In: Proceedings AAAl'88. Morgan-Kaufmann.Google Scholar
Anderberg, MR, 1973. Cluster Analysis for Applications. Academic Press, New York, NY.Google Scholar
Angluin, D and Laird, PD, 1988. “Learning from noisy examples” In: Machine Learning 2(4), 343370.CrossRefGoogle Scholar
Angluin, D and Smith, CH, 1983. “Inductive inference: theory and methodsACM Computing Surveys 15(3), 237269.Google Scholar
Arbab, B and Michie, D, 1985. “Generating rules from examples” In: Proceedings 9th IJCAI'85. Morgan-Kaufmann.Google Scholar
Baim, PW, 1988. “A method for attribute selection in inductive learning systemsIEEE Transactions on Pattern Analysis and Machine Intelligence 10(6), 888896.Google Scholar
Bierman, AW and Feldman, JA, 1972. “A survey of results in grammatical inference” In: Watanabe, S (ed.), Frontiers of Pattern Recognition. Academic Press, New York, NY.Google Scholar
Blumer, A, Ehrenfeucht, A, Haussler, D and Warmuth, M, 1986. “Classifying learnable geometric concepts with the Vapnik–Chervonenkis dimension” In: Proceedings 18th ACM Symposium. ACM.Google Scholar
Blumer, A, Ehrenfeucht, A, Haussler, D and Warmuth, M, 1987. “Occam's razorInformation Processing Letters 24(6), 377380.Google Scholar
Blythe, J, 1988. “Constraining search in a hierarchical discriminative learning system” In: Proceedings ECAI'88. Pitman.Google Scholar
Boucheron, S and Sallantin, J, 1988. “Learnability in the presence of noise” In: Proceedings EWSL'88. Pitman.Google Scholar
Breiman, L, Friedman, JH, Olshen, RA and Stone, CJ, 1984. “Classification and regression treesWadsworth Int. Group, Belmont, CA.Google Scholar
Buchanan, BG and Mitchell, TM, 1978. “Model directed learning of production rules” In: Waterman, DA and Hayes-Roth, F (eds.), Pattern Directed Inference Systems. Academic Press, New York, NY.Google Scholar
Bundy, A, Silver, D and Plummer, D, 1985. “An analytical comparison of some rule learning programsArtificial Intelligence 27(2), 137181.Google Scholar
Buntine, R, 1989. “A Critique of the Valiant model” In Proceedings 11th IJCAI-89. Morgan-Kaufmann.Google Scholar
Buntine, W and Stirling, KA, 1989. “Interactive induction” In: Hayes, J and Michie, D (eds.), Machine Intelligence 12. Oxford University Press, Oxford.Google Scholar
Carnap, R, 1950. The Logical Foundations of Probability. Chicago.Google Scholar
Carter, C and Catlett, J, 1987. “Assessing credit card applications using machine learningIEEE Expert 2(3), 7179.Google Scholar
Cestnik, B, Kononenko, I and Bratko, I, 1987. “ASSISTANT'86: a knowledge elicitation tool for sophisticated users” In: Bratko, I and Lavrac, N (eds.), Progress in Machine Learning. Sigma Press, Wilmslow.Google Scholar
Chan, PK, 1989. “Inductive learning with BCT” In: Proceedings 6th International Workshop on Machine Learning. Morgan-Kaufmann.Google Scholar
Chendrowska, J, 1987. “PRISM: an algorithm for inducing modular rulesInternational Journal of Man-Machine Studies 27(4), 349370.CrossRefGoogle Scholar
Chow, CK, 1957. “An optimum character recognition system using decision functionsIRE Transactions on Electronic Computers EC-6 (12 1957), 247254.CrossRefGoogle Scholar
Chrisman, L, 1989. “Extending the Valiant framework to detect incorrect bias” Technical Report, School of Computer Science, Carnegie Mellon University, CMU-CS-89–137.Google Scholar
Clark, P and Niblett, T, 1987. “Induction in noisy domains” In: Progress in Machine Learning. Sigma Press, Wilmslow.Google Scholar
Devijver, PA and Kittler, J, 1982. Pattern Recognition: A Statistical Approach. Prentice-Hall, London.Google Scholar
Dietterich, TG and Michalski, RS, 1985. “Discovering patterns in sequences of eventsArtificial Intelligence 25(2), 187232.CrossRefGoogle Scholar
Drastal, G, Meunier, R and Raatz, S, 1989. “Error correction in constructive induction” In: Proceedings 6th International Workshop on Machine Learning. Morgan-Kaufmann.Google Scholar
Duda, RO and Hart, PE, 1973. Pattern Classification and Scene Analysis. Wiley, Chichester.Google Scholar
Efron, B, 1983. “Estimating the error rate of a prediction ruleJournal of the American Statistical Association 78(382), 316331.Google Scholar
Feigenbaum, EA, 1981. “Expert systems in the 1980s” In: Bond, A (ed.), State of the Art Report on Machine Intelligence. Pergamon-Infotech, Maidenhead.Google Scholar
Fisher, DH, 1987. “Conceptual clustering, learning from examples and inference” In: Proceedings 4th International Conference on Machine Learning.Morgan-Kaufmann.CrossRefGoogle Scholar
Fu, KS, 1974. Syntactic methods in Pattern Recognition. Academic Press, New York. NY.Google Scholar
Gams, M and Lavrac, N, 1983. “Review of five empirical learning systems within a proposed schemata” In: Bratko, I and Lavrac, N (eds.), Progress in Machine Learning. Sigma Press, Wilmslow.Google Scholar
Ganascia, JG, 1987. “Learning with Hubert cubes” In: Bratko, I and Lavrac, N (eds.), Progress in Machine Learning. Sigma Press, Wilmslow.Google Scholar
Glick, N, 1978. “Additive estimators for probabilities of correct classificationPattern Recognition 10(3), 211222.Google Scholar
Gross, KP, 1988. “Incremental multiple concept learning using experiments” In: Proceedings 5th International Workshop on Machine Learning. Morgan-Kaufmann.Google Scholar
Haussler, D, 1987. “Bias, version spaces and the Valiant learning Framework” In: Proceedings 4th International Workshop on Machine Learning. Morgan-Kaufmann.Google Scholar
Haussler, D, 1988a. “Quantifying inductive bias: AI learning algorithms and the Valiant learning frameworkArtificial Intelligence 36(2), 177221.Google Scholar
Haussler, D, 1988b. “Space efficient learning algorithmsUniversity of California, Santa Cruz, UCSC–CRL–88–2.Google Scholar
Hayes-Roth, F, 1974. “Schematic classification problems and their solutionPattern Recognition 6(2), 105113.CrossRefGoogle Scholar
Highleyman, WH, 1962. “The design and analysis of pattern recognition experiments” Bell Systems Technical Journal. 03.CrossRefGoogle Scholar
Iba, W, Wogulis, J and Langley, P, 1988. “Trading off simplicity and coverage in incremental concept learning systems” In: Proceedings 5th International Workshop on Machine Learning. Morgan-Kaufmann.Google Scholar
Jain, AK and Dubes, R, 1978. “Feature definition in pattern recognition with small sample sizePattern Recognition 10(2), 8597.CrossRefGoogle Scholar
Kalkanis, G, 1989. “A proposal to enhance decision tree based inductive systems” MSc Dissertation, University of Manchester Institute of Science and Technology.Google Scholar
Kalkanis, G, 1991. “The application of confidence interval error analysis to the design of decision tree classifiers” (To appear in Pattern Recognition Letters.)Google Scholar
Kanal, LN and Chandrasekaran, B, 1971. “On dimensionality and sample size in statistic pattern recognitionPattern Recognition 3(3), 225234.Google Scholar
Kass, GV, 1980. “An exploratory technique for investigating large quantities of categorical dataApplied Statistics 29(2) 119127.CrossRefGoogle Scholar
Kearns, M, Li, M, Pitt, , Land Valiant, L, 1987. “Recent results on Boolean concept learning” In: Proceedings 4th International Workshop on Machine Learning. Morgan-Kaufmann.Google Scholar
Keynes, JM, 1921. A Treatise on Probability. Macmillan.Google Scholar
Kocabas, S, 1991. “Knowledge representation and learningKnowledge Engineering Review 6(3).Google Scholar
Kodratoff, Y, 1988. Introduction to Machine Learning. Pitman.Google Scholar
Langley, P, Simon, HA and Bradshaw, GL, 1984. “Heuristics for empirical discovery” Technical Report, Carnegie Mellon University Robotics Institute.Google Scholar
Lee, WD and Ray, SR, 1986. “Rule refinement using the probabilistic rule generator” In: Proceedings AAAI'86. Morgan-Kaufmann.Google Scholar
Leith, P, 1984. “Hierarchically structured production rulesThe Computer Journal 26(1), 15.CrossRefGoogle Scholar
Lenat, DB, 1983. “The role of heuristics in learning by discovery: three case studies” In: Michalski, RS, Carbonell, JG and Mitchell, TM (eds.), Machine Learning: An Artificial Intelligence Approach (vol. I) Tioga.Google Scholar
Lewis, PW, 1962. “The characteristic selection problem in recognition systems” IRE Transactions on Information Theory, IT–8 (02 1962), 171178.Google Scholar
Matheus, CJ and Rendell, LA, 1989. “Constructive induction on decision trees” In: Proceedings 11th IJCAI'89. Morgan-Kaufmann.Google Scholar
Medin, DI and Wattenmaker, WD, 1986. “Categories, cohesiveness, theories and cognitive archaeology” In: Concepts Reconsidered: The Ecological and Intellectual Bases of Categories. Neisser, U (ed.). Cambridge University Press.Google Scholar
Mervis, CB and Rosch, E, 1981. “Categorisation and natural objectsAnnual Review of Psychology, 32(2), 89115.CrossRefGoogle Scholar
Michalski, RS, 1973. “AQVAL/1—computer implementation of a variable-valued logic system and the application to pattern recognition” In: Proceedings 1st International Joint Conference on Pattern Recognition.Washington.Google Scholar
Michalski, RS, 1983. “A theory and methodology of inductive learning” In: Michalski, RS, Carbonell, JG and Mitchell, TM (eds.), Machine Learning: An Artificial Intelligence Approach (Vol I) Tioga.Google Scholar
Michalski, RS and Chilauski, RL, 1980. “Learning by being told and learning from examples: an experimental comparison of the two methods of knowledge acquisition in the context of developing an expert system for soyabean disease diagnosisInternational Journal of Policy Analysis and Information Systems 4(2), 125161.Google Scholar
Michalski, RS and Stepp, RE, 1983. “Learning from observation: conceptual clustering” In: Michalski, RS, Carbonell, JG and Mitchell, TM (eds.), Machine Learning: An Artificial Intelligence Approach (vol. I). Tioga.CrossRefGoogle Scholar
Michalski, RS, Carbonell, JG and Mitchell, TM, 1983. Machine Learning: An Artificial Intelligence Approach (vol. I) Tioga.CrossRefGoogle Scholar
Michalski, RS, Carbonell, JG and Mitchell, TM, 1986a. Machine Learning: An Artificial Intelligence Approach (vol. II) Morgan-Kaufman.Google Scholar
Michalski, RS, Mozetic, I, Hong, N and Lavrac, N, 1986b. “The multi-purpose incremental learning system AQ15 and its testing application to three medical domains” In: Proceedings AAAI'86. Morgan-Kaufmann.Google Scholar
Michie, D, 1988. “Machine learning in the next five years” In: Proceedings EWSL'88. Pitman.Google Scholar
Milosavljevic, A, 1988. “Learning in the presence of background knowledge” Technical Report, Univ. Santa Cruz, California, UCSR-CRL–87–27.Google Scholar
Mingers, J, 1989a. “An empirical comparison of pruning methods for decision tree inductionMachine Learning 3(3), 319342.CrossRefGoogle Scholar
Mingers, J, 1989b. “An empirical comparison of pruning methods for decision tree inductionMachine Learning 4(2), 227248.Google Scholar
Mitchell, TM, 1980. “The need for biases in learning generalisations” Technical Report CBM-TR–117, Department of Computer Science, Rutgers University.Google Scholar
Mitchell, TM, 1982. “Generalisation as search” In: Webber, BL and Nilsson, NJ (eds.), Readings in Artificial Intelligence. Troga Publishing Company.Google Scholar
Mortimer, H, 1988. The Logic of Induction Ellis Horwood.Google Scholar
Muggleton, S, 1987. “Structuring knowledge by asking questions” In: Bratko, I and Lavrac, N (eds.), Progress in Machine Learning. Sigma Press, Wilmslow.Google Scholar
Muggleton, S, 1988. “A strategy for constructing new predicates in first order logic” In: Proceedings EWSL'88. Pitman.Google Scholar
Muggleton, S, 1991. “Inductive logic programmingNew Generation Computing 8(4), 295318.CrossRefGoogle Scholar
Murray, KS, 1987. “Multiple convergence: an approach to disjunctive concept acquisition” In: Proceedings 10th IJCAI'87. Morgan-Kaufmann.Google Scholar
Neyman, J, 1957. “Inductive behaviour as a basic concept of philosophy and science” Review of the International Statistical Institute 25.Google Scholar
Nilsson, N, 1965. Learning Machines: Foundations of Trainable Pattern Classifying Systems. McGraw-Hill.Google Scholar
Pagallo, G, 1989. “Learning DNF by decision trees” In: Proceedings 11th IJCAI-89. Morgan-Kaufmann.Google Scholar
Pagallo, G and Haussler, D, 1988. “Feature discovery in empirical learningUniversity of California at Santa Cruz, Technical Report no. UCSC-CRL–88–08.Google Scholar
Payne, HJ and Meisel, WS, 1977. “An algorithm for constructing optimal binary decision trees”. IEEE Transactions on Computers C–26 (a) 09, 905916.CrossRefGoogle Scholar
Pearl, J, 1985. “Learning hidden causes from empirical data” In: Proceedings 9th IJCAI'85. Morgan-Kaufmann.Google Scholar
Plotkin, GD, 1971. “A further note on inductive generalisation” In: Machine Intelligence 6. Meltzer, B and Michie, D (eds.). Elsevier, New York.Google Scholar
Quinlan, JR, 1986. “Induction of decision trees” In: Machine Learning 7(1), 81106.Google Scholar
Quinlan, JR, 1987a. “Generating production rules from decision trees” In: Proceedings 10th IJCAI'87. Morgan-Kaufmann.Google Scholar
Quinlan, JR, 1987b. “Simplifying decision treesInternational journal of Man-Machine Studies 27(3), 221234.CrossRefGoogle Scholar
Quinlan, JR, 1989a. “Unknown attribute values in induction” In: Proceedings 6th International Workshop on Machine Learning. Morgan-Kaufmann.Google Scholar
Quinlan, JR, 1989b. “Learning relations: comparison of a symbolic and a connectionist approach” Technical Report 346, University of Sydney.Google Scholar
Quinlan, JR, Compton, PJ, Horn, KA and Lazarus, L, 1986. “Inductive knowledge acquisition: a case study” In: Proceedings Second Australian Conference on Applications of Expert Systems.Sydney.Google Scholar
Rappaport, AT and Gaines, BR, 1990. “Integrated knowledge base building environmentsKnowledge Acquisition 2(1), 5172.Google Scholar
Reinke, RE and Michalski, RS, 1988. Incremental learning of concept descriptions: a method and experimental results” In: Hay, JE and Michie, D (eds.), Machine Intelligence 11. Oxford Press.Google Scholar
Rendell, L, 1986. “A general framework for induction and a study of selective induction” In: Machine Learning 1(2), 177226.Google Scholar
Rendell, L, 1988. “Learning hard concepts through constructive induction: framework and rationale” Technical Report, Univ. of Illinois at Urbana Champaign, UIUCDS-R- 88–1426.Google Scholar
Rendell, L, Seshu, R and Cheng, D, 1987. “More robust concept learning using dynamically variable bias management” In: Proceedings 10th IJCAI'87. Morgan-Kaufmann.Google Scholar
Schlimmer, JC and Fisher, D, 1986. “A case study of incremental concept induction” In: Proceedings AAAI'86. Morgan-Kaufmann.Google Scholar
Schlimmer, JC, 1987. “Incremental adjustment of representations for learning” In: Proceedings 4th International Workshop on Machine Learning. Morgan-Kaufmann.Google Scholar
Sebestyen, GS, 1962. “Pattern recognition by an adaptive process of sample set constructionIRE Transactions on Information Theory IT–8 09, 8291.Google Scholar
Shapiro, AD, 1987. Structured Induction For Expert Systems Turing Institute Press, Addison-Wesley.Google Scholar
Thornton, C, 1987. “Hypercuboid-formation behaviour of two learning algorithms” In: Proceedings 10th IJCAl'87. Morgan-Kaufmann.Google Scholar
Toussaint, GT, 1974. “Bibliography on estimation of misclassificationIEEE Transactions on Information Theory IT–20 (4) 07, 472479.Google Scholar
Utgoff, PE, 1986. Machine Learning of Inductive Bias Kluwer.CrossRefGoogle Scholar
Utgoff, PE, 1988. “ID5: an incremental IDS” In: Proceedings 5th International Workshop on Machine Learning. Morgan-Kaufmann.Google Scholar
Valiant, L, 1984. “A theory of the learnableCommunications of the ACM 27(11), 11341142.CrossRefGoogle Scholar
Vapnik, VN and Chervonenkis, AY, 1971. “On the uniform convergence of realative frequencies of events to their probabilitiesTheory of Probability and its Applications 16(2), 264280.Google Scholar
Vapnik, VN, 1989. “Inductive principles for the search for empirical dependencies (methods based on weak convergence of probability measures)” In: Proceedings Second Annual Workshop on Computational Learning Theory. Santa Cruz. CA.Morgan-Kaufmann.Google Scholar
Watanabe, S, 1972. “Pattern recognition as information compression” In: Watanabe, S (ed.), Frontiers of Pattern Recognition. Academic Press.Google Scholar
Watanabe, S, 1985. Pattern Recognition: Human and Mechanical Wiley.Google Scholar
Wilkins, DC and Buchanan, BG, 1986. “On debugging rule-sets when reasoning under uncertainty” In: Proceedings AAAI'86. Morgan-Kaufmann.Google Scholar
Winston, PH, 1975. The Psychology of Computer Vision McGraw-Hill.Google Scholar
Wirth, J and Catlett, J, 1988. “Experiments on the costs and benefits of windowing in ID3” In: Proceedings 5th International Workshop on Machine Learning. Morgan-Kaufmann.Google Scholar