Constrained Classification: The Use of A Priori Information in Cluster Analysis

Wayne S. DeSarbo; Vijay Mahajan

doi:10.1007/BF02294172

Published online by Cambridge University Press: 01 January 2025

Wayne S. DeSarbo*, Bell Laboratories
Vijay Mahajan, Southern Methodist University
* Requests for reprints should be sent to Wayne DeSarbo, Bell Laboratories, Room 2C-256, 600 Mountain Avenue, Murray Hill, NJ 07974.

Abstract

In many classification problems, one often possesses external and/or internal information concerning the objects or units to be analyzed that makes it appropriate to impose constraints on the set of allowable classifications and their characteristics. CONCLUS, or CONstrained CLUStering, is a new methodology devised to perform constrained classification in either an overlapping or nonoverlapping (hierarchical or nonhierarchical) manner. This paper first reviews the related classification literature and then discusses the use of constraints in clustering problems. The CONCLUS model and algorithm are described in detail, as is their flexibility for use in various applications. Monte Carlo results are presented for two synthetic data sets, with discussion of their implications. CONCLUS is then illustrated on a sales territory design problem in which the objects classified are various Forbes-500 companies. Finally, the discussion section highlights the main contribution of the paper and offers some areas for future research.

Keywords

Cluster Analysis; Constrained Optimization

Information

Type: Original Paper
Information: Psychometrika, Volume 49, Issue 2, June 1984, pp. 187–215

DOI: https://doi.org/10.1007/BF02294172
Copyright: Copyright © 1984 The Psychometric Society

Footnotes

We wish to thank C. Mallows and J. D. Carroll for some helpful technical discussion and L. Clark and D. Art for their valuable computer assistance. We also wish to thank H. Pollak, R. Gnanadesikan, and J. Kettenring for their thorough reviews of a previous draft of this paper. Finally, we acknowledge helpful comments of the editor and two anonymous reviewers.

