
Some Applications of Graph Theory to Clustering

Published online by Cambridge University Press:  01 January 2025

Lawrence J. Hubert
Affiliation:
University of Wisconsin

Abstract

This paper attempts to review and expand upon the relationship between graph theory and the clustering of a set of objects. Several graph-theoretic criteria are proposed for use within a general clustering paradigm as a means of developing procedures “in between” the extremes of complete-link and single-link hierarchical partitioning; these same ideas are then extended to include the more general problem of constructing subsets of objects with overlap. Finally, a number of related topics are surveyed within the general context of reinterpreting and justifying methods of clustering either through standard concepts in graph theory or their simple extensions.

Type
Original Paper
Copyright
Copyright © 1974 The Psychometric Society

