Some Applications of Graph Theory to Clustering

Lawrence J. Hubert

doi:10.1007/BF02291704

Published online by Cambridge University Press: 01 January 2025

Affiliation: Lawrence J. Hubert, University of Wisconsin


Abstract

This paper attempts to review and expand upon the relationship between graph theory and the clustering of a set of objects. Several graph-theoretic criteria are proposed for use within a general clustering paradigm as a means of developing procedures “in between” the extremes of complete-link and single-link hierarchical partitioning; these same ideas are then extended to include the more general problem of constructing subsets of objects with overlap. Finally, a number of related topics are surveyed within the general context of reinterpreting and justifying methods of clustering either through standard concepts in graph theory or their simple extensions.
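
As an illustration only, and not taken from the paper itself, the two extremes named in the abstract can be read off a threshold graph in which two objects are joined whenever their dissimilarity is at most a level c. Under that standard reading, the single-link partition at level c consists of the connected components of the graph, while any complete-link cluster formed at or below level c must be a complete subgraph (clique). The sketch below assumes a small, made-up dissimilarity matrix `D`; the function names `threshold_graph`, `connected_components`, and `is_clique` are hypothetical helpers chosen for this example.

```python
from itertools import combinations


def threshold_graph(d, c):
    """Return adjacency sets: objects i and j are joined when d[i][j] <= c."""
    n = len(d)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if d[i][j] <= c:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def connected_components(adj):
    """Single-link reading: the components of the threshold graph."""
    seen, parts = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            stack.extend(adj[v] - comp)
        seen |= comp
        parts.append(sorted(comp))
    return parts


def is_clique(adj, subset):
    """Complete-link reading: every pair in the subset must be joined."""
    return all(v in adj[u] for u, v in combinations(subset, 2))


# Hypothetical symmetric dissimilarities among four objects (illustrative data).
D = [
    [0, 1, 4, 6],
    [1, 0, 2, 5],
    [4, 2, 0, 3],
    [6, 5, 3, 0],
]

G = threshold_graph(D, 2)
components = connected_components(G)
```

In this made-up example, objects 0, 1, and 2 fall into one connected component at level 2 because they are chained through object 1, yet the same three objects do not form a clique, since objects 0 and 2 are not directly joined. Criteria lying between these two requirements (connected but not necessarily complete) are the kind of intermediate condition the abstract alludes to.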

Information

Type: Original Paper
Information: Psychometrika, Volume 39, Issue 3, September 1974, pp. 283–309

DOI: https://doi.org/10.1007/BF02291704
Copyright: Copyright © 1974 The Psychometric Society
