A Monte Carlo Study of Thirty Internal Criterion Measures for Cluster Analysis

Glenn W. Milligan

doi:10.1007/BF02293899

Published online by Cambridge University Press: 01 January 2025

Affiliation: The Ohio State University
Requests for reprints should be sent to Glenn W. Milligan, Faculty of Management Sciences, 356 Hagerty Hall, The Ohio State University, Columbus, Ohio 43210.

Abstract

A Monte Carlo evaluation of thirty internal criterion measures for cluster analysis was conducted. Artificial data sets were constructed with clusters that exhibited the properties of internal cohesion and external isolation. The data sets were analyzed by four hierarchical clustering methods. The resulting values of the internal criteria were compared with two external criterion indices, which measured how well the algorithms recovered the correct cluster structure. The results indicated that a subset of the internal criterion measures appear to be valid indices of correct cluster recovery. Indices from this subset could form the basis of a permutation test for the existence of cluster structure or a clustering algorithm.
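
The design described in the abstract can be illustrated with a minimal sketch. It builds one small artificial data set with cohesive, isolated clusters, scores a correct and a degraded partition with one internal and one external criterion, and runs a simple label-shuffling permutation test. Several choices here are assumptions made for illustration only: the abstract does not name the thirty internal measures or the two external indices, so the point-biserial correlation (internal) and the Rand index (external) are stand-ins; no hierarchical clustering is run, and the degraded partition is produced by random relabeling; the permutation scheme is one plausible reading, not necessarily the one the paper has in mind. All function names are hypothetical.

```python
import itertools
import math
import random
import statistics


def make_clusters(centers, n_per_cluster, spread, rng):
    """Draw 2-D Gaussian points around each center; return (points, labels)."""
    points, labels = [], []
    for k, (cx, cy) in enumerate(centers):
        for _ in range(n_per_cluster):
            points.append((rng.gauss(cx, spread), rng.gauss(cy, spread)))
            labels.append(k)
    return points, labels


def rand_index(labels_a, labels_b):
    """External criterion (assumed): share of point pairs that the two
    partitions treat the same way (together in both, or apart in both)."""
    pairs = list(itertools.combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def point_biserial(points, labels):
    """Internal criterion (assumed): Pearson correlation between pairwise
    distance and a 0/1 flag that is 1 when the pair spans two clusters."""
    dists, between = [], []
    for i, j in itertools.combinations(range(len(points)), 2):
        dists.append(math.dist(points[i], points[j]))
        between.append(1.0 if labels[i] != labels[j] else 0.0)
    md, mb = statistics.fmean(dists), statistics.fmean(between)
    sxy = sum((d - md) * (b - mb) for d, b in zip(dists, between))
    sxx = sum((d - md) ** 2 for d in dists)
    syy = sum((b - mb) ** 2 for b in between)
    return sxy / math.sqrt(sxx * syy)


def perturb(labels, n_changes, n_clusters, rng):
    """Degrade a partition by moving n_changes points to a different cluster."""
    noisy = list(labels)
    for idx in rng.sample(range(len(noisy)), n_changes):
        noisy[idx] = rng.choice([k for k in range(n_clusters) if k != noisy[idx]])
    return noisy


def permutation_p_value(points, labels, n_perm, rng):
    """Share of label shufflings whose internal criterion is at least as
    large as the observed value, with the usual +1 correction."""
    observed = point_biserial(points, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if point_biserial(points, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
points, truth = make_clusters([(0, 0), (10, 0), (0, 10)], 10, 1.0, rng)
noisy = perturb(truth, 10, 3, rng)

rand_true, rand_noisy = rand_index(truth, truth), rand_index(truth, noisy)
pb_true, pb_noisy = point_biserial(points, truth), point_biserial(points, noisy)
p_value = permutation_p_value(points, truth, 200, rng)

print(f"correct partition : Rand={rand_true:.3f}  point-biserial={pb_true:.3f}")
print(f"degraded partition: Rand={rand_noisy:.3f}  point-biserial={pb_noisy:.3f}")
print(f"permutation p-value for the correct partition: {p_value:.4f}")
```

An internal criterion that behaves like a valid index in this toy setting should drop when the external criterion drops. The study's actual comparison is made across many generated data sets and four clustering methods, which a single run like this cannot reproduce.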

Keywords

classification; numerical taxonomy; permutation tests

Type: Original Paper
Information: Psychometrika, Volume 46, Issue 2, June 1981, pp. 187–199

DOI: https://doi.org/10.1007/BF02293899
Copyright: Copyright © 1981 The Psychometric Society
