Cluster Differences Scaling with a Within-Clusters Loss Component and a Fuzzy Successive Approximation Strategy to Avoid Local Minima

Willem J. Heiser; Patrick J. F. Groenen

doi:10.1007/BF02294781

Published online by Cambridge University Press: 01 January 2025


Willem J. Heiser*: Department of Data Theory, Faculty of Social Sciences, Leiden University
Patrick J. F. Groenen: Department of Data Theory, Faculty of Social Sciences, Leiden University
* Requests for reprints should be addressed to Willem J. Heiser, Department of Data Theory, Faculty of Social Sciences, Leiden University, P.O. Box 9555, 2300 RB Leiden, The Netherlands.


Abstract

Cluster differences scaling is a method for partitioning a set of objects into classes and simultaneously finding a low-dimensional spatial representation of K cluster points, to model a given square table of dissimilarities among n stimuli or objects. The least squares loss function of cluster differences scaling, originally defined only on the residuals of pairs of objects that are allocated to different clusters, is extended with a loss component for pairs that are allocated to the same cluster. It is shown that this extension makes the method equivalent to multidimensional scaling with cluster constraints on the coordinates. A decomposition of the sum of squared dissimilarities into contributions from several sources of variation is described, including the appropriate degrees of freedom for each source. After developing a convergent algorithm for fitting the cluster differences model, it is argued that the individual objects and the cluster locations can be jointly displayed in a configuration obtained as a by-product of the optimization. Finally, the paper introduces a fuzzy version of the loss function, which can be used in a successive approximation strategy for avoiding local minima. A simulation study demonstrates that this strategy significantly outperforms two other well-known initialization strategies, and that it has a success rate of 92 out of 100 in attaining the global minimum.
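As a rough illustration of the extended loss described above, the following Python sketch evaluates an unweighted raw least-squares loss in which every object is placed exactly on its cluster point, so that pairs in the same cluster have distance zero and contribute their squared dissimilarity. The function name `cluster_stress`, its arguments, the example data, and the omission of weights and normalization are assumptions made for illustration; this is not the authors' formulation or algorithm, only one reading of "multidimensional scaling with cluster constraints on the coordinates".

```python
import numpy as np

def cluster_stress(delta, labels, X):
    """Illustrative unweighted loss for cluster-constrained MDS (assumed form).

    delta  : (n, n) symmetric matrix of dissimilarities
    labels : length-n integer array, labels[i] in {0, ..., K-1}
    X      : (K, p) coordinates of the K cluster points
    Returns (between, within): squared residuals summed over pairs allocated
    to different clusters and over pairs allocated to the same cluster.
    """
    delta = np.asarray(delta, dtype=float)
    labels = np.asarray(labels)
    # Cluster constraint: each object takes the coordinates of its cluster.
    Y = np.asarray(X, dtype=float)[labels]
    diff = Y[:, None, :] - Y[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))      # Euclidean distances
    resid2 = (delta - d) ** 2
    iu = np.triu_indices(len(labels), k=1)     # each pair i < j once
    same = (labels[:, None] == labels[None, :])[iu]
    r = resid2[iu]
    return r[~same].sum(), r[same].sum()

# Hypothetical toy data: 4 objects, 2 clusters, cluster points 5 units apart.
delta = np.array([[0, 1, 5, 5],
                  [1, 0, 5, 6],
                  [5, 5, 0, 2],
                  [5, 6, 2, 0]], dtype=float)
labels = np.array([0, 0, 1, 1])
X = np.array([[0.0, 0.0], [3.0, 4.0]])
between, within = cluster_stress(delta, labels, X)
print(between, within)
```

In this toy case the between-clusters part comes only from the single pair with dissimilarity 6 against a cluster distance of 5, while the within-clusters part is the sum of squared dissimilarities of the two same-cluster pairs. Splitting the loss this way mirrors the distinction the abstract draws between the original loss and the added within-clusters component.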

Keywords

multidimensional scaling; iterative majorization; K-means clustering; fuzzy clustering; local minima; constrained optimization; analysis of dispersion; co-citation analysis

Type: Original Paper
Information: Psychometrika, Volume 62, Issue 1, March 1997, pp. 63–83

DOI: https://doi.org/10.1007/BF02294781
Copyright: Copyright © 1997 The Psychometric Society

Footnotes

The authors are indebted to Robert Tijssen for making available the citation data, and to Jacqueline Meulman for her useful and stimulating comments during the completion of this manuscript, which is an extended version of the paper presented at the Annual Meeting of the Psychometric Society at Berkeley, CA, June 1993.

