Cluster Correspondence Analysis

M. van de Velden; A. Iodice D’Enza; F. Palumbo

doi:10.1007/s11336-016-9514-0

Published online by Cambridge University Press: 01 January 2025


Affiliations:
M. van de Velden*: Erasmus University Rotterdam
A. Iodice D’Enza: Università di Cassino e del Lazio Meridionale
F. Palumbo: Università degli Studi di Napoli Federico II
* Correspondence should be made to M. van de Velden, Econometric Institute, Erasmus University Rotterdam, P.O. Box 1738, 3000 DR Rotterdam, The Netherlands. Email: vandevelden@ese.eur.nl


Abstract

A method is proposed that combines dimension reduction and cluster analysis for categorical data by simultaneously assigning individuals to clusters and optimal scaling values to categories, such that a single between-variance maximization objective is achieved. A brief review of alternative methods is provided in a unified framework, and we show that the proposed method is equivalent to GROUPALS applied to categorical data. The performance of the methods is appraised by means of a simulation study. The results of the joint dimension reduction and clustering methods are compared with the so-called tandem approach, a sequential analysis in which dimension reduction is followed by cluster analysis. The tandem approach is conjectured to perform worse when variables are added that are unrelated to the cluster structure. Our simulation study confirms this conjecture. Moreover, the results of the simulation study indicate that the proposed method also consistently outperforms alternative joint dimension reduction and clustering methods.
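To make the tandem approach mentioned in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes integer-coded categorical data, uses a correspondence analysis of the indicator matrix for the dimension reduction step, and a plain single-start Lloyd k-means for the clustering step. All function names (`indicator_matrix`, `mca_row_coordinates`, `kmeans`, `tandem_cluster`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def indicator_matrix(data):
    """One-hot encode an (n x q) array of integer-coded categorical variables."""
    blocks = []
    for j in range(data.shape[1]):
        levels = np.unique(data[:, j])
        blocks.append((data[:, j][:, None] == levels[None, :]).astype(float))
    return np.hstack(blocks)


def mca_row_coordinates(Z, n_dims):
    """Row principal coordinates from a correspondence analysis of Z."""
    P = Z / Z.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))  # standardized residuals
    U, s, _ = np.linalg.svd(S, full_matrices=False)
    return (U[:, :n_dims] * s[:n_dims]) / np.sqrt(r)[:, None]


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd k-means with a single random start (illustrative only)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Keep the old center if a cluster happens to be empty.
        new_centers = np.array([
            X[labels == g].mean(axis=0) if np.any(labels == g) else centers[g]
            for g in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def tandem_cluster(data, n_clusters, n_dims, seed=0):
    """Step 1: reduce dimensions. Step 2: cluster the reduced coordinates."""
    Z = indicator_matrix(data)
    Y = mca_row_coordinates(Z, n_dims)
    return kmeans(Y, n_clusters, seed=seed)


# Toy data: two perfectly separated groups on four binary variables.
toy = np.vstack([np.zeros((10, 4), dtype=int), np.ones((10, 4), dtype=int)])
toy_labels = tandem_cluster(toy, n_clusters=2, n_dims=1)
```

Because the two steps optimize separate criteria, the reduced space in step 1 is chosen without regard to the cluster structure; this is the property the abstract's conjecture about added unrelated variables refers to. In practice one would use multiple random starts for k-means rather than the single start used here.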

Keywords

correspondence analysis; cluster analysis; dimension reduction; categorical data

Type: Original Paper
Information: Psychometrika, Volume 82, Issue 1, March 2017, pp. 158–185

DOI: https://doi.org/10.1007/s11336-016-9514-0
Copyright: © 2016 The Psychometric Society


Footnotes

Electronic supplementary material: The online version of this article (doi:10.1007/s11336-016-9514-0) contains supplementary material, which is available to authorized users.

