An Analysis and Synthesis of Multiple Correspondence Analysis, Optimal Scaling, Dual Scaling, Homogeneity Analysis and Other Methods for Quantifying Categorical Multivariate Data

Published online by Cambridge University Press:  01 January 2025

Michel Tenenhaus*
Affiliation:
Centre d'Enseignement Supérieur des Affaires, Jouy-en-Josas, France
Forrest W. Young
Affiliation:
University of North Carolina at Chapel Hill
* Requests for reprints should be sent to Michel Tenenhaus, Department S.I.A.D., Centre d'Enseignement Supérieur des Affaires, 78350 Jouy-en-Josas, Paris, France.

Abstract

We discuss a variety of methods for quantifying categorical multivariate data. These methods have been proposed in many different countries, by many different authors, under many different names. In the first major section of the paper we analyze the many different methods and show that they all lead to the same equations for analyzing the same data. In the second major section of the paper we introduce the notion of a duality diagram, and use this diagram to synthesize the many superficially different methods into a single method.

Type
Original Paper
Copyright
Copyright © 1985 The Psychometric Society

Footnotes

The ideas in this paper were worked out by the first author, with some suggestions provided by the second. The current version of this paper has evolved from three previous versions, the first two written by the first author.
