Partial Identification of Latent Correlations with Binary Data

Published online by Cambridge University Press: 01 January 2025

Steffen Grønneberg*
Affiliation:
BI Norwegian Business School
Jonas Moss
Affiliation:
University of Oslo
Njål Foldnes
Affiliation:
BI Norwegian Business School
* Correspondence should be made to Steffen Grønneberg, Department of Economics, BI Norwegian Business School, Oslo 0484, Norway. Email: steffeng@gmail.com

Abstract

The tetrachoric correlation is a popular measure of association for binary data and estimates the correlation of an underlying normal latent vector. However, when the underlying vector is not normal, the tetrachoric correlation will be different from the underlying correlation. Since assuming underlying normality is often done on pragmatic and not substantive grounds, the estimated tetrachoric correlation may therefore be quite different from the true underlying correlation that is modeled in structural equation modeling. This motivates studying the range of latent correlations that are compatible with given binary data, when the distribution of the latent vector is partly or completely unknown. We show that nothing can be said about the latent correlations unless we know more than what can be derived from the data. We identify an interval constituting all latent correlations compatible with observed data when the marginals of the latent variables are known. Also, we quantify how partial knowledge of the dependence structure of the latent variables affects the range of compatible latent correlations. Implications for tests of underlying normality are briefly discussed.
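To make the starting point of the abstract concrete, the following is a minimal sketch of how a tetrachoric correlation can be computed from a 2x2 table under the assumption of a standard bivariate normal latent vector. It is not the authors' code. The function names (`bvn_density`, `bvn_cdf`, `tetrachoric`) are hypothetical, and so are the numerical choices: Simpson's rule applied to the identity that the derivative of the bivariate normal CDF with respect to the correlation equals the bivariate normal density, bisection on the correlation, and a search range of ±0.99. Only the Python standard library is used.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for thresholds and marginal CDFs


def bvn_density(h, k, r):
    """Standard bivariate normal density at (h, k) with correlation r."""
    one_minus_r2 = 1.0 - r * r
    expo = -(h * h - 2.0 * r * h * k + k * k) / (2.0 * one_minus_r2)
    return math.exp(expo) / (2.0 * math.pi * math.sqrt(one_minus_r2))


def bvn_cdf(h, k, rho, n=400):
    """P(Z1 <= h, Z2 <= k) for a standard bivariate normal with correlation rho.

    Integrates the density over the correlation from 0 to rho with the
    composite Simpson rule (n must be even) and adds the independence value.
    """
    base = _STD.cdf(h) * _STD.cdf(k)
    if rho == 0.0:
        return base
    step = rho / n
    total = bvn_density(h, k, 0.0) + bvn_density(h, k, rho)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * bvn_density(h, k, i * step)
    return base + total * step / 3.0


def tetrachoric(p00, p01, p10, p11, tol=1e-10):
    """Tetrachoric correlation for cell probabilities (or counts) p_ij,
    where p_ij refers to the cell X1 = i, X2 = j and X_i = 1{Z_i > tau_i}.

    Assumption of this sketch: the solution lies in (-0.99, 0.99).
    """
    total = p00 + p01 + p10 + p11
    p00, p01, p10 = p00 / total, p01 / total, p10 / total
    tau1 = _STD.inv_cdf(p00 + p01)  # P(X1 = 0) = Phi(tau1)
    tau2 = _STD.inv_cdf(p00 + p10)  # P(X2 = 0) = Phi(tau2)
    lo, hi = -0.99, 0.99
    # The bivariate normal CDF is increasing in rho, so bisection applies.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bvn_cdf(tau1, tau2, mid) < p00:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

As a check on the sketch, with both thresholds at zero the normal model gives P(X1 = 0, X2 = 0) = 1/4 + arcsin(ρ)/(2π), so the table with cell probabilities 1/3, 1/6, 1/6, 1/3 should return approximately 0.5 from `tetrachoric(1/3, 1/6, 1/6, 1/3)`. The value is a latent correlation only under the normality assumption; the abstract's point is that other latent distributions compatible with the same table can have different correlations.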

Type
Theory and Methods
Copyright
Copyright © 2020 The Psychometric Society

Footnotes

Electronic Supplementary Material: The online version of this article contains supplementary material, available at https://doi.org/10.1007/s11336-020-09737-y.

Supplementary material

Grønneberg et al. supplementary material: Online Appendix for “Partial Identification of Latent Correlations with Binary Data” (File, 159.5 KB)
Grønneberg et al. supplementary material 1 (File, 275.4 KB)