Commentary on Coefficient Alpha: A Cautionary Tale

Samuel B. Green; Yanyun Yang

doi:10.1007/s11336-008-9098-4

Published online by Cambridge University Press: 01 January 2025

Samuel B. Green*: Arizona State University
Yanyun Yang: Florida State University
*: Requests for reprints should be sent to Samuel B. Green, Arizona State University, P.O. Box 870611, Tempe, AZ 85287-0611, USA. E-mail: samgreen@asu.edu


Abstract

The general use of coefficient alpha to assess reliability should be discouraged on a number of grounds. The assumptions underlying coefficient alpha are unlikely to hold in practice, and violation of these assumptions can result in nontrivial negative or positive bias. Structural equation modeling was discussed as an informative process both to assess the assumptions underlying coefficient alpha and to estimate reliability.
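The direction of the bias can be illustrated with a small population-level calculation. The sketch below is not taken from the article: it assumes a one-factor model with factor variance fixed at 1, and the function names, loadings, and error values are hypothetical choices made only for illustration. It compares coefficient alpha with the reliability of the summed score (true-score variance divided by total variance) when loadings are equal, when loadings are unequal, and when two items have positively correlated errors.

```python
def one_factor_cov(loadings, error_vars, error_covs=None):
    """Population covariance matrix implied by a one-factor model
    (factor variance fixed at 1; an assumption of this sketch)."""
    k = len(loadings)
    cov = [[loadings[i] * loadings[j] for j in range(k)] for i in range(k)]
    for i in range(k):
        cov[i][i] += error_vars[i]
    # Optional error covariances, given as {(i, j): value}.
    for (i, j), c in (error_covs or {}).items():
        cov[i][j] += c
        cov[j][i] += c
    return cov


def coefficient_alpha(cov):
    """Coefficient alpha computed from an item covariance matrix."""
    k = len(cov)
    total_var = sum(sum(row) for row in cov)      # variance of the summed score
    item_var = sum(cov[i][i] for i in range(k))   # sum of the item variances
    return k / (k - 1) * (1 - item_var / total_var)


def sum_score_reliability(loadings, cov):
    """True-score variance of the summed score divided by its total variance."""
    true_var = sum(loadings) ** 2
    total_var = sum(sum(row) for row in cov)
    return true_var / total_var


# Hypothetical illustrative values; each item has unit variance before
# any error covariance is added.
equal = [0.7, 0.7, 0.7, 0.7]
unequal = [0.9, 0.7, 0.5, 0.3]

cov_equal = one_factor_cov(equal, [1 - l ** 2 for l in equal])
cov_unequal = one_factor_cov(unequal, [1 - l ** 2 for l in unequal])
cov_corr_err = one_factor_cov(equal, [1 - l ** 2 for l in equal], {(0, 1): 0.3})

alpha_equal = coefficient_alpha(cov_equal)
rel_equal = sum_score_reliability(equal, cov_equal)
alpha_unequal = coefficient_alpha(cov_unequal)
rel_unequal = sum_score_reliability(unequal, cov_unequal)
alpha_corr_err = coefficient_alpha(cov_corr_err)
rel_corr_err = sum_score_reliability(equal, cov_corr_err)

print(f"equal loadings:    alpha={alpha_equal:.3f}  reliability={rel_equal:.3f}")
print(f"unequal loadings:  alpha={alpha_unequal:.3f}  reliability={rel_unequal:.3f}")
print(f"correlated errors: alpha={alpha_corr_err:.3f}  reliability={rel_corr_err:.3f}")
```

Under these assumed values, alpha matches reliability only in the equal-loadings case; it falls below reliability when loadings differ and exceeds it when errors are positively correlated. The calculation uses population covariance matrices, so it says nothing about sampling error in an estimated alpha.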

Keywords

reliability; coefficient alpha; violation of assumptions; structural equation modeling

Information

Type: Theory and Methods
Information: Psychometrika, Volume 74, Issue 1, March 2009, pp. 121-135

DOI: https://doi.org/10.1007/s11336-008-9098-4
Copyright: Copyright © 2008 The Psychometric Society


