Reliability of Multiple Classifications

Published online by Cambridge University Press:  01 January 2025

Huynh Huynh*
Affiliation: University of South Carolina
* Requests for reprints should be addressed to Huynh Huynh, College of Education, University of South Carolina, Columbia, South Carolina 29208.

Abstract

Cohen's kappa index is reformulated for multiple classifications based on exchangeable random variables. It is found that kappa is between 0 and 1 inclusive. Two characterizations for kappa are stated in terms of the relationship between such random variables. Within the normal test score model, kappa increases with test reliability and test length. Furthermore, when based on binary classifications, kappa is an inverse U-shaped function of the cutoff score. These trends also hold for the beta-binomial test score model.
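
The trends stated in the abstract for binary classifications can be illustrated numerically. The following is a minimal sketch, not the paper's own derivation: it assumes the two exchangeable test scores are standard bivariate normal with correlation `rho` (taken here as a stand-in for test reliability) and that an examinee is classified by comparing each score with a cutoff `c`. The function names, the Simpson-rule integration, and the truncation point of -8 are all illustrative choices.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_both_below(c, rho, n=2000, lower=-8.0):
    """P(X1 <= c, X2 <= c) for a standard bivariate normal pair.

    Assumes -1 < rho < 1. Uses X2 | X1 = z ~ N(rho * z, 1 - rho^2) and
    integrates over z with Simpson's rule (n must be even). The lower
    limit -8 is an assumed truncation of the negligible left tail.
    """
    if c <= lower:
        return 0.0
    s = math.sqrt(1.0 - rho * rho)
    h = (c - lower) / n

    def integrand(z):
        return normal_pdf(z) * normal_cdf((c - rho * z) / s)

    total = integrand(lower) + integrand(c)
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0
        total += weight * integrand(lower + i * h)
    return total * h / 3.0


def kappa_binary(c, rho):
    """Kappa for two exchangeable pass/fail classifications at cutoff c."""
    p_fail = normal_cdf(c)                      # marginal P(score <= c)
    p_fail_fail = prob_both_below(c, rho)       # both scores below cutoff
    p_pass_pass = 1.0 - 2.0 * p_fail + p_fail_fail
    p_observed = p_fail_fail + p_pass_pass      # agreement proportion
    p_chance = p_fail ** 2 + (1.0 - p_fail) ** 2
    return (p_observed - p_chance) / (1.0 - p_chance)


if __name__ == "__main__":
    for rho in (0.5, 0.8):
        row = [round(kappa_binary(c, rho), 3) for c in (-1.5, 0.0, 1.5)]
        print(rho, row)
```

Under these assumptions, kappa at the median cutoff `c = 0` reduces to `2 * asin(rho) / pi`, which gives a convenient check on the numerical integration; moving the cutoff toward either tail lowers kappa, and raising `rho` increases it.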

Type: Original Paper
Copyright © 1978 The Psychometric Society

Footnotes

Paper read at the Spring meeting of the Psychometric Society, Bell Laboratories (Murray Hill, New Jersey), March 1976.

The editorial assistance and helpful comments of Leonard S. Feldt, Sarah P. Seaman-Huynh, and the three referees are gratefully acknowledged.

References

Birnbaum, A. Some latent trait models and their use in inferring an examinee's ability. In Lord, F. M. & Novick, M. R. (Eds.), Statistical theories of mental test scores, 1968, Reading, Massachusetts: Addison-Wesley Publishing Company.
Bishop, Y. M., Fienberg, S. & Holland, P. Discrete multivariate analysis: Theory and practice, 1974, Boston: M. I. T. Press.
Cochran, W. G. Errors in measurement in statistics. Technometrics, 1968, 10, 637–666.
Cohen, J. A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 1960, 20, 37–46.
DeFinetti, B. La prévision: ses lois logiques, ses resources subjectives. Annales de l'Institut Henri Poincaré, 1937, 7, 1–68.
Goodman, L. A. & Kruskal, W. H. Measures of association for cross classifications. Journal of the American Statistical Association, 1954, 49, 732–764.
Goodman, L. A. & Kruskal, W. H. Measures of association for cross classifications. II: Further discussion and references. Journal of the American Statistical Association, 1959, 54, 123–163.
Goodman, L. A. & Kruskal, W. H. Measures of association for cross classifications. III: Approximate sampling theory. Journal of the American Statistical Association, 1963, 58, 310–364.
Goodman, L. A. & Kruskal, W. H. Measures of association for cross classifications. IV: Simplification of asymptotic variances. Journal of the American Statistical Association, 1972, 67, 415–421.
Griffiths, D. A. Maximum likelihood estimation for the beta-binomial distribution and an application to the household distribution of the total number of cases of a disease. Biometrics, 1973, 29, 637–648.
Hambleton, R. K. & Novick, M. R. Toward an integration of theory and method for criterion-referenced tests. Journal of Educational Measurement, 1973, 10, 159–170.
Hewitt, E. & Savage, L. J. Symmetric measures on cartesian products. Transactions of the American Mathematical Society, 1956, 80, 470–501.
Huynh, H. Reliability of decisions in domain-referenced testing. Journal of Educational Measurement, 1976, 13, 253–264.
Johnson, L. J. & Kotz, S. Distributions in statistics: Continuous multivariate distributions, 1972, New York: John Wiley & Sons.
Keats, J. A. & Lord, F. M. A theoretical distribution for mental test-score data. Psychometrika, 1962, 27, 59–72.
Lord, F. M. & Novick, M. R. Statistical theories of mental test scores, 1968, Reading, Massachusetts: Addison-Wesley Publishing Company.
McKinlay, S. M. The design and analysis of the observational study: A review. Journal of the American Statistical Association, 1975, 70, 503–520.
Rao, C. R. Linear statistical inference and its applications, 1973, New York: John Wiley & Sons.
Slepian, D. The one-sided barrier problem for Gaussian noise. Bell System Technical Journal, 1962, 41, 463–501.
Swaminathan, H., Hambleton, R. K., & Algina, J. Reliability of criterion-referenced tests: A decision-theoretic formulation. Journal of Educational Measurement, 1974, 11, 263–267.
Zimmerman, D. W. Probability spaces, Hilbert spaces, and the axioms of test theory. Psychometrika, 1975, 40, 395–412.