
Inequalities Between Kappa and Kappa-Like Statistics for k × k Tables

Published online by Cambridge University Press:  01 January 2025

Matthijs J. Warrens*
Affiliation: Leiden University
*Requests for reprints should be sent to Matthijs J. Warrens, Institute of Psychology, Unit Methodology and Statistics, Leiden University, P.O. Box 9555, 2300 RB Leiden, The Netherlands. E-mail: warrens@fsw.leidenuniv.nl

Abstract


The paper presents inequalities between four descriptive statistics that can be expressed in the form [P − E(P)]/[1 − E(P)], where P is the observed proportion of agreement of a k × k table with identical categories, and E(P) is a function of the marginal probabilities. Scott’s π is an upper bound of Goodman and Kruskal’s λ and a lower bound of both Bennett et al.’s S and Cohen’s κ. We introduce a concept for the marginal probabilities of the k × k table called weak marginal symmetry. Using the rearrangement inequality, it is shown that Bennett et al.’s S is an upper bound of Cohen’s κ if the k × k table is weakly marginal symmetric.
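
As an illustration of the common form, the sketch below computes the four statistics from a square contingency table. It assumes the usual chance-agreement terms: for Cohen’s κ, E(P) is the sum over categories of the products of the row and column marginals; for Scott’s π, the sum of the squared averaged marginals; for Bennett et al.’s S, 1/k; and for the λ-type statistic, the largest averaged marginal (an assumption of this sketch). The function name and the example table are illustrative only and are not taken from the paper.

```python
import numpy as np

def kappa_like_statistics(table):
    """Four agreement statistics of the form (P - E(P)) / (1 - E(P))
    for a k x k contingency table with identical row and column categories."""
    t = np.asarray(table, dtype=float)
    k = t.shape[0]
    p = t / t.sum()                            # cell proportions
    row, col = p.sum(axis=1), p.sum(axis=0)    # marginal probabilities
    P = np.trace(p)                            # observed proportion of agreement
    q = (row + col) / 2                        # averaged marginals

    # Chance-agreement terms E(P); the lambda form is an assumption of this sketch.
    expectations = {
        "lambda": q.max(),            # Goodman-Kruskal lambda-type statistic
        "pi": np.sum(q ** 2),         # Scott's pi
        "kappa": np.sum(row * col),   # Cohen's kappa
        "S": 1.0 / k,                 # Bennett, Alpert, and Goldstein's S
    }
    return {name: (P - e) / (1 - e) for name, e in expectations.items()}

# Hypothetical 3 x 3 table: two raters assigning 100 objects to 3 categories.
example = [[30, 5, 2],
           [4, 25, 6],
           [1, 7, 20]]
print(kappa_like_statistics(example))
# Per the abstract: lambda <= pi, and pi is a lower bound of both kappa and S;
# for this (nearly marginally symmetric) table, S >= kappa also holds.
```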

Type
Theory and Methods
Creative Commons
This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
Copyright
Copyright © 2009 The Psychometric Society

References

Agresti, A. (1990). Categorical data analysis. New York: Wiley.
Agresti, A., & Winner, L. (1997). Evaluating agreement and disagreement among movie reviewers. Chance, 10, 10–14.
Bennett, E.M., Alpert, R., & Goldstein, A.C. (1954). Communications through limited response questioning. Public Opinion Quarterly, 18, 303–308.
Blackman, N.J.-M., & Koval, J.J. (1993). Estimating rater agreement in 2×2 tables: Correction for chance and intraclass correlation. Applied Psychological Measurement, 17, 211–223.
Brennan, R.L., & Prediger, D.J. (1981). Coefficient kappa: Some uses, misuses, and alternatives. Educational and Psychological Measurement, 41, 687–699.
Byrt, T., Bishop, J., & Carlin, J.B. (1993). Bias, prevalence and kappa. Journal of Clinical Epidemiology, 46, 423–429.
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37–46.
Conger, A.J. (1980). Integration and generalization of kappas for multiple raters. Psychological Bulletin, 88, 322–328.
De Mast, J. (2007). Agreement and kappa-type indices. The American Statistician, 61, 148–153.
Dou, W., Ren, Y., Wu, Q., Ruan, S., Chen, Y., Bloyet, D., & Constans, J.-M. (2007). Fuzzy kappa for the agreement measure of fuzzy classifications. Neurocomputing, 70, 726–734.
Feinstein, A.R., & Cicchetti, D.V. (1990). High agreement but low kappa: I. The problems of two paradoxes. Journal of Clinical Epidemiology, 43, 543–548.
Fleiss, J.L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76, 378–382.
Fleiss, J.L. (1975). Measuring agreement between two judges on the presence or absence of a trait. Biometrics, 31, 651–659.
Goodman, L.A., & Kruskal, W.H. (1954). Measures of association for cross classifications. Journal of the American Statistical Association, 49, 732–764.
Hardy, G.H., Littlewood, J.E., & Pólya, G. (1988). Inequalities (2nd ed.). Cambridge: Cambridge University Press.
Holley, J.W., & Guilford, J.P. (1964). A note on the G index of agreement. Educational and Psychological Measurement, 24, 749–753.
Hubert, L.J., & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2, 193–218.
Janson, S., & Vegelius, J. (1979). On generalizations of the G index and the Phi coefficient to nominal scales. Multivariate Behavioral Research, 14, 255–269.
Krippendorff, K. (1987). Association, agreement, and equity. Quality and Quantity, 21, 109–123.
Krippendorff, K. (2004). Reliability in content analysis: Some common misconceptions and recommendations. Human Communication Research, 30, 411–433.
Maxwell, A.E. (1977). Coefficients of agreement between observers and their interpretation. British Journal of Psychiatry, 130, 79–83.
Mitrinović, D.S. (1964). Elementary inequalities. Groningen: Noordhoff.
Scott, W.A. (1955). Reliability of content analysis: The case of nominal scale coding. Public Opinion Quarterly, 19, 321–325.
Steinley, D. (2004). Properties of the Hubert-Arabie adjusted Rand index. Psychological Methods, 9, 386–396.
Visser, H., & De Nijs, T. (2006). The map comparison kit. Environmental Modelling & Software, 21, 346–358.
Warrens, M.J. (2008). On similarity coefficients for 2×2 tables and correction for chance. Psychometrika, 73, 487–502.
Warrens, M.J. (2008). Bounds of resemblance measures for binary (presence/absence) variables. Journal of Classification, 25, 195–208.
Warrens, M.J. (2008). On association coefficients for 2×2 tables and properties that do not depend on the marginal distributions. Psychometrika, 73, 777–789.
Warrens, M.J. (2008). On the indeterminacy of resemblance measures for (presence/absence) data. Journal of Classification, 25, 125–136.
Warrens, M.J. (2008). On the equivalence of Cohen’s kappa and the Hubert-Arabie adjusted Rand index. Journal of Classification, 25, 177–183.
Zwick, R. (1988). Another look at interrater agreement. Psychological Bulletin, 103, 374–378.