On the Sampling Theory Foundations of Item Response Theory Models

Paul W. Holland

doi:10.1007/BF02294609

Published online by Cambridge University Press: 01 January 2025

Affiliation: Educational Testing Service

Requests for reprints should be sent to Paul W. Holland, Educational Testing Service, Rosedale Road 21-T, Princeton, NJ 08541.

Abstract

Item response theory (IRT) models are now in common use for the analysis of dichotomous item responses. This paper examines the sampling theory foundations for statistical inference in these models. The discussion includes: some history on the “stochastic subject” versus the random sampling interpretations of the probability in IRT models; the relationship between three versions of maximum likelihood estimation for IRT models; estimating θ versus estimating θ-predictors; IRT models and loglinear models; the identifiability of IRT models; and the role of robustness and Bayesian statistics from the sampling theory perspective.

Keywords

stochastic subjects; marginal maximum likelihood (MML); conditional maximum likelihood (CML); unconditional maximum likelihood (UML); joint maximum likelihood (JML); probability simplex; loglinear models; robustness

Information

Type: Original Paper
Information: Psychometrika, Volume 55, Issue 4, December 1990, pp. 577–601

DOI: https://doi.org/10.1007/BF02294609
Copyright: Copyright © 1990 The Psychometric Society

Footnotes

A presidential address can serve many different functions. This one is a report of investigations I started at least ten years ago to understand what IRT was all about. It is a decidedly one-sided view, but I hope it stimulates controversy and further research. I have profited from discussions of this material with many people including: Brian Junker, Charles Lewis, Nicholas Longford, Robert Mislevy, Ivo Molenaar, Donald Rock, Donald Rubin, Lynne Steinberg, Martha Stocking, William Stout, Dorothy Thayer, David Thissen, Wim van der Linden, Howard Wainer, and Marilyn Wingersky. Of course, none of them is responsible for any errors or misstatements in this paper. The research was supported in part by the Cognitive Science Program, Office of Naval Research under Contract No. N00014-87-K-0730 and by the Program Statistics Research Project of Educational Testing Service.
