
Statistical Inference for Multiple Choice Tests

Published online by Cambridge University Press: 01 January 2025

John S. J. Hsu*
Affiliation:
Department of Statistics and Applied Probability, The University of California, Santa Barbara
Tom Leonard
Affiliation:
Department of Statistics, The University of Wisconsin, Madison
Kam-Wah Tsui
Affiliation:
Department of Statistics, The University of Wisconsin, Madison
* Requests for reprints should be sent to John S. J. Hsu, Department of Statistics and Applied Probability, University of California-Santa Barbara, Santa Barbara, CA 93106.

Abstract

Finite sample inference procedures are considered for analyzing the observed scores on a multiple choice test with several items, where, for example, the items are dissimilar, or the item responses are correlated. A discrete p-parameter exponential family model leads to a generalized linear model framework and, in a special case, a convenient regression of true score upon observed score. Techniques based upon the likelihood function, Akaike's information criterion (AIC), an approximate Bayesian marginalization procedure based on conditional maximization (BCM), and simulations for exact posterior densities (importance sampling) are used to facilitate finite sample investigations of the average true score, individual true scores, and various probabilities of interest. A simulation study suggests that, when the examinees come from two different populations, the exponential family can adequately generalize Duncan's beta-binomial model. Extensions to regression models, the classical test theory model, and empirical Bayes estimation problems are mentioned. The Duncan, Keats, and Matsumura data sets are used to illustrate potential advantages and flexibility of the exponential family model and of the BCM technique.
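To make the AIC-based comparison mentioned in the abstract concrete, here is a minimal sketch that compares a plain binomial model with a beta-binomial model on observed test scores. Everything specific in it is an assumption made for illustration: the scores are made-up toy data (not the Duncan, Keats, or Matsumura data), the function names are hypothetical, and the coarse grid search is only a stand-in for a proper maximum likelihood fit. It does not implement the authors' p-parameter exponential family or the BCM procedure.

```python
import math


def log_choose(n, y):
    """Log of the binomial coefficient C(n, y)."""
    return math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)


def log_beta(a, b):
    """Log of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def binomial_loglik(scores, n, p):
    """Log-likelihood of the scores under Binomial(n, p)."""
    return sum(
        log_choose(n, y) + y * math.log(p) + (n - y) * math.log(1.0 - p)
        for y in scores
    )


def beta_binomial_logpmf(y, n, a, b):
    """Log P(Y = y) when Y | theta ~ Binomial(n, theta), theta ~ Beta(a, b)."""
    return log_choose(n, y) + log_beta(y + a, n - y + b) - log_beta(a, b)


def beta_binomial_loglik(scores, n, a, b):
    """Log-likelihood of the scores under the beta-binomial model."""
    return sum(beta_binomial_logpmf(y, n, a, b) for y in scores)


def aic(loglik, num_params):
    """Akaike's information criterion: -2 log L + 2 (number of parameters)."""
    return -2.0 * loglik + 2.0 * num_params


# Hypothetical toy data: scores of 10 examinees on a 10-item test.
n_items = 10
scores = [1, 2, 3, 3, 5, 6, 8, 9, 9, 10]

# Binomial model: the maximum likelihood estimate of p is the mean proportion correct.
p_hat = sum(scores) / (n_items * len(scores))
aic_binomial = aic(binomial_loglik(scores, n_items, p_hat), 1)

# Beta-binomial model: coarse grid search over (a, b), an assumed
# simplification in place of a numerical optimizer.
grid = [0.1 * k for k in range(1, 101)]
ll_bb, a_hat, b_hat = max(
    (beta_binomial_loglik(scores, n_items, a, b), a, b)
    for a in grid
    for b in grid
)
aic_beta_binomial = aic(ll_bb, 2)

print("binomial AIC:", round(aic_binomial, 2))
print("beta-binomial AIC:", round(aic_beta_binomial, 2), "at a =", a_hat, "b =", b_hat)
```

The model with the smaller AIC is preferred. The toy scores are deliberately more spread out than a single binomial allows, so the beta-binomial model gives the lower AIC here; the same comparison could in principle be extended to richer families such as the exponential family model discussed in the article.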

Type
Article
Copyright
Copyright © 1991 The Psychometric Society


Footnotes

The authors wish to thank Ella Mae Matsumura for her data set and helpful comments, Frank Baker for his advice on item response theory, Hirotugu Akaike and Taskin Atilgan for helpful discussions regarding AIC, Graham Wood for his advice concerning the class of all binomial mixture models, Yiu Ming Chiu for providing useful references and information on tetrachoric models, and the Editor and two referees for suggesting several references and alternative approaches.

References

Akaike, H. (1978). A Bayesian analysis of the minimum AIC procedure. Annals of the Institute of Statistical Mathematics, 30(A), 9–14.
Altham, P. M. E. (1978). Two generalizations of the binomial distribution. Applied Statistics, 27, 162–167.
Anderson, D. A., & Aitkin, M. (1985). Marginal maximum likelihood estimation of item parameters: Application of an algorithm. Journal of Royal Statistical Society, Series B, 26, 203–210.
Atilgan, T. (1983). Parameter parsimony, model selection, and smooth density estimation. Unpublished doctoral dissertation, University of Wisconsin-Madison.
Atilgan, T., Leonard, T., & Gupta, A. K. (1988). On the application of AIC to bivariate density estimation, non-parametric regression, and discrimination. In Bozdogan, H. (Eds.), Multivariate statistical modeling and data analysis (pp. 1–16). Dordrecht, Holland: Reidel.
Bell, S. S. (1990). Empirical Bayes alternatives to the beta-binomial model. Unpublished doctoral dissertation, Columbia University.
Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika, 37, 29–51.
Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46, 443–454.
Carter, M. C., & Williford, W. O. (1975). Estimation in a modified binomial distribution. Applied Statistics, 24, 319–328.
Consul, P. C. (1974). A simple urn model dependent upon predetermined strategy. Sankhya, Series B, 36, 391–399.
Consul, P. C. (1975). On a characterization of Lagrangian Poisson and quasi-binomial distributions. Communications in Statistics, 4, 555–563.
Dalal, S. R., & Hall, W. J. (1983). Approximating priors by mixtures of natural conjugate priors. Journal of Royal Statistical Society, Series B, 45, 278–286.
Duncan, G. T. (1974). An empirical Bayes approach to scoring multiple-choice tests in the misinformation model. Journal of the American Statistical Association, 69, 50–57.
Gelfand, A. E., & Smith, A. F. M. (1990). Sampling based approaches to calculating marginal densities. Journal of the American Statistical Association, 85, 398–409.
Geweke, J. (1988). Antithetic acceleration of Monte-Carlo integration in Bayesian inference. Journal of Econometrics, 38, 73–89.
Geweke, J. (1989). Exact predictive density for linear models with arch distributions. Journal of Econometrics, 40, 63–86.
Hsu, J. S. J. (1990). Bayesian inference and marginalization. Unpublished doctoral dissertation, University of Wisconsin-Madison.
Keats, J. A. (1964). Some generalizations of a theoretical distribution of mental test scores. Psychometrika, 29, 215–231.
Lehmann, E. L. (1983). Theory of point estimation. New York: John Wiley & Sons.
Leonard, T. (1972). Bayesian methods for binomial data. Biometrika, 59, 581–589.
Leonard, T. (1973). A Bayesian method for histograms. Biometrika, 60, 297–308.
Leonard, T. (1982). Comment on the paper by Lejeune and Faulkenberry. Journal of the American Statistical Association, 77, 657–658.
Leonard, T. (1984). Some data-analytic modifications to Bayes-Stein estimation. Annals of the Institute of Statistical Mathematics, 36, 11–21.
Leonard, T., Hsu, J. S. J., & Tsui, K. (1989). Bayesian marginal inference. Journal of the American Statistical Association, 84, 1051–1058.
Leonard, T., & Novick, J. B. (1986). Bayesian full rank marginalization for two-way contingency tables. Journal of Educational Statistics, 11, 33–56.
Lord, F. M. (1965). A strong true-score theory, with applications. Psychometrika, 30, 239–270.
Lord, F. M. (1969). Estimating true-score distributions in psychological testing: An empirical Bayes estimation problem. Psychometrika, 34, 259–299.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores (with contributions by Allen Birnbaum). Reading, MA: Addison-Wesley.
Lord, F. M., & Stocking, M. L. (1976). An interval estimate for making statistical inference about true scores. Psychometrika, 41, 79–87.
McCullagh, P., & Nelder, J. A. (1985). Generalized linear models. New York: Chapman and Hall.
Mislevy, R. J. (1986). Bayes modal estimation in item response. Psychometrika, 51, 177–195.
Morrison, D. G., & Brockway, G. (1979). A modified beta-binomial model with applications to multiple choice and taste tests. Psychometrika, 44, 427–442.
Prentice, R. L., & Barlow, W. E. (1988). Correlated binary regression with covariates specific to each binary observation. Biometrics, 44, 1033–48.
Rubinstein, R. Y. (1981). Simulation and the Monte Carlo method. New York: John Wiley and Sons.
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461–464.
Takane, Y., Bozdogan, H., & Shibayama, T. (1987). Ideal point discriminant analysis. Psychometrika, 52, 371–392.
Wilcox, R. R. (1981). A review of the beta-binomial model and its extensions. Journal of Educational Statistics, 6, 3–32.
Wilcox, R. R. (1981). A cautionary note on estimating the reliability of a mastery test with the beta-binomial model. Applied Psychological Measurement, 5, 531–537.
Young, A. S. (1977). A Bayesian approach to prediction using polynomials. Biometrika, 64, 309–318.