Saddlepoint Approximations of the Distribution of the Person Parameter in the Two Parameter Logistic Model

Martin Biehler; Heinz Holling; Philipp Doebler

doi:10.1007/s11336-014-9405-1

Published online by Cambridge University Press: 01 January 2025

Affiliations:
Martin Biehler*, Westfälische Wilhelms-Universität Münster
Heinz Holling, Westfälische Wilhelms-Universität Münster
Philipp Doebler, Westfälische Wilhelms-Universität Münster
* Requests for reprints should be sent to Martin Biehler, Westfälische Wilhelms-Universität Münster, Münster, Germany. E-mail: Martin.A.Biehler@uni-giessen.de

Abstract

Large-sample theory establishes the asymptotic normality of the maximum likelihood estimator of the person parameter in the two-parameter logistic (2PL) model. In short tests, however, the normality assumption can be grossly wrong. As a consequence, intended coverage rates may be exceeded, and the resulting confidence intervals turn out to be overly conservative. Methods from higher-order theory, specifically saddlepoint approximations, offer a convenient way to deal with such small-sample problems. Confidence bounds obtained by these means hold the approximate confidence level over a broad range of the person parameter. Moreover, an approximation to the exact distribution makes it possible to compute median unbiased estimates (MUE), which are as likely to overestimate as to underestimate the true person parameter. In small samples, these MUE are also less mean-biased than the commonly used maximum likelihood estimator.
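To make the idea concrete, the following is a minimal sketch of a saddlepoint tail approximation for the 2PL person parameter. It is not the authors' code (their supplementary material is in R). It assumes that item discriminations `a` and difficulties `b` are known, so that the weighted score t = Σ a_i x_i is sufficient for the person parameter θ and the model is a one-parameter exponential family. It also uses the plain continuous form of the Lugannani–Rice formula, without any continuity or lattice correction. All function names are hypothetical. In this setting the saddlepoint equals θ̂ − θ, and the two quantities entering the formula are the signed likelihood root `r` and the Wald-type statistic `u`.

```python
import math

def cumulant(theta, a, b):
    # c(theta) = sum_i log(1 + exp(a_i * (theta - b_i)))
    return sum(math.log1p(math.exp(ai * (theta - bi))) for ai, bi in zip(a, b))

def expected_score(theta, a, b):
    # c'(theta) = sum_i a_i * P_i(theta), the mean of the weighted score
    return sum(ai / (1.0 + math.exp(-ai * (theta - bi))) for ai, bi in zip(a, b))

def information(theta, a, b):
    # c''(theta) = sum_i a_i^2 * P_i * (1 - P_i), the Fisher information
    total = 0.0
    for ai, bi in zip(a, b):
        p = 1.0 / (1.0 + math.exp(-ai * (theta - bi)))
        total += ai * ai * p * (1.0 - p)
    return total

def mle(t, a, b, lo=-30.0, hi=30.0):
    # Solve expected_score(theta) = t by bisection; finite only for 0 < t < sum(a).
    if not 0.0 < t < sum(a):
        raise ValueError("the ML estimate is infinite for extreme scores")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if expected_score(mid, a, b) < t:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def normal_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def lr_upper_tail(t, theta, a, b):
    # Continuous Lugannani-Rice approximation to P(T >= t; theta).
    theta_hat = mle(t, a, b)
    s = theta_hat - theta  # saddlepoint
    if abs(s) < 1e-4:
        raise ValueError("t is too close to the mean; the limiting form is needed")
    # Twice the log-likelihood ratio between theta_hat and theta
    lik_ratio = 2.0 * (s * t - cumulant(theta_hat, a, b) + cumulant(theta, a, b))
    r = math.copysign(math.sqrt(max(lik_ratio, 0.0)), s)  # signed likelihood root
    u = s * math.sqrt(information(theta_hat, a, b))       # Wald-type statistic
    return 1.0 - normal_cdf(r) + normal_pdf(r) * (1.0 / u - 1.0 / r)

if __name__ == "__main__":
    # Illustrative ten-item test with equal items (made-up values, not from the paper)
    a = [1.0] * 10
    b = [0.0] * 10
    print(lr_upper_tail(8.0, 0.0, a, b))
```

Because the ML estimate is an increasing function of t, tail probabilities of t translate directly into tail probabilities of the estimate. Solving such a tail probability for θ at a chosen level gives a confidence bound, and solving it at level 0.5 gives a median unbiased estimate. For discrete scores a corrected variant of the formula would normally be preferred over this uncorrected sketch.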

Keywords

small-samples; saddlepoint approximation; Lugannani–Rice; modified signed likelihood ratio statistic; higher-order-theory; parameter distribution; 2PL; exponential family; person parameter; confidence intervals; median unbiased estimator; mean-bias

Type: Original Paper
Information: Psychometrika, Volume 80, Issue 3, September 2015, pp. 665–688

DOI: https://doi.org/10.1007/s11336-014-9405-1
Copyright: © 2014 The Psychometric Society

Footnotes

Electronic Supplementary Material: The online version of this article (doi:10.1007/s11336-014-9405-1) contains supplementary material, which is available to authorized users.

Biehler et al. supplementary material

Appendix C: R-code

File 124 KB
