Seeking a Balance Between the Statistical and Scientific Elements in Psychometrics

Mark Wilson

doi:10.1007/s11336-013-9327-3


Published online by Cambridge University Press: 01 January 2025


Affiliation: University of California, Berkeley

Requests for reprints should be sent to Mark Wilson, University of California, Berkeley, Berkeley, CA, USA. E-mail: markw@berkeley.edu


Abstract

In this paper, I will review some aspects of psychometric projects that I have been involved in, emphasizing the nature of the work of the psychometricians involved, especially the balance between the statistical and scientific elements of that work. The intent is to seek to understand where psychometrics, as a discipline, has been and where it might be headed, in part at least, by considering one particular journey (my own). In contemplating this, I also look to psychometrics journals to see how psychometricians represent themselves to themselves and, in a complementary way, look to substantive journals to see how psychometrics is represented there (or perhaps not represented, as the case may be). I present a series of questions in order to consider what the appropriate foci of the psychometric discipline are. As an example, I present one recent project at the end, where the roles of the psychometricians and the substantive researchers have had to become intertwined in order to make satisfactory progress. In the conclusion, I discuss the consequences of such a view for the future of psychometrics.

Keywords

psychometrics; test theory; test construction

Type: Original Paper
Information: Psychometrika, Volume 78, Issue 2, April 2013, pp. 211-236

DOI: https://doi.org/10.1007/s11336-013-9327-3
Copyright: Copyright © 2013 The Psychometric Society


