Criterion-Related Construct Validity

Published online by Cambridge University Press: 01 January 2025

Paul R. Rosenbaum*
Affiliation:
University of Pennsylvania
*
Requests for reprints should be sent to Paul R. Rosenbaum, Statistics Department, Wharton School, University of Pennsylvania, Philadelphia, PA 19104-6302.

Abstract

Established results on latent variable models are applied to the study of the validity of a psychological test. When the test predicts a criterion by measuring a unidimensional latent construct, not only must the total score predict the criterion, but the joint distribution of criterion scores and item responses must exhibit a certain pattern. The presence of this population pattern may be tested with sample data using the stratified Wilcoxon rank sum test. Often, criterion information is available only for selected examinees, for instance, those who are admitted or hired. Three cases are discussed: (i) selection at random, (ii) selection based on the current test, and (iii) selection based on other measures of the latent construct. Discriminant validity is also discussed.
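The stratified Wilcoxon rank sum test mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how strata or comparison groups are formed, so the setup here is an assumption: examinees are divided into strata (for example, by a score on the remaining items), and within each stratum the criterion scores of two groups (for example, those answering a given item correctly versus incorrectly) are compared. The function names `midranks` and `stratified_wilcoxon` are hypothetical, and the p-value relies on a large-sample normal approximation with the usual tie correction; this is not presented as the paper's own procedure.

```python
import math
from collections import defaultdict


def midranks(values):
    """Return 1-based ranks of `values`, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def stratified_wilcoxon(criterion, group, stratum):
    """Stratified Wilcoxon rank sum test (illustrative sketch).

    criterion: numeric criterion scores
    group:     1 or 0 for each examinee (assumed: 1 = item correct)
    stratum:   stratum label for each examinee (assumed: a matching score)

    Returns (z, p), where p is the one-sided normal-approximation p-value
    for the alternative that group 1 tends to have higher criterion scores.
    """
    by_stratum = defaultdict(list)
    for y, g, s in zip(criterion, group, stratum):
        by_stratum[s].append((y, g))

    stat = expect = var = 0.0
    for rows in by_stratum.values():
        ys = [y for y, _ in rows]
        gs = [g for _, g in rows]
        n, m = len(rows), sum(gs)
        if m == 0 or m == n:
            continue  # a stratum with only one group carries no information
        ranks = midranks(ys)
        stat += sum(r for r, g in zip(ranks, gs) if g == 1)
        expect += m * (n + 1) / 2
        # Null variance of the rank sum within the stratum, corrected for ties.
        counts = defaultdict(int)
        for y in ys:
            counts[y] += 1
        tie = sum(t ** 3 - t for t in counts.values())
        var += m * (n - m) / 12 * ((n + 1) - tie / (n * (n - 1)))

    if var == 0:
        raise ValueError("no informative strata")
    z = (stat - expect) / math.sqrt(var)
    p = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability
    return z, p


# Toy data, invented purely for illustration: two strata of four examinees.
z, p = stratified_wilcoxon(
    criterion=[1, 2, 3, 4, 5, 6, 7, 8],
    group=[0, 0, 1, 1, 0, 1, 0, 1],
    stratum=["A", "A", "A", "A", "B", "B", "B", "B"],
)
print(f"z = {z:.3f}, one-sided p = {p:.3f}")
```

Ranking within each stratum, rather than across the pooled sample, is what keeps the comparison conditional on the stratifying variable. With strata as small as those in the toy data, the normal approximation is rough, and an exact permutation distribution would be preferable.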

Type
Original Paper
Copyright
Copyright © 1989 The Psychometric Society

Footnotes

This work was supported in part by Grant SES-87-01890 from the Measurement Methods and Data Improvement Program of the U.S. National Science Foundation.
