Published online by Cambridge University Press: 01 January 2025
Standard procedures for drawing inferences from complex samples do not apply when the variable of interest θ cannot be observed directly, but must be inferred from the values of secondary random variables that depend on θ stochastically. Examples are proficiency variables in item response models and class memberships in latent class models. Rubin's “multiple imputation” techniques yield approximations of the sample statistics that would have been obtained had θ been observable, together with variance estimates that account for uncertainty due to both the sampling of respondents and the latent nature of θ. The approach is illustrated with data from the National Assessment of Educational Progress.
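A minimal sketch of the combining step behind such variance estimates, assuming each of M imputed data sets has already been analyzed to give a point estimate and a sampling variance. It uses Rubin's standard combining rules: the total variance is the average within-imputation (sampling) variance plus (1 + 1/M) times the between-imputation variance. The function name `combine_imputations` and the numbers in the usage lines are hypothetical; they are not taken from the paper or from the NAEP data.

```python
from statistics import mean, variance


def combine_imputations(estimates, sampling_variances):
    """Combine M completed-data analyses using Rubin's combining rules.

    estimates:          the statistic computed from each imputed data set
    sampling_variances: the sampling variance of that statistic, computed
                        for each imputed data set as if theta were observed
    Returns (point_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(sampling_variances):
        raise ValueError("need at least two imputations and matching lengths")

    point = mean(estimates)              # average of the M estimates
    within = mean(sampling_variances)    # uncertainty from sampling respondents
    between = variance(estimates)        # uncertainty from theta being latent
                                         # (sample variance, divisor M - 1)
    total = within + (1 + 1 / m) * between
    return point, total


# Hypothetical values for five imputations of a subgroup mean.
point, total = combine_imputations(
    [250.1, 251.3, 249.8, 250.6, 250.2],
    [1.10, 1.05, 1.20, 1.15, 1.00],
)
print(point, total)
```

The between-imputation term is what separates this from a conventional complex-sample variance estimate: if θ were observed, every imputation would give the same statistic and that term would vanish.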
This research was supported by Grant No. NIE-G-83-0011 of the Office of Educational Research and Improvement, Center for Education Statistics, and Contract No. N00014-88-K-0304, R&T 4421552 from the Cognitive Sciences Program, Cognitive and Neural Sciences Division, Office of Naval Research. It does not necessarily reflect the views of either agency. I am grateful to R. Darrell Bock for calling my attention to the applicability of multiple imputation to the assessment setting; to Albert Beaton and Eugene Johnson for enlightening discussions on the topic; and to Henry Braun, Ben King, Debra Kline, Gary Phillips, Paul Rosenbaum, Don Rubin, John Tukey, Ming-Mei Wang, Kentaro Yamamoto, Rebecca Zwick, and two anonymous reviewers for comments on earlier drafts. Example 4 is based on the analysis of the 1984 National Assessment of Educational Progress (NAEP) reading survey, carried out at Educational Testing Service through the tireless efforts of too many people to mention by name, under the direction of Albert Beaton, Director of NAEP Data Analyses. David Freund, Bruce Kaplan, and Jennifer Nelson conducted additional analyses of the 1984 and 1988 data for the example.