The U.S. National Assessment of Educational Progress (NAEP), the Third International Mathematics and Science Study (TIMSS), and the U.S. Adult Literacy Survey collect probability samples of students (or adults) who are administered brief examinations in subject areas such as mathematics and reading (cognitive variables), along with background demographic (primary) and educational environment (secondary) questions. The demographic questions are used in the primary reporting, while the numerous “explanatory” secondary variables, or “covariates”, are used directly only in subsequent secondary analyses. The covariates are also used indirectly to create the plausible values (multiple imputations) that are an integral part of the analyses because the cognitive items are administered under sparse matrix sampling. The improvement in the precision of the primary reporting due to including the covariates is assessed here by comparison with the precision of reporting based on plausible values created using only the primary demographic variables.
The results demonstrate that the improvement in precision depends on the matrix sampling design used for the cognitive assessments. The improvements range from essentially none for the most common designs to moderate for some less common designs. Consequently, two potential changes in the reporting procedures could improve the statistical and operational efficiency of primary reporting: (a) eliminating or reducing the collection of covariates and increasing the number of cognitive items; and (b) to avoid delays, omitting the covariates from the creation of plausible values used for the primary reports, but including them later when creating public-use files for secondary analyses. These potential improvements must be weighed against the intrinsic interest in the covariates and the potential for small discrepancies between the primary and secondary reporting.