Omitted Variables in Multilevel Models

Published online by Cambridge University Press:  01 January 2025

Jee-Seon Kim*
Affiliation:
University of Wisconsin, Madison
Edward W. Frees
Affiliation:
University of Wisconsin, Madison
* Requests for reprints should be sent to Jee-Seon Kim, Department of Educational Psychology, University of Wisconsin, 1025 West Johnson Street, Madison, WI 53706, USA. E-mail: jeeseonkim@wisc.edu

Abstract

Statistical methodology for handling omitted variables is presented in a multilevel modeling framework. In many nonexperimental studies, the analyst may not have access to all requisite variables, and this omission may lead to biased estimates of model parameters. By exploiting the hierarchical nature of multilevel data, a battery of statistical tools is developed to test various forms of model misspecification as well as to obtain estimators that are robust to the presence of omitted variables. The methodology allows for tests of omitted effects at single and multiple levels. The paper also introduces intermediate-level tests; these are tests for omitted effects at a single level, regardless of the presence of omitted effects at a higher level. A simulation study shows, not surprisingly, that the omission of variables yields bias in both regression coefficients and variance components; it also suggests that omitted effects at lower levels may cause more severe bias than at higher levels. Important factors resulting in bias were found to be the level of an omitted variable, its effect size, and sample size. A real data study illustrates that an omitted variable at one level may yield biased estimators at any level and, in this study, one cannot obtain reliable estimates for school-level variables when omitted child effects exist. However, robust estimators may provide unbiased estimates for effects of interest even when the efficient estimators fail, and the one-degree-of-freedom test helps one to understand where the problem is located. It is argued that multilevel data typically contain rich information to deal with omitted variables, offering yet another appealing reason for the use of multilevel models in the social sciences.
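
The contrast the abstract draws between estimators that fail under omitted variables and estimators that are robust to them can be illustrated with a small simulation. The sketch below is not the authors' procedure. It assumes a hypothetical two-level design in which an unobserved cluster-level variable `u` is correlated with the observed predictor `x`, and it compares a pooled least-squares slope, which ignores the omitted effect, with a within-cluster (cluster-mean-centered) slope, which removes it. All names and parameter values are illustrative assumptions. In this design the pooled slope converges to beta + cov(x, u) / var(x) = 1.5, whereas the within-cluster slope converges to the true value of 1.0.

```python
import random


def simulate(num_clusters=200, cluster_size=10, beta=1.0, seed=0):
    """Generate two-level data in which an unobserved cluster effect u_j
    is correlated with the observed predictor x_ij (illustrative design)."""
    rng = random.Random(seed)
    data = []  # rows of (cluster_id, x, y)
    for j in range(num_clusters):
        u = rng.gauss(0.0, 1.0)  # omitted cluster-level variable
        for _ in range(cluster_size):
            x = u + rng.gauss(0.0, 1.0)  # predictor correlated with u
            y = beta * x + u + rng.gauss(0.0, 1.0)
            data.append((j, x, y))
    return data


def slope(xs, ys):
    """Least-squares slope of ys on xs (with an intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def pooled_slope(data):
    """Slope that ignores clustering, so the omitted u_j biases it."""
    return slope([x for _, x, _ in data], [y for _, _, y in data])


def within_slope(data):
    """Slope after centering x and y on their cluster means, which
    removes any cluster-constant omitted effect."""
    clusters = {}
    for j, x, y in data:
        clusters.setdefault(j, []).append((x, y))
    xs, ys = [], []
    for rows in clusters.values():
        mean_x = sum(x for x, _ in rows) / len(rows)
        mean_y = sum(y for _, y in rows) / len(rows)
        for x, y in rows:
            xs.append(x - mean_x)
            ys.append(y - mean_y)
    return slope(xs, ys)


if __name__ == "__main__":
    data = simulate()
    print("pooled slope:", round(pooled_slope(data), 3))
    print("within-cluster slope:", round(within_slope(data), 3))
```

A large gap between the two slopes is the kind of discrepancy that Hausman-type specification tests formalize. The sketch only reports the two estimates and does not compute a test statistic.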

Type
Original Paper
Copyright
Copyright © 2007 The Psychometric Society

Footnotes

This research was supported by the National Academy of Education/Spencer Foundation and the National Science Foundation, Grant Number SES-0436274.
