Investigating the Performance of Alternate Regression Weights by Studying All Possible Criteria in Regression Models with a Fixed Set of Predictors

Niels Waller; Jeff Jones

doi:10.1007/s11336-011-9209-5

Published online by Cambridge University Press: 01 January 2025

Niels Waller*, University of Minnesota
Jeff Jones, University of Minnesota
* Requests for reprints should be sent to Niels Waller, Department of Psychology, University of Minnesota, N218 Elliott Hall, Minneapolis, MN 55455, USA. E-mail: nwaller@umn.edu

Abstract

We describe methods for assessing all possible criteria (i.e., dependent variables) and subsets of criteria for regression models with a fixed set of predictors, x (where x is an n×1 vector of independent variables). Our methods build upon the geometry of regression coefficients (hereafter called regression weights) in n-dimensional space. For a full-rank predictor correlation matrix, Rxx, of order n, and for regression models with constant R² (coefficient of determination), the OLS weight vectors for all possible criteria terminate on the surface of an n-dimensional ellipsoid. The population performance of alternate regression weights (such as equal weights, correlation weights, or rounded weights) can be modeled as a function of the Cartesian coordinates of the ellipsoid. These geometrical notions can be easily extended to assess the sampling performance of alternate regression weights in models with either fixed or random predictors and for models with any value of R². To illustrate these ideas, we describe algorithms and R (R Development Core Team, 2009) code for: (1) generating points that are uniformly distributed on the surface of an n-dimensional ellipsoid, (2) populating the set of regression (weight) vectors that define an elliptical arc in ℝⁿ, and (3) populating the set of regression vectors that have constant cosine with a target vector in ℝⁿ. Each algorithm is illustrated with real data. The examples demonstrate the usefulness of studying all possible criteria when evaluating alternate regression weights in regression models with a fixed set of predictors.
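
The ellipsoid geometry described above can be illustrated with a minimal sketch. It is not the authors' R code and does not reproduce their algorithms: it assumes standardized variables, so that a weight vector b satisfies b'Rxx b = R², and it simply rescales a random Gaussian direction onto that surface, which does not yield points uniformly distributed on the ellipsoid. The function names and the 3×3 correlation matrix are hypothetical, chosen only for illustration.

```python
import math
import random


def quad_form(v, m):
    """Return the quadratic form v' M v for a vector v and a square matrix m."""
    n = len(v)
    return sum(v[i] * m[i][j] * v[j] for i in range(n) for j in range(n))


def mat_vec(m, v):
    """Return the matrix-vector product M v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def random_weights_on_ellipsoid(rxx, r2, rng):
    """Rescale a random direction so that b' Rxx b equals r2.

    Assumption: this gives a point ON the ellipsoid, but the points are
    not uniformly distributed over its surface.
    """
    z = [rng.gauss(0.0, 1.0) for _ in rxx]
    scale = math.sqrt(r2 / quad_form(z, rxx))
    return [scale * zi for zi in z]


def validity_of_alternate_weights(a, b, rxx):
    """Population correlation between the composite a'x and the criterion
    whose standardized OLS weights are b: a' Rxx b / sqrt(a' Rxx a)."""
    rxy = mat_vec(rxx, b)  # predictor-criterion correlations implied by b
    return sum(ai * ri for ai, ri in zip(a, rxy)) / math.sqrt(quad_form(a, rxx))


if __name__ == "__main__":
    # Hypothetical predictor correlation matrix (illustrative values only).
    rxx = [[1.0, 0.3, 0.2],
           [0.3, 1.0, 0.4],
           [0.2, 0.4, 1.0]]
    rng = random.Random(2011)
    equal_weights = [1.0, 1.0, 1.0]
    for _ in range(5):
        b = random_weights_on_ellipsoid(rxx, 0.5, rng)
        print(b, validity_of_alternate_weights(equal_weights, b, rxx))
```

Each draw corresponds to one possible criterion with the same R². The printed correlation shows how well equal weights would predict that criterion, and it can never exceed the square root of R² in absolute value.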

Keywords

Monte Carlo; multiple regression; weighting

Information

Type: Original Paper
Information: Psychometrika, Volume 76, Issue 3, July 2011, pp. 410–439

DOI: https://doi.org/10.1007/s11336-011-9209-5
Copyright: Copyright © 2011 The Psychometric Society

References

Ackerman, P.L., Heggestad, E.D. (1997). Intelligence, personality, and interests: Evidence for overlapping traits. Psychological Bulletin, 121, 219–245.

Adams, C., Thompson, A., Hass, J. (2001). How to ace the rest of calculus: the streetwise guide, including multivariable calculus, New York: Freeman.

Bentler, P. (1980). Multivariate analysis with latent variables: causal modeling. Annual Review of Psychology, 31, 419–456.

Bertrand, P.V., Holder, R.L. (1988). A quirk in multiple regression: the whole regression can be greater than the sum of its parts. The Statistician, 37, 371–374.

Borovkov, K. (1994). On simulation of random vectors with given densities in regions and on their boundaries. Journal of Applied Probability, 31, 205–220.

Briel, J.B., O’Neill, K., Scheuneman, J.D. (1993). GRE Technical Manual, Princeton: Educational Testing Service.

Campbell, J. (1974). A Monte Carlo approach to some problems inherent in multivariate prediction: with special reference to multiple regression. Personnel training research program technical report 2002, Washington: Office of Naval Research.

Chen, T., Glotzer, S.C. (2007). Simulation studies of a phenomenological model for elongated virus capsid formation. Physical Review E, 75, 1–7.

Cheng, C.L., Van Ness, J.W. (1999). Statistical regression with measurement error, London: Arnold.

Claudy, J.G. (1972). A comparison of five variable weighting procedures. Educational and Psychological Measurement, 32, 311–322.

Conger, A.J. (1974). A revised definition for suppressor variables: a guide to their identification and interpretation. Educational and Psychological Measurement, 34, 35–46.

Cronbach, L.J., Meehl, P.E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281–302.

Dana, J., Dawes, R.M. (2004). The superiority of simple alternatives to regression for social science predictions. Journal of Educational and Behavioral Statistics, 29, 317–331.

Davis-Stober, C.P., Dana, J., Budescu, D.V. (2010). A constrained linear estimator for multiple regression. Psychometrika, 75, 521–541.

Derksen, S., Keselman, H. (1992). Backward, forward and stepwise automated subset selection algorithms: frequency of obtaining authentic and noise variables. British Journal of Mathematical & Statistical Psychology, 45, 265–282.

Dorans, N., Drasgow, F. (1978). Alternative weight schemes for linear prediction. Organizational Behavior and Human Performance, 21, 316–345.

Dunnette, M.D., Borman, W.C. (1979). Personnel selection and classification systems. Annual Review of Psychology, 30, 477–525.

Einhorn, H.J., Hogarth, R.M. (1975). Unit weighting schemes for decision making. Organizational Behavior and Human Performance, 13, 171–192.

Fishman, G.S. (1996). Monte Carlo: concepts, algorithms, and applications, New York: Springer.

Gignac, G.E., Stough, C., Loukomitis, S. (2004). Openness, intelligence, and self-report intelligence. Intelligence, 32, 133–143.

Green, B.F. (1977). Parameter sensitivity in multivariate methods. Multivariate Behavioral Research, 12, 263–287.

Grove, W.M., Meehl, P.E. (1996). Comparative efficiency of informal (subjective, impressionistic) and formal (mechanical, algorithmic) prediction procedures: The clinical-statistical controversy. Psychology, Public Policy, and Law, 2, 293–323.

Hacking, I. (1971). Jacques Bernoulli’s art of conjecturing. British Journal for the Philosophy of Science, 22, 209–229.

Hadi, A. (1996). Matrix algebra as a tool, Belmont: Duxbury Press.

Hamilton, D. (1987). Sometimes $R^{2} > r_{yx_{1}}^{2} + r_{yx_{2}}^{2}$: correlated variables are not always redundant. The American Statistician, 41, 129–132.

Horst, P. (1941). The role of prediction variables which are independent of the criterion. The prediction of personal adjustment, 431–436.

Hunter, J.E., Hunter, R.F. (1984). Validity and utility of alternative predictors of job performance. Psychological Bulletin, 96, 72–98.

Igathinathane, C., Chattopadhyay, P. (1998). Numerical techniques for estimating the surface areas of ellipsoids representing food materials. Journal of Agricultural Engineering Research, 70, 313–322.

Keller, S. (1979). On the surface area of the ellipsoid. Mathematics of Computation, 80, 310–314.

Keren, G., Newman, J.R. (1978). Additional considerations with regard to multiple regression and equal weighting. Organizational Behavior and Human Performance, 22, 143–164.

Keynes, J.M. (1921). A treatise on probability, London: Macmillan.

Kronmal, R., Peterson, A. Jr (1981). A variant of the acceptance-rejection method for computer generation of random variables. Journal of the American Statistical Association, 76, 446–451.

Kuncel, N.R., Hezlett, S.A., Ones, D.S. (2001). A comprehensive meta-analysis of the predictive validity of the graduate record examinations: implications for graduate student selection and performance. Psychological Bulletin, 127, 162–181.

Laughlin, J.E. (1978). Comment on “Estimating coefficients in linear models. It don’t make no nevermind”. Psychological Bulletin, 85, 247–253.

Legendre, A. (1788). Sur les intégrales doubles.

Legendre, A. (1825). Traité des fonctions elliptiques et des intégrales euleriennes. Tome 1, Paris: Huzard-Courchier.

Loevinger, J. (1957). Objective tests as instruments of psychological theory. Psychological Reports, 3, 635–694.

Marsaglia, G. (1972). Choosing a point from the surface of a sphere. Annals of Mathematical Statistics, 43, 645–646.

McCrae, R.R., Costa, P.T. (1997). Conceptions and correlates of openness to experience. In Hogan, R., Johnson, J., Briggs, S. (Eds.), Handbook of personality (pp. 825–847). San Diego: Academic Press.

Meehl, P.E. (1954). Clinical versus statistical prediction: A theoretical analysis and a review of the evidence, Minneapolis: University of Minnesota Press.

Muller, M.E. (1959). A note on a method for generating points uniformly on n-dimensional spheres. Communications of the ACM, 2, 19–20.

Pruzek, R.M., Fredrick, B.C. (1978). Weighting procedures in linear models: alternatives to least squares and limitations of equal weights. Psychological Bulletin, 85, 254–266.

R Development Core Team (2009). R: a language and environment for statistical computing, Vienna: R Foundation for Statistical Computing. URL http://www.R-project.org.

Raju, N.S., Bilgic, R., Edwards, J.E., Fleer, P.F. (1997). Methodology review: estimation of population validity and cross-validity, and the use of equal weights in prediction. Applied Psychological Measurement, 21, 291–305.

Raju, N.S., Bilgic, R., Edwards, J.E., Fleer, P.F. (1999). Accuracy of population validity and cross-validity estimation: an empirical comparison of formula-based, traditional empirical, and equal weights procedures. Applied Psychological Measurement, 23, 99–115.

Ree, M.J., Carretta, T.R., Earles, J.A. (1998). In top-down decisions, weighting variables does not matter: a consequence of Wilks’ theorem. Organizational Research Methods, 1, 407–420.

Rubin, H. (2007). Note retrieved from the Internet site: http://groups.google.com/group/sci.math/msg/24356105b2f3bc41?Hal=en. Retrieved September 15, 2009.

Rubinstein, R. (1982). Generating random vectors uniformly distributed inside and on the surface of different regions. European Journal of Operational Research, 10, 205–209.

Rubinstein, R.Y. (1986). Monte Carlo optimization, simulation and sensitivity of queueing networks, New York: Wiley.

Rubinstein, R.Y., Kroese, D.P. (2007). Simulation and the Monte Carlo method, Hoboken: Wiley-Interscience.

Schmidt, F.L. (1971). The relative efficiency of regression and simple unit predictor weights in applied differential psychology. Educational and Psychological Measurement, 31, 699–714.

Stewart, J. (2003). Calculus: early transcendentals, Belmont: Wadsworth Publishing.

Tatsuoka, M.M. (1988). Multivariate analysis (2nd ed.). New York: Wiley.

Tee, G. (2005). Surface area and capacity of ellipsoids in n dimensions. New Zealand Journal of Mathematics, 34, 165–198.

Tee, G. (2006). Surface area and surface integrals on ellipsoid segments. Unpublished manuscript.

Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 267–288.

Tulsky, D., Zhu, J., Ledbetter, M. (1997). WAIS-III: WMS-III technical manual, San Antonio: Harcourt Brace.

Von Neumann, J. (1951). Various techniques used in connection with random digits. Monte Carlo methods. National Bureau of Standards. Applied Mathematics Series, 12, 36–38.

Wainer, H.H. (1976). Estimating coefficients in linear models: It don’t make no nevermind. Psychological Bulletin, 83, 312–317.

Wainer, H.H. (1978). On the sensitivity of regression and regressors. Psychological Bulletin, 85, 267–273.

Waller, N., Jones, J. (2010). Correlation weights in multiple regression. Psychometrika, 75, 58–69.

Williamson, J.F. (1987). Random selection of points distributed on curved surfaces. Physics in Medicine and Biology, 32, 1311–1319.
