
Investigating the Performance of Alternate Regression Weights by Studying All Possible Criteria in Regression Models with a Fixed Set of Predictors

Published online by Cambridge University Press:  01 January 2025

Niels Waller*
Affiliation:
University of Minnesota
Jeff Jones
Affiliation:
University of Minnesota
* Requests for reprints should be sent to Niels Waller, Department of Psychology, University of Minnesota, N218 Elliott Hall, Minneapolis, MN 55455, USA. E-mail: nwaller@umn.edu

Abstract

We describe methods for assessing all possible criteria (i.e., dependent variables) and subsets of criteria for regression models with a fixed set of predictors, $\mathbf{x}$ (where $\mathbf{x}$ is an $n \times 1$ vector of independent variables). Our methods build upon the geometry of regression coefficients (hereafter called regression weights) in $n$-dimensional space. For a full-rank predictor correlation matrix, $\mathbf{R}_{xx}$, of order $n$, and for regression models with constant $R^{2}$ (coefficient of determination), the OLS weight vectors for all possible criteria terminate on the surface of an $n$-dimensional ellipsoid. The population performance of alternate regression weights (such as equal weights, correlation weights, or rounded weights) can be modeled as a function of the Cartesian coordinates of the ellipsoid. These geometrical notions can be easily extended to assess the sampling performance of alternate regression weights in models with either fixed or random predictors and for models with any value of $R^{2}$. To illustrate these ideas, we describe algorithms and R (R Development Core Team, 2009) code for: (1) generating points that are uniformly distributed on the surface of an $n$-dimensional ellipsoid, (2) populating the set of regression (weight) vectors that define an elliptical arc in $\mathbb{R}^{n}$, and (3) populating the set of regression vectors that have constant cosine with a target vector in $\mathbb{R}^{n}$. Each algorithm is illustrated with real data. The examples demonstrate the usefulness of studying all possible criteria when evaluating alternate regression weights in regression models with a fixed set of predictors.
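The geometry described in the abstract can be illustrated with a minimal sketch. It assumes standardized variables, so that a weight vector b with b'R_xx b = R² implies predictor-criterion correlations r_xy = R_xx b, and an alternate weight vector a has squared validity (a'r_xy)² / (a'R_xx a). The article's own code is written in R and is not reproduced here; this sketch uses Python, and the function names, the 3 × 3 correlation matrix, and the value of R² are all made up for illustration. Note also that it rescales random Gaussian directions onto the ellipsoid, which is a simpler scheme than the uniform-on-the-surface sampling the abstract refers to and does not give points that are uniform with respect to surface area.

```python
import math
import random


def mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(u_i * v_i for u_i, v_i in zip(u, v))


def weights_on_ellipsoid(r_xx, r2, rng):
    """Return one weight vector b satisfying b' R_xx b = r2.

    A random Gaussian direction is rescaled onto the ellipsoid. This is an
    illustrative shortcut and is NOT uniform over the ellipsoid's surface.
    """
    u = [rng.gauss(0.0, 1.0) for _ in r_xx]
    scale = math.sqrt(r2 / dot(u, mat_vec(r_xx, u)))
    return [scale * u_i for u_i in u]


def r2_of_alternate_weights(a, b, r_xx):
    """Squared correlation between a'x and the criterion implied by OLS weights b."""
    r_xy = mat_vec(r_xx, b)  # predictor-criterion correlations implied by b
    return dot(a, r_xy) ** 2 / dot(a, mat_vec(r_xx, a))


# Hypothetical positive-definite predictor correlation matrix and model R^2.
R_XX = [
    [1.0, 0.3, 0.2],
    [0.3, 1.0, 0.4],
    [0.2, 0.4, 1.0],
]
R2 = 0.5

rng = random.Random(2011)  # fixed seed so the run is repeatable
criteria = [weights_on_ellipsoid(R_XX, R2, rng) for _ in range(1000)]

# Performance of equal weights across the sampled criteria.
equal = [1.0, 1.0, 1.0]
equal_r2 = [r2_of_alternate_weights(equal, b, R_XX) for b in criteria]
print("equal-weights R^2 ranges from", min(equal_r2), "to", max(equal_r2))
```

Each sampled b stands for one possible criterion at the chosen R², so the spread of `equal_r2` gives a rough picture of how equal weights fare across criteria. By the Cauchy-Schwarz inequality, no alternate weight vector can exceed the OLS value of R² in the population.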

Type
Original Paper
Copyright
Copyright © 2011 The Psychometric Society

References

Ackerman, P.L., Heggestad, E.D. (1997). Intelligence, personality, and interests: Evidence for overlapping traits. Psychological Bulletin, 121, 219–245.
Adams, C., Thompson, A., Hass, J. (2001). How to ace the rest of calculus: the streetwise guide, including multivariable calculus, New York: Freeman.
Bentler, P. (1980). Multivariate analysis with latent variables: causal modeling. Annual Review of Psychology, 31, 419–456.
Bertrand, P.V., Holder, R.L. (1988). A quirk in multiple regression: the whole regression can be greater than the sum of its parts. The Statistician, 37, 371–374.
Borovkov, K. (1994). On simulation of random vectors with given densities in regions and on their boundaries. Journal of Applied Probability, 31, 205–220.
Briel, J.B., O’Neill, K., Scheuneman, J.D. (1993). GRE Technical Manual, Princeton: Educational Testing Service.
Campbell, J. (1974). A Monte Carlo approach to some problems inherent in multivariate prediction: with special reference to multiple regression. Personnel training research program technical report 2002, Washington: Office of Naval Research.
Chen, T., Glotzer, S.C. (2007). Simulation studies of a phenomenological model for elongated virus capsid formation. Physical Review E, 75, 1–7.
Cheng, C.L., Van Ness, J.W. (1999). Statistical regression with measurement error, London: Arnold.
Claudy, J.G. (1972). A comparison of five variable weighting procedures. Educational and Psychological Measurement, 32, 311–322.
Conger, A.J. (1974). A revised definition for suppressor variables: a guide to their identification and interpretation. Educational and Psychological Measurement, 34, 35–46.
Cronbach, L.J., Meehl, P.E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281–302.
Dana, J., Dawes, R.M. (2004). The superiority of simple alternatives to regression for social science predictions. Journal of Educational and Behavioral Statistics, 29, 317–331.
Davis-Stober, C.P., Dana, J., Budescu, D.V. (2010). A constrained linear estimator for multiple regression. Psychometrika, 75, 521–541.
Derksen, S., Keselman, H. (1992). Backward, forward and stepwise automated subset selection algorithms: frequency of obtaining authentic and noise variables. British Journal of Mathematical & Statistical Psychology, 45, 265–282.
Dorans, N., Drasgow, F. (1978). Alternative weight schemes for linear prediction. Organizational Behavior and Human Performance, 21, 316–345.
Dunnette, M.D., Borman, W.C. (1979). Personnel selection and classification systems. Annual Review of Psychology, 30, 477–525.
Einhorn, H.J., Hogarth, R.M. (1975). Unit weighting schemes for decision making. Organizational Behavior and Human Performance, 13, 171–192.
Fishman, G.S. (1996). Monte Carlo: concepts, algorithms, and applications, New York: Springer.
Gignac, G.E., Stough, C., Loukomitis, S. (2004). Openness, intelligence, and self-report intelligence. Intelligence, 32, 133–143.
Green, B.F. (1977). Parameter sensitivity in multivariate methods. Multivariate Behavioral Research, 12, 263–287.
Grove, W.M., Meehl, P.E. (1996). Comparative efficiency of informal (subjective, impressionistic) and formal (mechanical, algorithmic) prediction procedures: The clinical-statistical controversy. Psychology, Public Policy, and Law, 2, 293–323.
Hacking, I. (1971). Jacques Bernoulli’s art of conjecturing. British Journal for the Philosophy of Science, 22, 209–229.
Hadi, A. (1996). Matrix algebra as a tool, Belmont: Duxbury Press.
Hamilton, D. (1987). Sometimes $R^{2} > r_{yx_{1}}^{2} + r_{yx_{2}}^{2}$: correlated variables are not always redundant. The American Statistician, 41, 129–132.
Horst, P. (1941). The role of prediction variables which are independent of the criterion. The prediction of personal adjustment, 431–436.
Hunter, J.E., Hunter, R.F. (1984). Validity and utility of alternative predictors of job performance. Psychological Bulletin, 96, 72–98.
Igathinathane, C., Chattopadhyay, P. (1998). Numerical techniques for estimating the surface areas of ellipsoids representing food materials. Journal of Agricultural Engineering Research, 70, 313–322.
Keller, S. (1979). On the surface area of the ellipsoid. Mathematics of Computation, 80, 310–314.
Keren, G., Newman, J.R. (1978). Additional considerations with regard to multiple regression and equal weighting. Organizational Behavior and Human Performance, 22, 143–164.
Keynes, J.M. (1921). A treatise on probability, London: Macmillan.
Kronmal, R., Peterson, A. Jr (1981). A variant of the acceptance-rejection method for computer generation of random variables. Journal of the American Statistical Association, 76, 446–451.
Kuncel, N.R., Hezlett, S.A., Ones, D.S. (2001). A comprehensive meta-analysis of the predictive validity of the graduate record examinations: implications for graduate student selection and performance. Psychological Bulletin, 127, 162–181.
Laughlin, J.E. (1978). Comment on “Estimating coefficients in linear models. It don’t make no nevermind”. Psychological Bulletin, 85, 247–253.
Legendre, A. (1788). Sur les intégrales doubles.
Legendre, A. (1825). Traité des fonctions elliptiques et des intégrales euleriennes. Tome 1, Paris: Huzard-Courcier.
Loevinger, J. (1957). Objective tests as instruments of psychological theory. Psychological Reports, 3, 635–694.
Marsaglia, G. (1972). Choosing a point from the surface of a sphere. Annals of Mathematical Statistics, 43, 645–646.
McCrae, R.R., Costa, P.T. (1997). Conceptions and correlates of openness to experience. In Hogan, R., Johnson, J., Briggs, S. (Eds.), Handbook of personality (pp. 825–847). San Diego: Academic Press.
Meehl, P.E. (1954). Clinical versus statistical prediction: A theoretical analysis and a review of the evidence, Minneapolis: University of Minnesota Press.
Muller, M.E. (1959). A note on a method for generating points uniformly on n-dimensional spheres. Communications of the ACM, 2, 19–20.
Pruzek, R.M., Fredrick, B.C. (1978). Weighting procedures in linear models: alternatives to least squares and limitations of equal weights. Psychological Bulletin, 85, 254–266.
R Development Core Team (2009). R: a language and environment for statistical computing, Vienna: R Foundation for Statistical Computing. URL http://www.R-project.org.
Raju, N.S., Bilgic, R., Edwards, J.E., Fleer, P.F. (1997). Methodology review: estimation of population validity and cross-validity, and the use of equal weights in prediction. Applied Psychological Measurement, 21, 291–305.
Raju, N.S., Bilgic, R., Edwards, J.E., Fleer, P.F. (1999). Accuracy of population validity and cross-validity estimation: an empirical comparison of formula-based, traditional empirical, and equal weights procedures. Applied Psychological Measurement, 23, 99–115.
Ree, M.J., Carretta, T.R., Earles, J.A. (1998). In top-down decisions, weighting variables does not matter: a consequence of Wilks’ theorem. Organizational Research Methods, 1, 407–420.
Rubin, H. (2007). Note retrieved from the Internet site: http://groups.google.com/group/sci.math/msg/24356105b2f3bc41?Hal=en. Retrieved September 15, 2009.
Rubinstein, R. (1982). Generating random vectors uniformly distributed inside and on the surface of different regions. European Journal of Operational Research, 10, 205–209.
Rubinstein, R.Y. (1986). Monte Carlo optimization, simulation and sensitivity of queueing networks, New York: Wiley.
Rubinstein, R.Y., Kroese, D.P. (2007). Simulation and the Monte Carlo method, Hoboken: Wiley-Interscience.
Schmidt, F.L. (1971). The relative efficiency of regression and simple unit predictor weights in applied differential psychology. Educational and Psychological Measurement, 31, 699–714.
Stewart, J. (2003). Calculus: early transcendentals, Belmont: Wadsworth Publishing.
Tatsuoka, M.M. (1988). Multivariate analysis (2nd ed.). New York: Wiley.
Tee, G. (2005). Surface area and capacity of ellipsoids in n dimensions. New Zealand Journal of Mathematics, 34, 165–198.
Tee, G. (2006). Surface area and surface integrals on ellipsoid segments. Unpublished manuscript.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 267–288.
Tulsky, D., Zhu, J., Ledbetter, M. (1997). WAIS-III: WMS-III technical manual, San Antonio: Harcourt Brace.
Von Neumann, J. (1951). Various techniques used in connection with random digits. Monte Carlo methods. National Bureau of Standards. Applied Mathematics Series, 12, 36–38.
Wainer, H.H. (1976). Estimating coefficients in linear models: It don’t make no nevermind. Psychological Bulletin, 83, 312–317.
Wainer, H.H. (1978). On the sensitivity of regression and regressors. Psychological Bulletin, 85, 267–273.
Waller, N., Jones, J. (2010). Correlation weights in multiple regression. Psychometrika, 75, 58–69.
Williamson, J.F. (1987). Random selection of points distributed on curved surfaces. Physics in Medicine and Biology, 32, 1311–1319.