
Rating Scales as Predictors—The Old Question of Scale Level and Some Answers

Published online by Cambridge University Press:  01 January 2025

Gerhard Tutz*
Affiliation: Ludwig-Maximilians-Universität Munich
Jan Gertheiss
Affiliation: Georg-August-Universität Göttingen
* Requests for reprints should be sent to Gerhard Tutz, Department of Statistics, Ludwig-Maximilians-Universität Munich, Munich, Germany. E-mail: gerhard.tutz@stat.uni-muenchen.de

Abstract

Rating scales used as predictors in regression models are typically either treated as metrically scaled variables or coded as dummy variables. The first approach implies a scale level that is not justified; the second results in a large number of parameters to be estimated. Therefore, when rating scales are dummy-coded, applications are often restricted to a few predictors. The penalization approach advocated here takes the scale level seriously by using only the ordering of categories, yet it is shown to work in the high-dimensional case. We consider the proper modeling of rating scales as predictors, together with selection procedures based on penalization methods that are tailored to ordinal predictors. In addition to the selection of predictors, the clustering of categories is investigated. Existing methodology is extended to the wider class of generalized linear models. Moreover, higher-order differences that allow shrinkage towards a polynomial, as well as monotonicity constraints and alternative penalties, are introduced. The proposed penalization approaches are illustrated using the Motivational States Questionnaire.
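
The core idea of penalizing differences between adjacent categories can be illustrated with a minimal sketch. It assumes a Gaussian response, a single ordinal predictor with levels 1 to k (level 1 as reference), and a quadratic penalty on differences of the dummy coefficients. The function names (`dummy_code`, `difference_matrix`, `fit_penalized`) and the tuning parameter `lam` are hypothetical choices for illustration; this is not the authors' implementation, and it covers only smoothing, not selection or clustering of categories.

```python
import numpy as np

def dummy_code(x, n_levels):
    """Dummy-code an ordinal predictor with levels 1..n_levels (level 1 = reference)."""
    x = np.asarray(x)
    Z = np.zeros((x.size, n_levels - 1))
    for level in range(2, n_levels + 1):
        Z[:, level - 2] = (x == level)
    return Z

def difference_matrix(n_levels, order=1):
    """Differences of the coefficient vector (0, beta_2, ..., beta_k) of the given order."""
    D = np.diff(np.eye(n_levels), n=order, axis=0)  # acts on all k coefficients
    return D[:, 1:]  # drop the column of the reference category, fixed at 0

def fit_penalized(x, y, n_levels, lam, order=1):
    """Penalized least squares: intercept unpenalized, dummy coefficients penalized."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(y.size), dummy_code(x, n_levels)])
    D = difference_matrix(n_levels, order)
    P = np.zeros((n_levels, n_levels))
    P[1:, 1:] = D.T @ D  # assumed quadratic difference penalty
    return np.linalg.solve(X.T @ X + lam * P, X.T @ y)

# Simulated five-point rating scale with a roughly linear effect (illustrative data only)
rng = np.random.default_rng(0)
x = rng.integers(1, 6, size=200)
y = 0.5 * x + rng.normal(scale=0.1, size=200)

coef_dummy = fit_penalized(x, y, 5, lam=0.0)             # plain dummy coding
coef_first = fit_penalized(x, y, 5, lam=1e8, order=1)    # strong first-order penalty
coef_second = fit_penalized(x, y, 5, lam=1e8, order=2)   # strong second-order penalty
```

In this sketch, `lam = 0` reproduces ordinary dummy coding, a very large `lam` with first-order differences shrinks all category effects towards zero, and a very large `lam` with second-order differences shrinks the effects towards a linear function of the category index, which corresponds to the shrinkage towards a polynomial mentioned above.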

Type
Original Paper
Copyright
Copyright © 2013 The Psychometric Society

