Prediction and Classification in Nonlinear Data Analysis: Something Old, Something New, Something Borrowed, Something Blue

Jacqueline J. Meulman

doi:10.1007/BF02295607

Published online by Cambridge University Press: 01 January 2025

Affiliation: Jacqueline J. Meulman*, Leiden University

*Requests for reprints should be sent to Jacqueline J. Meulman, Data Theory Group, Department of Education, Leiden University, P.O. Box 9555, 2300 RB Leiden, THE NETHERLANDS. E-Mail: meulman@fsw.leidenuniv.nl

Abstract

Prediction and classification are two very active areas in modern data analysis. In this paper, prediction with nonlinear optimal scaling transformations of the variables is reviewed and extended to the use of multiple additive components, much in the spirit of statistical learning techniques that are currently popular in data mining, among other areas. Also, a classification/clustering method is described that is particularly suitable for analyzing attribute-value data from systems biology (genomics, proteomics, and metabolomics), and that is able to detect groups of objects that have similar values on small subsets of the attributes.
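As a rough illustration of what prediction with optimal scaling transformations can look like, the sketch below fits a regression in which each categorical predictor is replaced by standardized category quantifications, estimated by a generic alternating least squares loop. It is a minimal sketch under stated assumptions, not the algorithm of the paper: the function name `optimal_scaling_regression`, the nominal scaling level, the fixed iteration count, and the synthetic data are all hypothetical choices made for this example.

```python
import numpy as np


def optimal_scaling_regression(X, y, n_iter=20):
    """Illustrative ALS regression with nominal category quantifications.

    X : (n, p) array of integer category codes (assumed nominal).
    y : (n,) numeric response.
    Returns the quantified predictors Z, the weights b, and R^2.
    """
    n, p = X.shape
    y = (y - y.mean()) / y.std()  # standardize the response

    # Start from the standardized raw category codes.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    b = np.zeros(p)

    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: response minus all other predictors' terms.
            r = y - Z @ b + Z[:, j] * b[j]
            # Nominal quantification: each category gets the mean of the
            # partial residual over the objects in that category.
            _, inv = np.unique(X[:, j], return_inverse=True)
            means = np.bincount(inv, weights=r) / np.bincount(inv)
            z = means[inv]
            z = z - z.mean()
            sd = z.std()
            if sd > 0:
                Z[:, j] = z / sd  # keep quantifications standardized
            # Regression weight for the standardized quantified predictor.
            b[j] = Z[:, j] @ r / n

    r2 = 1.0 - np.mean((y - Z @ b) ** 2)
    return Z, b, r2


# Synthetic demo (assumed data): the response depends on the first
# predictor through a non-monotone function of its category code.
rng = np.random.default_rng(0)
n = 500
X = rng.integers(0, 5, size=(n, 2))
y = (X[:, 0] - 2.0) ** 2 + 0.1 * rng.standard_normal(n)

Z, b, r2 = optimal_scaling_regression(X, y)
linear_r2 = np.corrcoef(X[:, 0], y)[0, 1] ** 2
print(f"R^2 using raw codes of the first predictor: {linear_r2:.3f}")
print(f"R^2 after optimal scaling: {r2:.3f}")
```

In this toy setting the raw category codes are nearly uncorrelated with the response, while the quantified predictor recovers the U-shaped relation. Ordinal or spline scaling levels would need a restricted update (for example monotone regression) in place of the plain category means used here.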

Information

Type: 2003 Presidential Address
Information: Psychometrika, Volume 68, Issue 4, December 2003, pp. 493–517

DOI: https://doi.org/10.1007/BF02295607
Copyright: © 2003 The Psychometric Society

Footnotes

Special thanks are due to Brian Junker who gave me very helpful comments, and to Tim Null who made the printed version look as good as it does. Both waited patiently for me to finish, for which I'm forever grateful.

This article is based on the Presidential Address Jacqueline Meulman gave on July 9, 2003 at the 68th Annual Meeting of the Psychometric Society held near Cagliari, Italy on the island of Sardinia.—Editor
