LEARNING GRADIENTS FROM NONIDENTICAL DATA

Published online by Cambridge University Press:  07 March 2017

XUE-MEI DONG*
Affiliation:
School of Statistics and Mathematics, Zhejiang Gongshang University, Hangzhou, 310018, China email dongxuemei@zjgsu.edu.cn

Abstract

Selecting important variables and estimating coordinate covariation have received considerable attention in the current big data deluge. Previous work shows that the gradient of the regression function, the objective function in regression and classification problems, can provide both types of information. In this paper, an algorithm to learn this gradient function is proposed for nonidentical data. Under some mild assumptions on the data distribution and the model parameters, a result on its learning rate is established, which provides a theoretical guarantee for using this method in dynamical gene selection and in network security for recognition of malicious online attacks.

Type
Research Article
Copyright
© 2017 Australian Mathematical Society 
