Book contents
- Frontmatter
- Contents
- Preface
- Notation Used
- Abbreviations
- 1 Introduction
- 2 Basics
- 3 Probability Distributions
- 4 Statistical Inference
- 5 Linear Regression
- 6 Neural Networks
- 7 Non-linear Optimization
- 8 Learning and Generalization
- 9 Principal Components and Canonical Correlation
- 10 Unsupervised Learning
- 11 Time Series
- 12 Classification
- 13 Kernel Methods
- 14 Decision Trees, Random Forests and Boosting
- 15 Deep Learning
- 16 Forecast Verification and Post-processing
- 17 Merging of Machine Learning and Physics
- Appendices
- References
- Index
8 - Learning and Generalization
Published online by Cambridge University Press: 23 March 2023
Summary
A good model aims to learn the underlying signal without overfitting (i.e. fitting to the noise in the data). This chapter has four main parts. The first covers objective functions and errors. The second covers regularization techniques (weight penalty/decay, early stopping, ensembles, dropout, etc.) that prevent overfitting. The third covers the Bayesian approach to model selection and model averaging. The fourth covers recent developments in interpretable machine learning.
Chapter in Introduction to Environmental Data Science, pp. 245-282. Publisher: Cambridge University Press. Print publication year: 2023.