Book contents
- Frontmatter
- Contents
- Preface
- 1 Introduction
- 2 Foundations of Smooth Optimization
- 3 Descent Methods
- 4 Gradient Methods Using Momentum
- 5 Stochastic Gradient
- 6 Coordinate Descent
- 7 First-Order Methods for Constrained Optimization
- 8 Nonsmooth Functions and Subgradients
- 9 Nonsmooth Optimization Methods
- 10 Duality and Algorithms
- 11 Differentiation and Adjoints
- Appendix
- Bibliography
- Index
5 - Stochastic Gradient
Published online by Cambridge University Press: 31 March 2022
Summary
We describe the stochastic gradient method, the fundamental algorithm for several important problems in data science, including deep learning. We give several example problems for which this method is suitable, then describe its operation for the simple problem of computing the mean of a collection of values. We relate it to a classical method, the Kaczmarz method for solving a system of linear equalities and inequalities. Next, we describe the key assumptions to be used in convergence analysis, then describe the convergence rates attainable by several variants of stochastic gradient under several scenarios. Finally, we discuss several aspects of practical implementation of stochastic gradient, including minibatching and acceleration.
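The mean-computation example mentioned in the summary can be illustrated with a minimal sketch. It is not taken from the chapter; it assumes the objective f(x) = (1/(2n)) * sum_i (x - a_i)^2, a single in-order pass over the data, and a step size of 1/(k+1). The function name `sgd_mean` is hypothetical. Under these assumptions each iterate equals the running average of the values seen so far, so the final iterate is the mean.

```python
def sgd_mean(values):
    """One pass of stochastic gradient on f(x) = (1/(2n)) * sum_i (x - a_i)**2.

    Assumptions of this sketch (not from the chapter): the values are visited
    once, in order, and the step size at iteration k is 1 / (k + 1).
    """
    x = 0.0
    for k, a in enumerate(values):
        grad = x - a          # gradient of the single term 0.5 * (x - a)**2
        step = 1.0 / (k + 1)  # decreasing step size
        x -= step * grad      # x is now the average of values[0..k]
    return x


data = [2.0, 4.0, 6.0, 8.0]
estimate = sgd_mean(data)
print(estimate)
```

With a different step-size rule, or with sampling with replacement, the iterates would only approximate the mean, which is the kind of behavior a convergence analysis quantifies.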
- Type: Chapter
- Information: Optimization for Data Analysis, pp. 75-99. Publisher: Cambridge University Press. Print publication year: 2022.