Stochastic Gradient Methods
SGD convergence theory, variance reduction, and adaptive methods.
Recent technical contributions
Optimization Methods for Large-Scale Machine Learning (Bottou et al., 2018) is the primary reference for this area. It is a review rather than a single-result paper: it covers the convergence theory of the stochastic gradient method, techniques that reduce the noise in stochastic search directions, and methods that use second-order derivative approximations. Much downstream work builds on its framing.
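For orientation, the basic stochastic gradient iteration analyzed in this literature can be written as below. The notation is chosen here for illustration and may differ from any particular paper's symbols: $w_k$ is the iterate, $\alpha_k$ the step size, $\xi_k$ the random sample drawn at step $k$, $g$ the stochastic gradient, and $F$ the objective being minimized.

```latex
w_{k+1} = w_k - \alpha_k \, g(w_k, \xi_k),
\qquad
\mathbb{E}_{\xi_k}\!\left[ g(w_k, \xi_k) \right] = \nabla F(w_k)
```

The second relation is the usual unbiasedness assumption: on average, the stochastic direction agrees with the true gradient. Convergence results typically add a bound on the variance of $g$ and a condition on the step sizes $\alpha_k$.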
Supporting and adjacent work
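A number of supporting contributions sharpen specific aspects of stochastic gradient methods or connect them to neighbouring problems. A Stochastic Approximation Method (Robbins and Monro, 1951) is the foundational supporting reference: it introduced the stochastic approximation scheme from which SGD descends, together with diminishing step-size conditions (step sizes whose sum diverges while the sum of their squares stays finite) that still appear in convergence analyses.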
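A minimal sketch of a Robbins-Monro style iteration, assuming a toy problem that is not taken from the cited papers: minimizing f(w) = E[(w - X)^2] / 2, whose minimizer is the mean of X. The function name `robbins_monro_mean`, the Gaussian data, and the step-size choice a_k = 1 / (k + 1) are illustrative assumptions. This schedule satisfies the classical conditions, since the steps sum to infinity while their squares have a finite sum.

```python
import random


def robbins_monro_mean(samples, w0=0.0):
    """Run SGD on f(w) = E[(w - X)^2] / 2 with step sizes a_k = 1 / (k + 1)."""
    w = w0
    for k, x in enumerate(samples):
        step = 1.0 / (k + 1)  # diminishing step size
        grad = w - x          # unbiased stochastic gradient of f at w
        w -= step * grad
    return w


# Hypothetical data: noisy observations of an unknown mean (3.0 here).
rng = random.Random(0)
data = [rng.gauss(3.0, 1.0) for _ in range(10_000)]

estimate = robbins_monro_mean(data)
sample_mean = sum(data) / len(data)
print(estimate, sample_mean)
```

With this particular schedule the iterate after n steps equals the running average of the first n samples, so the recursion reproduces the sample mean up to floating-point rounding. Other schedules that meet the same conditions behave differently in the short run while still converging on this problem.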
Open methodological questions for stochastic gradient methods include narrowing the gap between convergence theory and computational practice, extending classical results to broader or more structured settings, and integrating the techniques surveyed above with adjacent mathematical disciplines. The references listed on this page are the entry points that current work builds on.
Prerequisites
Sources
- paper · primary · 2018 · bottou-2018, curtis-2018, nocedal-2018
This page was drafted by an agent and is waiting on expert review.