On Early Stopping in Gradient Descent Learning

Abstract

In this paper we study a family of gradient descent algorithms to approximate the regression function from reproducing kernel Hilbert spaces (RKHSs), the family being characterized by a polynomially decreasing rate of step sizes (or learning rate). By solving a bias-variance trade-off we obtain an early stopping rule and some probabilistic upper bounds for the convergence of the algorithms. We also discuss the implications of these results in the context of classification, where some fast convergence rates can be achieved for plug-in classifiers. Some connections are addressed with Boosting, Landweber iterations, and the online learning algorithms as stochastic approximations of the gradient descent method.
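
As a rough illustration of the kind of algorithm the abstract describes, the sketch below runs gradient descent on the empirical squared loss in an RKHS, with step sizes that decay polynomially in the iteration count, and keeps the iterate that performs best on held-out data. Everything specific here is an assumption made for the example rather than something taken from the paper: the Gaussian kernel and its width `sigma`, the schedule `gamma_t = 1 / (kappa_sq * (t + 1) ** theta)`, the function names, and above all the stopping criterion. The paper derives its stopping rule from a bias-variance trade-off; the hold-out check used below is a common practical stand-in, not that rule.

```python
import numpy as np


def gaussian_kernel(A, B, sigma=0.2):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def kernel_gd_early_stopping(X, y, X_val, y_val,
                             theta=0.25, max_iter=300, sigma=0.2):
    """Kernel gradient descent with hold-out early stopping (illustrative).

    The estimate is f_t = sum_i coef[i] * K(x_i, .), starting from f_0 = 0.
    Returns the coefficients, iteration index, and validation error of the
    iterate with the lowest validation mean squared error.
    """
    m = len(y)
    K = gaussian_kernel(X, X, sigma)          # training Gram matrix
    K_val = gaussian_kernel(X_val, X, sigma)  # validation vs. training points
    kappa_sq = 1.0                            # sup_x K(x, x) for this kernel

    coef = np.zeros(m)
    best_coef, best_t = coef.copy(), 0
    best_err = float(np.mean(y_val ** 2))     # error of the zero function

    for t in range(max_iter):
        # Assumed polynomially decreasing step size; theta = 0 is constant.
        gamma = 1.0 / (kappa_sq * (t + 1) ** theta)
        # Gradient step on the empirical squared loss, in coefficient form.
        coef = coef - (gamma / m) * (K @ coef - y)
        err = float(np.mean((K_val @ coef - y_val) ** 2))
        if err < best_err:
            best_coef, best_t, best_err = coef.copy(), t + 1, err

    return best_coef, best_t, best_err
```

Calling `kernel_gd_early_stopping(X, y, X_val, y_val)` with two-dimensional input arrays of shape `(n, d)` returns the selected coefficients together with the stopping iteration. Running too few iterations leaves the estimate biased toward zero, while running too many lets it fit the noise, which is the trade-off the abstract refers to.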

BibTeX key: Yao2007
entry type: article
year: 2007
month: aug
day: 01
journal: Constructive Approximation
number: 2
pages: 289--315
volume: 26
issn: 1432-0940
DOI: 10.1007/s00365-006-0663-2
url: https://doi.org/10.1007/s00365-006-0663-2
