Sign-based optimization methods have become popular in machine learning due
to their favorable communication cost in distributed optimization and their
surprisingly good performance in neural network training. Furthermore, they are
closely connected to so-called adaptive gradient methods like Adam. Recent
works on signSGD have used a non-standard "separable smoothness" assumption,
whereas some older works study sign gradient descent as steepest descent with
respect to the $\ell_\infty$-norm. In this work, we unify these existing
results by showing a close connection between separable smoothness and
$\ell_\infty$-smoothness and argue that the latter is the weaker and more
natural assumption. We then proceed to study the smoothness constant with
respect to the $\ell_\infty$-norm and thereby isolate geometric properties of
the objective function which affect the performance of sign-based methods. In
short, we find sign-based methods to be preferable over gradient descent if (i)
the Hessian is to some degree concentrated on its diagonal, and (ii) its
maximal eigenvalue is much larger than the average eigenvalue. Both properties
are common in deep networks.
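
For orientation, here is a minimal sketch (our notation, not taken from the paper) of the two objects the abstract contrasts: the sign gradient descent update, which is the steepest descent direction with respect to the $\ell_\infty$-norm, and one common definition of smoothness with respect to that norm (the dual norm on gradients is $\ell_1$):

\[
  x_{t+1} = x_t - \alpha \,\operatorname{sign}\bigl(\nabla f(x_t)\bigr),
  \qquad \alpha > 0,
\]
\[
  \lVert \nabla f(x) - \nabla f(y) \rVert_1 \;\le\; L_\infty \,\lVert x - y \rVert_\infty
  \quad \text{for all } x, y.
\]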
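As a rough numerical illustration of the abstract's two criteria (a hypothetical sketch of our own, not code from the paper), one can inspect a Hessian for (i) concentration on its diagonal and (ii) a maximal eigenvalue far above the average eigenvalue:

import numpy as np

def sign_gd_diagnostics(H):
    # Heuristic diagnostics (our sketch, not the paper's exact measures) for
    # when sign-based steps may beat gradient descent on a quadratic with
    # symmetric Hessian H.
    eigvals = np.linalg.eigvalsh(H)                          # eigenvalues, ascending
    spread = eigvals.max() / eigvals.mean()                  # (ii) lambda_max / lambda_avg
    diag_mass = np.abs(np.diag(H)).sum() / np.abs(H).sum()   # (i) share of mass on the diagonal
    return diag_mass, spread

# Example: an ill-conditioned, nearly diagonal Hessian; both criteria favor sign-based steps.
rng = np.random.default_rng(0)
d = 100
H = np.diag(np.logspace(0.0, 3.0, d)) + 1e-2 * rng.standard_normal((d, d))
H = (H + H.T) / 2.0                                          # symmetrize
print(sign_gd_diagnostics(H))                                # diag_mass near 1, spread well above 1
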
@article{balles2020geometry,
abstract = {Sign-based optimization methods have become popular in machine learning due
to their favorable communication cost in distributed optimization and their
surprisingly good performance in neural network training. Furthermore, they are
closely connected to so-called adaptive gradient methods like Adam. Recent
works on signSGD have used a non-standard "separable smoothness" assumption,
whereas some older works study sign gradient descent as steepest descent with
respect to the $\ell_\infty$-norm. In this work, we unify these existing
results by showing a close connection between separable smoothness and
$\ell_\infty$-smoothness and argue that the latter is the weaker and more
natural assumption. We then proceed to study the smoothness constant with
respect to the $\ell_\infty$-norm and thereby isolate geometric properties of
the objective function which affect the performance of sign-based methods. In
short, we find sign-based methods to be preferable over gradient descent if (i)
the Hessian is to some degree concentrated on its diagonal, and (ii) its
maximal eigenvalue is much larger than the average eigenvalue. Both properties
are common in deep networks.},
author = {Balles, Lukas and Pedregosa, Fabian and Le Roux, Nicolas},
keywords = {bias geometry optimization readings},
note = {arXiv:2002.08056},
title = {The Geometry of Sign Gradient Descent},
url = {http://arxiv.org/abs/2002.08056},
year = 2020
}