In 1988, Eric B. Baum showed that two-layers neural networks with threshold
activation function can perfectly memorize the binary labels of $n$ points in
general position in $\mathbb{R}^d$ using only $\ulcorner n/d \urcorner$
neurons. We observe that with ReLU networks, using four times as many neurons
one can fit arbitrary real labels. Moreover, for approximate memorization up to
error $\epsilon$, the neural tangent kernel can also memorize with only
$O\left(\frac{n}{d} \cdot \log(1/\epsilon) \right)$ neurons (assuming that the
data is well dispersed too). We show however that these constructions give rise
to networks where the magnitude of the neurons' weights are far from optimal.
In contrast we propose a new training procedure for ReLU networks, based on
complex (as opposed to real) recombination of the neurons, for which we show
approximate memorization with both $O\left(\frac{n}{d} \cdot
\frac{\log(1/\epsilon)}{\epsilon}\right)$ neurons, as well as nearly-optimal
size of the weights.
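For a rough sense of scale, the short sketch below (plain Python, purely illustrative) plugs hypothetical values of $n$, $d$ and $\epsilon$ into the four neuron counts quoted in the abstract; the $O(\cdot)$ constants are dropped, so the numbers are only orders of magnitude, not the paper's exact bounds.

import math

# Hypothetical problem size (not from the paper): n points in dimension d,
# approximate memorization up to error eps.
n, d, eps = 10_000, 100, 1e-3

baum_threshold = math.ceil(n / d)                   # Baum (1988): exact binary memorization
relu_exact     = 4 * math.ceil(n / d)               # ReLU, arbitrary real labels (four times as many)
ntk_approx     = (n / d) * math.log(1 / eps)        # NTK, error eps (O(.) constant omitted)
complex_recomb = (n / d) * math.log(1 / eps) / eps  # complex recombination, nearly-optimal weights

print(baum_threshold, relu_exact, round(ntk_approx), round(complex_recomb))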
Description
[2006.02855] Network size and weights size for memorization with two-layers neural networks
%0 Journal Article
%1 bubeck2020network
%A Bubeck, Sébastien
%A Eldan, Ronen
%A Lee, Yin Tat
%A Mikulincer, Dan
%D 2020
%K generalization memory readings
%T Network size and weights size for memorization with two-layers neural
networks
%U http://arxiv.org/abs/2006.02855
%X In 1988, Eric B. Baum showed that two-layers neural networks with threshold
activation function can perfectly memorize the binary labels of $n$ points in
general position in $\mathbb{R}^d$ using only $\ulcorner n/d \urcorner$
neurons. We observe that with ReLU networks, using four times as many neurons
one can fit arbitrary real labels. Moreover, for approximate memorization up to
error $\epsilon$, the neural tangent kernel can also memorize with only
$O\left(\frac{n}{d} \cdot \log(1/\epsilon) \right)$ neurons (assuming that the
data is well dispersed too). We show however that these constructions give rise
to networks where the magnitude of the neurons' weights are far from optimal.
In contrast we propose a new training procedure for ReLU networks, based on
complex (as opposed to real) recombination of the neurons, for which we show
approximate memorization with both $O\left(\frac{n}{d} \cdot
\frac{\log(1/\epsilon)}{\epsilon}\right)$ neurons, as well as nearly-optimal
size of the weights.
@article{bubeck2020network,
abstract = {In 1988, Eric B. Baum showed that two-layers neural networks with threshold
activation function can perfectly memorize the binary labels of $n$ points in
general position in $\mathbb{R}^d$ using only $\ulcorner n/d \urcorner$
neurons. We observe that with ReLU networks, using four times as many neurons
one can fit arbitrary real labels. Moreover, for approximate memorization up to
error $\epsilon$, the neural tangent kernel can also memorize with only
$O\left(\frac{n}{d} \cdot \log(1/\epsilon) \right)$ neurons (assuming that the
data is well dispersed too). We show however that these constructions give rise
to networks where the magnitude of the neurons' weights are far from optimal.
In contrast we propose a new training procedure for ReLU networks, based on
complex (as opposed to real) recombination of the neurons, for which we show
approximate memorization with both $O\left(\frac{n}{d} \cdot
\frac{\log(1/\epsilon)}{\epsilon}\right)$ neurons, as well as nearly-optimal
size of the weights.},
added-at = {2020-06-15T20:54:11.000+0200},
author = {Bubeck, Sébastien and Eldan, Ronen and Lee, Yin Tat and Mikulincer, Dan},
biburl = {https://www.bibsonomy.org/bibtex/274523bdd1d0f8779a6458d8eb040a8aa/kirk86},
description = {[2006.02855] Network size and weights size for memorization with two-layers neural networks},
interhash = {b45014e645587a0fb93879f1f096be7c},
intrahash = {74523bdd1d0f8779a6458d8eb040a8aa},
keywords = {generalization memory readings},
note = {cite arxiv:2006.02855; Comment: 27 pages},
timestamp = {2020-06-15T20:54:11.000+0200},
title = {Network size and weights size for memorization with two-layers neural
networks},
url = {http://arxiv.org/abs/2006.02855},
year = 2020
}