Rethinking generalization requires revisiting old ideas: statistical
mechanics approaches and complex learning behavior
Charles H. Martin and Michael W. Mahoney (2017). arXiv:1710.09553. Comment: 31 pages; added brief discussion of recent papers that use/extend these ideas.
Abstract
We describe an approach to understand the peculiar and counterintuitive
generalization properties of deep neural networks. The approach involves going
beyond worst-case theoretical capacity control frameworks that have been
popular in machine learning in recent years to revisit old ideas in the
statistical mechanics of neural networks. Within this approach, we present a
prototypical Very Simple Deep Learning (VSDL) model, whose behavior is
controlled by two control parameters, one describing an effective amount of
data, or load, on the network (that decreases when noise is added to the
input), and one with an effective temperature interpretation (that increases
when algorithms are early stopped). Using this model, we describe how a very
simple application of ideas from the statistical mechanics theory of
generalization provides a strong qualitative description of recently-observed
empirical results regarding the inability of deep neural networks not to
overfit training data, discontinuous learning and sharp transitions in the
generalization properties of learning algorithms, etc.
@article{martin2017rethinking,
abstract = {We describe an approach to understand the peculiar and counterintuitive
generalization properties of deep neural networks. The approach involves going
beyond worst-case theoretical capacity control frameworks that have been
popular in machine learning in recent years to revisit old ideas in the
statistical mechanics of neural networks. Within this approach, we present a
prototypical Very Simple Deep Learning (VSDL) model, whose behavior is
controlled by two control parameters, one describing an effective amount of
data, or load, on the network (that decreases when noise is added to the
input), and one with an effective temperature interpretation (that increases
when algorithms are early stopped). Using this model, we describe how a very
simple application of ideas from the statistical mechanics theory of
generalization provides a strong qualitative description of recently-observed
empirical results regarding the inability of deep neural networks not to
overfit training data, discontinuous learning and sharp transitions in the
generalization properties of learning algorithms, etc.},
added-at = {2019-10-08T16:21:36.000+0200},
author = {Martin, Charles H. and Mahoney, Michael W.},
biburl = {https://www.bibsonomy.org/bibtex/2daf76a2344c173b4cb553901e8c9e3e2/kirk86},
description = {[1710.09553] Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior},
interhash = {d09618b9609687bcaed1deb84aa4f9b8},
intrahash = {daf76a2344c173b4cb553901e8c9e3e2},
journal = {arXiv preprint arXiv:1710.09553},
keywords = {bounds deep-learning generalization learning readings theory},
note = {arXiv:1710.09553. Comment: 31 pages; added brief discussion of recent papers that use/extend these ideas},
timestamp = {2019-10-08T16:22:46.000+0200},
title = {Rethinking generalization requires revisiting old ideas: statistical
mechanics approaches and complex learning behavior},
url = {http://arxiv.org/abs/1710.09553},
year = 2017
}