Generative models for deep learning are promising both for improving our understanding of the model and for yielding training methods that require fewer labeled samples.
Recent works use generative models to produce the deep net's input given the value of a hidden layer several levels above. However, they come with no accompanying "proof of correctness" showing that the feedforward deep net is the correct inference method for recovering the hidden layer from the input. Furthermore, these models are complicated.
The current paper takes a more theoretical tack. It presents a very simple
generative model for ReLU deep nets, with the following characteristics: (i)
The generative model is just the reverse of the feedforward net: if the forward
transformation at a layer is $A$, then the reverse transformation is $A^T$.
(This can be seen as an explanation of the old weight-tying idea for denoising
autoencoders.) (ii) Its correctness can be proven under a clean theoretical
assumption: the edge weights in real-life deep nets behave like random numbers.
Under this assumption, which is experimentally tested on real-life nets like
AlexNet, it is formally proved that the feedforward net is a correct inference
method for recovering the hidden layer. A minimal numerical sketch of this
reversibility is given below.
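The following sketch, in Python/NumPy, is only an illustration of the reversibility claim and not the paper's exact construction: it assumes a single layer with Gaussian random weights $A$ (standing in for "random-like" trained weights), a sparse non-negative hidden vector $h$, and a heuristic $2/n$ rescaling, and checks that the feedforward map approximately recovers $h$ from the generated input $\mathrm{ReLU}(A^T h)$.

import numpy as np

rng = np.random.default_rng(0)

n, m, k = 4000, 500, 20            # input dim, hidden dim, active hidden units (illustrative sizes)
A = rng.standard_normal((m, n))    # stand-in for "random-like" trained weights

def relu(v):
    return np.maximum(v, 0.0)

h = np.zeros(m)                    # sparse, non-negative hidden layer
h[rng.choice(m, size=k, replace=False)] = rng.uniform(0.5, 1.5, size=k)

x = relu(A.T @ h)                  # generative direction: reverse transformation A^T
h_hat = relu((2.0 / n) * (A @ x))  # feedforward direction, with a heuristic 2/n rescaling

cos = h @ h_hat / (np.linalg.norm(h) * np.linalg.norm(h_hat))
print(f"cosine similarity between h and its reconstruction: {cos:.3f}")

For sizes like these the similarity should come out close to 1; it degrades as $h$ becomes denser relative to the input dimension, consistent with the intuition that sparse activations and random-like weights are what make the feedforward net a good inverse of the generative map.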
The generative model suggests a simple modification for training: use the
generative model to produce synthetic labeled data and include it in the
training set. Experiments are presented that support this theory of random-like
deep nets and show that the synthetic data helps training.
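As a rough illustration of this training modification, here is a hedged sketch in the same NumPy style. The helper name synthesize, the single-layer setting, the small perturbation of the hidden code, and all sizes are illustrative assumptions, not the paper's actual pipeline.

import numpy as np

def relu(v):
    return np.maximum(v, 0.0)

def synthesize(A, x, y, noise=0.05, rng=None):
    """Return a synthetic (input, label) pair generated from the labeled example (x, y).

    A is a (hidden_dim x input_dim) weight matrix; the inferred hidden code is
    perturbed slightly so the synthetic sample is not a mere copy of x.
    """
    rng = np.random.default_rng() if rng is None else rng
    h = relu(A @ x)                                       # feedforward inference of the hidden layer
    h = h * (1.0 + noise * rng.standard_normal(h.shape))  # small perturbation of the hidden code
    x_syn = relu(A.T @ h)                                 # generative (reverse) direction via A^T
    return x_syn, y                                       # the synthetic input keeps the original label

# Toy usage: add one synthetic sample per real sample to the training set.
rng = np.random.default_rng(0)
A = rng.standard_normal((64, 256)) / np.sqrt(256)
X = relu(rng.standard_normal((10, 256)))                  # 10 fake "inputs"
labels = rng.integers(0, 2, size=10)
augmented = [synthesize(A, x, y, rng=rng) for x, y in zip(X, labels)]

In a multi-layer net one would apply the corresponding reverse transformations layer by layer before appending the synthetic pairs to the training set.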
@article{arora2015reversible,
added-at = {2019-11-01T15:16:18.000+0100},
author = {Arora, Sanjeev and Liang, Yingyu and Ma, Tengyu},
biburl = {https://www.bibsonomy.org/bibtex/2ce748aaad609f36cf0985009054050dd/kirk86},
description = {[1511.05653] Why are deep nets reversible: A simple theory, with implications for training},
interhash = {dba8999be1b8967e11d73d2af43296a9},
intrahash = {ce748aaad609f36cf0985009054050dd},
keywords = {generalization learning readings theory},
note = {cite arxiv:1511.05653},
timestamp = {2019-11-01T15:16:18.000+0100},
title = {Why are deep nets reversible: A simple theory, with implications for training},
url = {http://arxiv.org/abs/1511.05653},
year = 2015
}