Abstract
Can multilayer neural networks -- typically constructed as highly complex
structures with many nonlinearly activated neurons across layers -- behave in a
non-trivial way that nevertheless simplifies away a major part of their
complexity? In this work, we uncover a phenomenon in which the behavior of
these complex networks -- under suitable scalings and stochastic gradient
descent dynamics -- becomes independent of the number of neurons as this number
grows sufficiently large. We develop a formalism in which this many-neuron
limiting behavior is captured by a set of equations, thereby exposing a
previously unknown operating regime of these networks. While the present
treatment is mathematically non-rigorous, it is complemented by several
experiments that validate the existence of this behavior.
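
To make the claimed width-independence concrete, here is a minimal sketch (not the authors' code; the two-layer architecture, tanh activation, synthetic teacher target, and all hyperparameters are illustrative assumptions). It trains a network whose output is averaged over neurons -- the mean-field scaling of 1/n rather than 1/sqrt(n) -- by online SGD, at several widths. Under this scaling, the loss trajectories should nearly coincide once the width is large, which is the phenomenon the abstract describes.

```python
# Minimal sketch (illustrative assumptions throughout, not the authors' setup):
# two-layer network f(x) = (1/n) * sum_i a_i * tanh(w_i . x) under mean-field
# scaling, trained by online SGD on a synthetic teacher-student regression task.
import numpy as np

rng = np.random.default_rng(0)
d = 10                       # input dimension (assumed)
w_star = rng.normal(size=d)  # teacher direction defining the target function

def target(x):
    return np.tanh(x @ w_star)

def train(n, steps=3000, lr=0.5):
    # i.i.d. initialization; one fresh sample per SGD step (online SGD)
    w = rng.normal(size=(n, d))
    a = rng.normal(size=n)
    losses = []
    for _ in range(steps):
        x = rng.normal(size=d)
        h = np.tanh(w @ x)
        y_hat = (a @ h) / n           # mean-field scaling: 1/n output average
        err = y_hat - target(x)
        losses.append(0.5 * err ** 2)
        # Gradients of the 1/n-scaled output, with the learning rate absorbing
        # the O(n) factor so each neuron's update is O(1) -- the standard
        # scaling under which the many-neuron limit is non-trivial.
        a -= lr * err * h
        w -= lr * err * a[:, None] * (1 - h ** 2)[:, None] * x[None, :]
    return np.array(losses)

# As the width grows, the training trajectories become (approximately)
# independent of the number of neurons:
for n in (100, 1000, 10000):
    losses = train(n)
    print(f"n={n:>6}  mean loss over last 500 steps: {losses[-500:].mean():.4f}")
```

In this sketch, the choice of the 1/n output scaling (with the learning rate rescaled accordingly) is what makes the per-neuron dynamics have an O(1) limit; with other scalings, the trajectories would not be expected to converge to a width-independent limit.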