Self-organizing neural systems based on predictive learning
R. Rao and T. Sejnowski. Philosophical Transactions of the Royal Society A: Mathematical,
Physical and Engineering Sciences, (2003)
Abstract
The ability to predict future events based on the past is an important
attribute of organisms that engage in adaptive behaviour. One prominent
computational method for learning to predict is called temporal-difference
(TD) learning. It is so named because it uses the difference between
successive predictions to learn to predict correctly. TD learning
is well suited to modelling the biological phenomenon of conditioning,
wherein an organism learns to predict a reward even though the reward
may occur later in time. We review a model for conditioning in bees
based on TD learning. The model illustrates how the TD-learning algorithm
allows an organism to learn an appropriate sequence of actions leading
up to a reward, based solely on reinforcement signals. The second
part of the paper describes how TD learning can be used at the cellular
level to model the recently discovered phenomenon of spike-timing-dependent
plasticity. Using a biophysical model of a neocortical neuron, we
demonstrate that the shape of the spike-timing-dependent learning
windows found in biology can be interpreted as a form of TD learning
occurring at the cellular level. We conclude by showing that such
spike-based TD-learning mechanisms can produce direction selectivity
in visual-motion-sensitive cells and can endow recurrent neocortical
circuits with the powerful ability to predict their inputs at the
millisecond time-scale.
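
The abstract's description of TD learning (using the difference between successive predictions to correct the earlier prediction) can be illustrated with a minimal tabular TD(0) sketch. This is a generic textbook form, not the bee-conditioning model or the biophysical neuron model from the paper. The function name `td0_train`, the fixed five-step trial, and all parameter values are assumptions chosen for illustration.

```python
def td0_train(n_steps=5, reward=1.0, alpha=0.1, gamma=1.0, n_trials=500):
    """Tabular TD(0) prediction over the time steps of a repeated trial.

    Hypothetical setup: each trial has n_steps time steps and a reward
    is delivered only on the final step. values[t] is the predicted
    (discounted) future reward at step t.
    """
    # One extra entry for the terminal state, whose value stays 0.
    values = [0.0] * (n_steps + 1)
    for _ in range(n_trials):
        for t in range(n_steps):
            r = reward if t == n_steps - 1 else 0.0
            # TD error: difference between successive predictions,
            # plus any reward received on this step.
            td_error = r + gamma * values[t + 1] - values[t]
            values[t] += alpha * td_error
    return values[:n_steps]


values = td0_train()
```

With `gamma=1.0` every step's prediction approaches the reward value over repeated trials, so the earliest step comes to predict a reward that arrives only later in the trial.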
%0 Journal Article
%1 Rao:2003
%A Rao, Rajesh P. N.
%A Sejnowski, Terrence J.
%D 2003
%J Philosophical Transactions of the Royal Society A: Mathematical,
Physical and Engineering Sciences
%K Cerebral Conditioning, Cortex, Neuroscience, Perception, Plasticity, Prediction Synaptic Visual
%P 1149-1175
%T Self-organizing neural systems based on predictive learning
%V 361
@article{Rao:2003,
abstract = {The ability to predict future events based on the past is an important
attribute of organisms that engage in adaptive behaviour. One prominent
computational method for learning to predict is called temporal-difference
(TD) learning. It is so named because it uses the difference between
successive predictions to learn to predict correctly. TD learning
is well suited to modelling the biological phenomenon of conditioning,
wherein an organism learns to predict a reward even though the reward
may occur later in time. We review a model for conditioning in bees
based on TD learning. The model illustrates how the TD-learning algorithm
allows an organism to learn an appropriate sequence of actions leading
up to a reward, based solely on reinforcement signals. The second
part of the paper describes how TD learning can be used at the cellular
level to model the recently discovered phenomenon of spike-timing-dependent
plasticity. Using a biophysical model of a neocortical neuron, we
demonstrate that the shape of the spike-timing-dependent learning
windows found in biology can be interpreted as a form of TD learning
occurring at the cellular level. We conclude by showing that such
spike-based TD-learning mechanisms can produce direction selectivity
in visual-motion-sensitive cells and can endow recurrent neocortical
circuits with the powerful ability to predict their inputs at the
millisecond time-scale.},
added-at = {2009-06-26T15:25:19.000+0200},
author = {Rao, Rajesh P. N. and Sejnowski, Terrence J.},
biburl = {https://www.bibsonomy.org/bibtex/2f34171d96b2d73c9a327417442f6637a/butz},
description = {diverse cognitive systems bib},
interhash = {3b8644d4f7ba44ffa7db0959fddf6ca1},
intrahash = {f34171d96b2d73c9a327417442f6637a},
journal = {Philosophical Transactions of the Royal Society A: Mathematical,
Physical and Engineering Sciences},
keywords = {Cerebral Conditioning, Cortex, Neuroscience, Perception, Plasticity, Prediction Synaptic Visual},
owner = {martin},
pages = {1149--1175},
timestamp = {2009-06-26T15:25:51.000+0200},
title = {Self-organizing neural systems based on predictive learning},
volume = {361},
year = 2003
}