This is CMSC389F, the University of Maryland's theoretical introduction to the art of reinforcement learning. An introductory course taught by Kevin Chen and Zack Khan, CMSC389F covers, in broad strokes, topics including Markov decision processes, Monte Carlo methods, policy gradient methods, exploration, and application to real environments.
In model-based reinforcement learning, generative and temporal models of environments can be leveraged to boost agent performance, either by tuning the agent's representations during training or via use as part of an explicit planning mechanism. However, their application in practice has been limited to simplistic environments, due to the difficulty of training such models in larger, potentially partially-observed and 3D environments. In this work we introduce a novel action-conditioned generative model of such challenging environments. The model features a non-parametric spatial memory system in which we store learned, disentangled representations of the environment. Low-dimensional spatial updates are computed using a state-space model that makes use of knowledge on the prior dynamics of the moving agent, and high-dimensional visual observations are modelled with a variational auto-encoder. The result is a scalable architecture capable of performing coherent predictions over hundreds of time steps across a range of partially observed 2D and 3D environments.
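The variational auto-encoder mentioned in the abstract can be illustrated with a minimal NumPy sketch of its core mechanics: an encoder that outputs the parameters of a diagonal Gaussian over the latent code, the reparameterization trick, and the KL-divergence term of the training objective. The layer shapes and function names here are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W_mu, W_logvar):
    # Toy linear "encoder": maps an observation batch to the mean and
    # log-variance of a diagonal Gaussian over the latent code z.
    return x @ W_mu, x @ W_logvar

def reparameterize(mu, logvar, rng):
    # Reparameterization trick: z = mu + sigma * eps keeps sampling
    # differentiable with respect to mu and logvar.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def kl_divergence(mu, logvar):
    # KL(q(z|x) || N(0, I)) for a diagonal Gaussian, summed over latent
    # dimensions and averaged over the batch; always non-negative.
    return np.mean(-0.5 * np.sum(1 + logvar - mu**2 - np.exp(logvar), axis=1))

# Toy dimensions: batch of 4 observations, 8 features, 3 latent dims.
x = rng.standard_normal((4, 8))
W_mu = rng.standard_normal((8, 3)) * 0.1
W_logvar = rng.standard_normal((8, 3)) * 0.1

mu, logvar = encode(x, W_mu, W_logvar)
z = reparameterize(mu, logvar, rng)
print(z.shape)  # (4, 3)
```

In a full VAE this KL term is added to a reconstruction loss from the decoder; the paper combines such a VAE with a state-space model for the low-dimensional spatial updates.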
Throughout my PhD on Deep Learning-based robotics, I have read a lot of papers on Machine Learning, Reinforcement Learning and AI in general. But papers can be a bit...
Introduction to Reinforcement Learning, including a definition, analysis of the motivations and limitations of AI, and an overview of the technology along with its applications.
Asynchronous methods for deep reinforcement learning, Mnih et al., ICML 2016. You know something interesting is going on when you see a scalability plot that looks like this: that’s a superlinear speedup as we increase the number of threads, giving a 24x performance improvement with 16 threads as compared to a single thread. The result…
The codebase contains a replica of the AlphaZero methodology, built in Python and Keras. Gain a deeper understanding of how AlphaZero works and adapt the code to plug in new games.
A. Zeng, S. Song, S. Welker, J. Lee, A. Rodriguez, and T. Funkhouser. (2018). arXiv:1803.09956. Comment: Under review at the International Conference on Intelligent Robots and Systems (IROS) 2018. Project webpage: http://vpg.cs.princeton.edu.
S. Albrecht and P. Stone. (2017). arXiv:1709.08071. Comment: 42 pages, submitted for review to Artificial Intelligence Journal. Keywords: multiagent systems, agent modelling, opponent modelling, survey, open problems.