Author of the publication

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to the name to display some publications already assigned to that person.

 

Other publications of authors with the same name

X-Armed Bandits. Journal of Machine Learning Research, (June 2011). Submitted on 21/1/2010.

Tuning Bandit Algorithms in Stochastic Environments. ALT, page 150--165. Springer, (2007). See audibert2009 for a longer, updated version.

Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Machine Learning, 71 (1): 89--129 (April 2008). Published Online First: 14 Nov, 2007.

Online Optimization in X-armed Bandits. NIPS, page 201--208. MIT Press, (2008).

Finite Time Bounds for Fitted Value Iteration. JMLR, (2008).

Reinforcement Learning for Continuous Stochastic Control Problems. Advances in Neural Information Processing Systems - 10, page 1029--1035. MIT Press, (1998).

Fitted Q-iteration in Continuous Action-space MDPs. NIPS, page 9--16. (2007).

Value-iteration Based Fitted Policy Iteration: Learning with a Single Trajectory. 2007 IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning (ADPRL 2007), page 330--337. IEEE, (2007). Honolulu, Hawaii, Apr 1--5, 2007.

Finite Time Bounds for Sampling Based Fitted Value Iteration. ICML, page 881--886. (2005).

Error Propagation for Approximate Policy and Value Iteration (extended version). NIPS, (December 2010).