Author of the publication

Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions

NIPS, pages 2508--2516 (December 2013)


Other publications of authors with the same name

Mixing Time Estimation in Reversible Markov Chains from a Single Sample Path. Annals of Applied Probability, 29 (4): 2439--2480 (July 2019)
Deterministic Independent Component Analysis. ICML, pages 2521--2530 (2015)
Partial monitoring -- classification, regret bounds, and algorithms. Mathematics of Operations Research (2014)
Policy Error Bounds for Model-Based Reinforcement Learning with Factored Linear Models. COLT, pages 121--151 (2016)
Error Propagation for Approximate Policy and Value Iteration (extended version). NIPS (December 2010)
Cleaning up the neighborhood: A full classification for adversarial partial monitoring. ALT (February 2019)
Toward Minimax Off-policy Value Estimation. AISTATS, pages 608--616 (2015)
An Information-Theoretic Approach to Minimax Regret in Partial Monitoring. COLT (April 2019)
Multi-view Matrix Factorization for Linear Dynamical System Estimation. NIPS, pages 7092--7101 (2017)
Sequential Learning for Multi-channel Wireless Network Monitoring with Channel Switching Costs. IEEE Transactions on Signal Processing (September 2014)