Author of the publication

Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes.

ICML, volume 119 of Proceedings of Machine Learning Research, pages 10170-10180. PMLR, (2020)

Other publications of authors with the same name

Tractable Local Equilibria in Non-Concave Games. CoRR, (2024)

Improved High-Probability Regret for Adversarial Bandits with Time-Varying Feedback Graphs. ALT, volume 201 of Proceedings of Machine Learning Research, pages 1074-1100. PMLR, (2023)

Posterior sampling-based online learning for the stochastic shortest path model. UAI, volume 216 of Proceedings of Machine Learning Research, pages 922-931. PMLR, (2023)

Optimal and Adaptive Algorithms for Online Boosting. ICML, volume 37 of JMLR Workshop and Conference Proceedings, pages 2323-2331. JMLR.org, (2015)

A New Algorithm for Non-stationary Contextual Bandits: Efficient, Optimal and Parameter-free. COLT, volume 99 of Proceedings of Machine Learning Research, pages 696-726. PMLR, (2019)

Fair Contextual Multi-Armed Bandits: Theory and Experiments. CoRR, (2019)

Finding the Stochastic Shortest Path with Low Regret: the Adversarial Cost and Unknown Transition Case. ICML, volume 139 of Proceedings of Machine Learning Research, pages 1651-1660. PMLR, (2021)

Improved No-Regret Algorithms for Stochastic Shortest Path with Linear MDP. ICML, volume 162 of Proceedings of Machine Learning Research, pages 3204-3245. PMLR, (2022)

Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition. ICML, volume 119 of Proceedings of Machine Learning Research, pages 4860-4869. PMLR, (2020)

No-Regret Learning in Two-Echelon Supply Chain with Unknown Demand Distribution. AISTATS, volume 206 of Proceedings of Machine Learning Research, pages 3270-3298. PMLR, (2023)