Author of the publication

Off-Policy Policy Gradient with Stationary Distribution Correction.

UAI, volume 115 of Proceedings of Machine Learning Research, pages 1180-1190. AUAI Press, (2019)


Other publications of authors with the same name

Model-based RL in Contextual Decision Processes: PAC bounds and Exponential Improvements over Model-free Approaches. (2018). arXiv:1811.08540. Comment: COLT 2019.

Off-Policy Policy Gradient with State Distribution Correction. CoRR, (2019)

Stochastic optimization and sparse statistical recovery: An optimal algorithm for high dimensions. CISS, pages 1-2. IEEE, (2014)

Fast global convergence of gradient methods for high-dimensional statistical recovery. CoRR, (2011)

On the Optimality of Sparse Model-Based Planning for Markov Decision Processes. CoRR, (2019)

Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback. ICML, volume 97 of Proceedings of Machine Learning Research, pages 7335-7344. PMLR, (2019)

Optimizing Interactive Systems via Data-Driven Objectives. CoRR, (2020)

Noisy matrix decomposition via convex relaxation: Optimal rates in high dimensions. ICML, pages 1129-1136. Omnipress, (2011)

Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes. COLT, volume 125 of Proceedings of Machine Learning Research, pages 64-66. PMLR, (2020)

Message-passing for Graph-structured Linear Programs: Proximal Methods and Rounding Schemes. J. Mach. Learn. Res., (2010)