Please choose a person to relate this publication to

To help distinguish between persons with the same name, each person's academic degree and the title of an important publication are displayed.


Other publications of persons with the same name

The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning, , , , , and . ICLR, (2017). arXiv:1704.04651.

Convex Relaxation Regression: Black-Box Optimization of Smooth Functions by Learning Their Convex Envelopes., , and . UAI, AUAI Press, (2016)

Rainbow: Combining Improvements in Deep Reinforcement Learning., , , , , , , , , and . AAAI, pp. 3215-3222. AAAI Press, (2018)

Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice., , , , , , , , , and 5 other author(s). ICML, volume 202 of Proceedings of Machine Learning Research, pp. 17135-17175. PMLR, (2023)

Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion., , , , , , , , , and . CoRR, (2024)

Averaging log-likelihoods in direct alignment., , , , , , , , , and 1 other author(s). CoRR, (2024)

Neural Predictive Belief Representations., , , , , and . CoRR, (2018)

Meta-learning of Sequential Strategies., , , , , , , , , and 14 other author(s). CoRR, (2019)

Fast computation of Nash Equilibria in Imperfect Information Games., , , , , , , , , and 3 other author(s). ICML, volume 119 of Proceedings of Machine Learning Research, pp. 7119-7129. PMLR, (2020)

Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning., , , , , , and . ICML, volume 119 of Proceedings of Machine Learning Research, pp. 3875-3886. PMLR, (2020)