Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed.

Other publications by persons with the same name

Boosted and reward-regularized classification for apprenticeship learning. AAMAS, pp. 1249-1256. IFAAMAS/ACM, (2014)

The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning. ICLR, (2017). arXiv:1704.04651.

Rainbow: Combining Improvements in Deep Reinforcement Learning. (2017). arXiv:1710.02298. Comment: Under review as a conference paper at AAAI 2018.

Rainbow: Combining Improvements in Deep Reinforcement Learning. AAAI, pp. 3215-3222. AAAI Press, (2018)

Building Math Agents with Multi-Turn Iterative Preference Learning. CoRR, (2024)

Nash Learning from Human Feedback. ICML, OpenReview.net, (2024)

Learning Nash Equilibrium for General-Sum Markov Games from Batch Data. AISTATS, vol. 54 of Proceedings of Machine Learning Research, pp. 232-241. PMLR, (2017)

End-to-end optimization of goal-driven and visually grounded dialogue systems. IJCAI, pp. 2765-2771. ijcai.org, (2017)

Difference of Convex Functions Programming for Reinforcement Learning. NIPS, pp. 2519-2527. (2014)

Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning. NeurIPS, (2020)