
Nash Learning from Human Feedback.

R. Munos, M. Valko, D. Calandriello, M. Azar, M. Rowland, Z. Guo, Y. Tang, M. Geist, T. Mesnard, A. Michi, M. Selvi, S. Girgin, N. Momchev, O. Bachem, D. Mankowitz, D. Precup, and B. Piot. CoRR, (2023)


Other publications by persons with the same name

Distributional Reinforcement Learning with Quantile Regression. W. Dabney, M. Rowland, M. Bellemare, and R. Munos. CoRR, (2017)

Adaptive Trade-Offs in Off-Policy Learning. M. Rowland, W. Dabney, and R. Munos. AISTATS, volume 108 of Proceedings of Machine Learning Research, pp. 34-44. PMLR, (2020)

Conditional Importance Sampling for Off-Policy Learning. M. Rowland, A. Harutyunyan, H. van Hasselt, D. Borsa, T. Schaul, R. Munos, and W. Dabney. AISTATS, volume 108 of Proceedings of Machine Learning Research, pp. 45-55. PMLR, (2020)

Human Alignment of Large Language Models through Online Preference Optimisation. D. Calandriello, D. Guo, R. Munos, M. Rowland, Y. Tang, B. Pires, P. Richemond, C. Lan, M. Valko, T. Liu, and 3 other authors. CoRR, (2024)

Nash Learning from Human Feedback. R. Munos, M. Valko, D. Calandriello, M. Azar, M. Rowland, Z. Guo, Y. Tang, M. Geist, T. Mesnard, A. Michi, and 7 other authors. CoRR, (2023)

Meta-learning of Sequential Strategies. P. Ortega, J. Wang, M. Rowland, T. Genewein, Z. Kurth-Nelson, R. Pascanu, N. Heess, J. Veness, A. Pritzel, P. Sprechmann, and 14 other authors. CoRR, (2019)

MICo: Learning improved representations via sampling-based state similarity for Markov decision processes. P. Castro, T. Kastner, P. Panangaden, and M. Rowland. CoRR, (2021)

Quantile Credit Assignment. T. Mesnard, W. Chen, A. Saade, Y. Tang, M. Rowland, T. Weber, C. Lyle, A. Gruslys, M. Valko, W. Dabney, and 3 other authors. ICML, volume 202 of Proceedings of Machine Learning Research, pp. 24517-24531. PMLR, (2023)

Learning Dynamics and Generalization in Deep Reinforcement Learning. C. Lyle, M. Rowland, W. Dabney, M. Kwiatkowska, and Y. Gal. ICML, volume 162 of Proceedings of Machine Learning Research, pp. 14560-14581. PMLR, (2022)

Taylor Expansion of Discount Factors. Y. Tang, M. Rowland, R. Munos, and M. Valko. ICML, volume 139 of Proceedings of Machine Learning Research, pp. 10130-10140. PMLR, (2021)
