Author of the publication


Boosted Bellman Residual Minimization Handling Expert Demonstrations.

B. Piot, M. Geist, and O. Pietquin. ECML/PKDD (2), volume 8725 of Lecture Notes in Computer Science, page 549-564. Springer, (2014)

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to the name to display some publications already assigned to the person.

Matthias Piot

Salma Bilal

Erol Bilali

Suphi Bilâl

Rabah Bilal

Other publications of authors with the same name

Boosted and reward-regularized classification for apprenticeship learning. B. Piot, M. Geist, and O. Pietquin. AAMAS, page 1249-1256. IFAAMAS/ACM, (2014)

The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning. A. Gruslys, W. Dabney, M. Azar, B. Piot, M. Bellemare, and R. Munos. ICLR, (2017) cite arxiv:1704.04651.

Rainbow: Combining Improvements in Deep Reinforcement Learning. M. Hessel, J. Modayil, H. van Hasselt, T. Schaul, G. Ostrovski, W. Dabney, D. Horgan, B. Piot, M. Azar, and D. Silver. (2017) cite arxiv:1710.02298. Comment: Under review as a conference paper at AAAI 2018.

Rainbow: Combining Improvements in Deep Reinforcement Learning. M. Hessel, J. Modayil, H. van Hasselt, T. Schaul, G. Ostrovski, W. Dabney, D. Horgan, B. Piot, M. Azar, and D. Silver. AAAI, page 3215-3222. AAAI Press, (2018)

End-to-end optimization of goal-driven and visually grounded dialogue systems. F. Strub, H. de Vries, J. Mary, B. Piot, A. Courville, and O. Pietquin. IJCAI, page 2765-2771. ijcai.org, (2017)

Understanding Self-Predictive Learning for Reinforcement Learning. Y. Tang, Z. Guo, P. Richemond, B. Pires, Y. Chandak, R. Munos, M. Rowland, M. Azar, C. Lan, C. Lyle and 6 other author(s). ICML, volume 202 of Proceedings of Machine Learning Research, page 33632-33656. PMLR, (2023)

Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning. J. Grill, F. Strub, F. Altché, C. Tallec, P. Richemond, E. Buchatskaya, C. Doersch, B. Pires, Z. Guo, M. Azar and 4 other author(s). NeurIPS, (2020)

Difference of Convex Functions Programming for Reinforcement Learning. B. Piot, M. Geist, and O. Pietquin. NIPS, page 2519-2527. (2014)

Learning Nash Equilibrium for General-Sum Markov Games from Batch Data. J. Pérolat, F. Strub, B. Piot, and O. Pietquin. AISTATS, volume 54 of Proceedings of Machine Learning Research, page 232-241. PMLR, (2017)

Neural Predictive Belief Representations. Z. Guo, M. Azar, B. Piot, B. Pires, T. Pohlen, and R. Munos. CoRR, (2018)

BibSonomy

Disambiguation of "Piot, Bilal"
