Author of the publication


Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning.

R. Fruit, M. Pirotta, A. Lazaric, and R. Ortner. ICML, volume 80 of Proceedings of Machine Learning Research, page 1573-1581. PMLR, (2018)

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to a name to display some publications already assigned to that person.

Gerhard Ortner

Erich Ortner

Birgit Ortner

Lorelies Ortner

August Ortner

Other publications of authors with the same name

Regret bounds for restless Markov bandits. R. Ortner, D. Ryabko, P. Auer, and R. Munos. Theor. Comput. Sci., (2014)

Online Regret Bounds for Markov Decision Processes with Deterministic Transitions. R. Ortner. ALT, volume 5254 of Lecture Notes in Computer Science, page 123-137. Springer, (2008)

Exploiting Similarity Information in Reinforcement Learning - Similarity Models for Multi-Armed Bandits and MDPs. R. Ortner. ICAART (1), page 203-210. INSTICC Press, (2010)

Variational Regret Bounds for Reinforcement Learning. R. Ortner, P. Gajane, and P. Auer. UAI, volume 115 of Proceedings of Machine Learning Research, page 81-90. AUAI Press, (2019)

Regret Bounds for Learning State Representations in Reinforcement Learning. R. Ortner, M. Pirotta, A. Lazaric, R. Fruit, and O. Maillard. NeurIPS, page 12717-12727. (2019)

Improved Rates for the Stochastic Continuum-Armed Bandit Problem. P. Auer, R. Ortner, and C. Szepesvári. COLT, volume 4539 of Lecture Notes in Computer Science, page 454-468. Springer, (2007)

Pseudometrics for State Aggregation in Average Reward Markov Decision Processes. R. Ortner. ALT, volume 4754 of Lecture Notes in Computer Science, page 373-387. Springer, (2007)

Online Regret Bounds for Undiscounted Continuous Reinforcement Learning. R. Ortner, and D. Ryabko. CoRR, (2013)

Pareto Front Identification from Stochastic Bandit Feedback. P. Auer, C. Chiang, R. Ortner, and M. Drugan. AISTATS, volume 51 of JMLR Workshop and Conference Proceedings, page 939-947. JMLR.org, (2016)

Achieving Optimal Dynamic Regret for Non-stationary Bandits without Prior Information. P. Auer, Y. Chen, P. Gajane, C. Lee, H. Luo, R. Ortner, and C. Wei. COLT, volume 99 of Proceedings of Machine Learning Research, page 159-163. PMLR, (2019)

BibSonomy

Disambiguation of "Ortner, Ronald"
