We address online combinatorial optimization when the player has a prior over
the adversary's sequence of losses. In this framework, Russo and Van Roy
proposed an information-theoretic analysis of Thompson Sampling based on the
information ratio, resulting in optimal worst-case regret bounds. In this
paper we introduce three novel ideas to this line of work. First we propose a
new quantity, the scale-sensitive information ratio, which allows us to obtain
more refined first-order regret bounds (i.e., bounds of the form $\sqrt{L^*}$
where $L^*$ is the loss of the best combinatorial action). Second we replace
the entropy over combinatorial actions by a coordinate entropy, which allows us
to obtain the first optimal worst-case bound for Thompson Sampling in the
combinatorial setting. Finally, we introduce a novel link between Bayesian
agents and frequentist confidence intervals. Combining these ideas we show that
the classical multi-armed bandit first-order regret bound $\tilde{O}(\sqrt{d
L^*})$ still holds true in the more challenging and more general semi-bandit
scenario. This latter result improves the previous state of the art bound
$\tilde{O}(\sqrt{(d+m^3)L^*})$ by Lykouris, Sridharan and Tardos.
Description
[1902.00681] First-Order Regret Analysis of Thompson Sampling
%0 Conference Paper
%1 bubeck2019firstorder
%A Bubeck, Sébastien
%A Sellke, Mark
%D 2019
%K bayesian bounds combinatorics online-learning optimization readings sampling
%T First-Order Regret Analysis of Thompson Sampling
%U http://arxiv.org/abs/1902.00681
%X We address online combinatorial optimization when the player has a prior over
the adversary's sequence of losses. In this framework, Russo and Van Roy
proposed an information-theoretic analysis of Thompson Sampling based on the
information ratio, resulting in optimal worst-case regret bounds. In this
paper we introduce three novel ideas to this line of work. First we propose a
new quantity, the scale-sensitive information ratio, which allows us to obtain
more refined first-order regret bounds (i.e., bounds of the form $\sqrt{L^*}$
where $L^*$ is the loss of the best combinatorial action). Second we replace
the entropy over combinatorial actions by a coordinate entropy, which allows us
to obtain the first optimal worst-case bound for Thompson Sampling in the
combinatorial setting. Finally, we introduce a novel link between Bayesian
agents and frequentist confidence intervals. Combining these ideas we show that
the classical multi-armed bandit first-order regret bound $\tilde{O}(\sqrt{d
L^*})$ still holds true in the more challenging and more general semi-bandit
scenario. This latter result improves the previous state of the art bound
$\tilde{O}(\sqrt{(d+m^3)L^*})$ by Lykouris, Sridharan and Tardos.
NOTE(review): was @inproceedings with no booktitle (a required field); this is an
arXiv preprint, so @misc with eprint/archiveprefix/primaryclass is the correct
type. Garbled note field ("cite arxiv:...Comment: 27 pages") repaired; "Thompson"
braced in the title so sentence-casing styles keep its capital; the accented
author name uses a brace-group special character for classic-BibTeX sorting.
BibSonomy bookkeeping fields (biburl, interhash, intrahash, added-at, timestamp)
are preserved; unknown field names are ignored by BibTeX and are harmless.
@misc{bubeck2019firstorder,
  abstract      = {We address online combinatorial optimization when the player has a prior over
the adversary's sequence of losses. In this framework, Russo and Van Roy
proposed an information-theoretic analysis of Thompson Sampling based on the
{\em information ratio}, resulting in optimal worst-case regret bounds. In this
paper we introduce three novel ideas to this line of work. First we propose a
new quantity, the scale-sensitive information ratio, which allows us to obtain
more refined first-order regret bounds (i.e., bounds of the form $\sqrt{L^*}$
where $L^*$ is the loss of the best combinatorial action). Second we replace
the entropy over combinatorial actions by a coordinate entropy, which allows us
to obtain the first optimal worst-case bound for Thompson Sampling in the
combinatorial setting. Finally, we introduce a novel link between Bayesian
agents and frequentist confidence intervals. Combining these ideas we show that
the classical multi-armed bandit first-order regret bound $\tilde{O}(\sqrt{d
L^*})$ still holds true in the more challenging and more general semi-bandit
scenario. This latter result improves the previous state of the art bound
$\tilde{O}(\sqrt{(d+m^3)L^*})$ by Lykouris, Sridharan and Tardos.},
  added-at      = {2019-11-27T15:06:11.000+0100},
  archiveprefix = {arXiv},
  author        = {Bubeck, S{\'e}bastien and Sellke, Mark},
  biburl        = {https://www.bibsonomy.org/bibtex/24634b2454b95b2991b462920dea4dfe1/kirk86},
  description   = {[1902.00681] First-Order Regret Analysis of Thompson Sampling},
  eprint        = {1902.00681},
  interhash     = {602470d8edcbeb2ab70253bb9446f6eb},
  intrahash     = {4634b2454b95b2991b462920dea4dfe1},
  keywords      = {bayesian bounds combinatorics online-learning optimization readings sampling},
  note          = {27 pages},
  primaryclass  = {cs.LG},
  timestamp     = {2019-11-27T15:07:51.000+0100},
  title         = {First-Order Regret Analysis of {Thompson} Sampling},
  url           = {http://arxiv.org/abs/1902.00681},
  year          = {2019},
}