Non-trivial two-armed partial-monitoring games are bandits

Abstract

We consider online learning in partial-monitoring games against an oblivious adversary. We show that when the learner has exactly two actions and the game is non-trivial, the game is reducible to a bandit-like game, and thus its minimax regret is Θ(√T).

BibTeX key: AnBaSze11
Entry type: article
Year: 2011
Journal: CoRR
Volume: abs/1108.4961
ee: http://arxiv.org/abs/1108.4961
date-added: 2012-06-03 14:04:57 -0600
pdf: papers/twoarmed.pdf
bibsource: DBLP, http://dblp.uni-trier.de
date-modified: 2012-06-06 21:29:55 -0600
