Abstract
To achieve general intelligence, agents must learn how to interact with
others in a shared environment: this is the challenge of multiagent
reinforcement learning (MARL). The simplest form is independent reinforcement
learning (InRL), where each agent treats its experience as part of its
(non-stationary) environment. In this paper, we first observe that policies
learned using InRL can overfit to the other agents' policies during training,
failing to sufficiently generalize during execution. We introduce a new metric,
joint-policy correlation, to quantify this effect. We describe an algorithm for
general MARL, based on approximate best responses to mixtures of policies
generated using deep reinforcement learning, and empirical game-theoretic
analysis to compute meta-strategies for policy selection. The algorithm
generalizes previous ones such as InRL, iterated best response, double oracle,
and fictitious play. Then, we present a scalable implementation that reduces
the memory requirement using decoupled meta-solvers. Finally, we demonstrate
the generality of the resulting policies in two partially observable settings:
gridworld coordination games and poker.
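As an illustration of the meta-algorithm described above, the following is a minimal sketch in which exact best responses and a uniform meta-strategy over each player's growing policy population reduce the loop to fictitious play, one of the special cases the abstract mentions. The rock-paper-scissors matrix-game setting and all function names here are illustrative assumptions, not the paper's API; the paper itself computes approximate best responses with deep reinforcement learning and meta-strategies via empirical game-theoretic analysis.

```python
import numpy as np

# Illustrative sketch (assumed names, not the paper's API): grow a policy
# population per player by best-responding to a mixture over the opponent's
# population. With exact best responses and a uniform meta-solver, this is
# fictitious play on rock-paper-scissors.

# Row player's payoffs for rock/paper/scissors; the game is symmetric
# zero-sum, so both players can best-respond using the same matrix.
PAYOFF = np.array([[ 0, -1,  1],
                   [ 1,  0, -1],
                   [-1,  1,  0]])

def meta_strategy(population):
    """Decoupled, uniform meta-solver: equal weight on each stored policy."""
    return np.full(len(population), 1.0 / len(population))

def best_response(opponent_population):
    """Exact best response to the opponent's meta-strategy mixture."""
    sigma = meta_strategy(opponent_population)
    mixture = np.zeros(3)
    for action, weight in zip(opponent_population, sigma):
        mixture[action] += weight            # pure policies = single actions
    return int(np.argmax(PAYOFF @ mixture))  # expected payoff per action

# Each player's population starts with one arbitrary policy (always rock).
populations = [[0], [0]]
for _ in range(300):
    for p in (0, 1):
        populations[p].append(best_response(populations[1 - p]))

# Empirical action frequencies approach the uniform (1/3, 1/3, 1/3) equilibrium.
print(np.bincount(populations[0], minlength=3) / len(populations[0]))
```

Swapping the exact best response for a deep-RL trainer and the uniform meta-solver for one computed from an empirically estimated payoff table yields the more general scheme the abstract describes.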