Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods

G. Neu and Cs. Szepesvári. UAI, pp. 295--302. (2007)

Abstract

In this paper we propose a novel gradient algorithm to learn a policy from an expert's observed behavior assuming that the expert behaves optimally with respect to some unknown reward function of a Markovian Decision Problem. The algorithm's aim is to find a reward function such that the resulting optimal policy matches well the expert's observed behavior. The main difficulty is that the mapping from the parameters to policies is both nonsmooth and highly redundant. Resorting to subdifferentials solves the first difficulty, while the second one is overcome by computing natural gradients. We tested the proposed method in two artificial domains and found it to be more reliable and efficient than some previous methods.
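
The two difficulties named in the abstract can be seen on a toy problem. The sketch below is not the paper's algorithm; it only illustrates, on a hypothetical two-state MDP (the arrays `P`, `FEATURES` and the helper `greedy_policy` are assumptions made for this example), that the map from a reward parameter to the optimal policy is redundant (positive rescaling or a constant shift of the reward leaves the policy unchanged) and nonsmooth (the policy jumps when the parameter changes sign).

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP (not taken from the paper).
# Action 0 stays in the current state, action 1 switches to the other state.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
])
FEATURES = np.array([0.0, 1.0])  # reward is theta * FEATURES (one parameter)


def greedy_policy(P, r, gamma=0.9, iters=500):
    """Run value iteration and return the greedy action for each state.

    P has shape (actions, states, states); r has shape (states,).
    """
    V = np.zeros(r.shape[0])
    for _ in range(iters):
        Q = r[None, :] + gamma * (P @ V)  # Q[a, s]
        V = Q.max(axis=0)
    return Q.argmax(axis=0)


base = greedy_policy(P, 1.0 * FEATURES)
scaled = greedy_policy(P, 2.0 * FEATURES)         # positive rescaling
shifted = greedy_policy(P, 1.0 * FEATURES + 5.0)  # constant shift
flipped = greedy_policy(P, -1.0 * FEATURES)       # sign change of theta

print("theta = 1      :", base.tolist())
print("theta = 2      :", scaled.tolist())
print("theta = 1, +5  :", shifted.tolist())
print("theta = -1     :", flipped.tolist())
```

Many different reward parameters yield the same policy, so a plain gradient in parameter space can move without changing behavior at all, and the policy changes only in jumps, which is the setting in which the abstract turns to natural gradients and subdifferentials.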

Links and resources

BibTeX key: neu2007
entry type: inproceedings
booktitle: UAI
year: 2007
pages: 295--302
pdf: papers/uai2007-irl.pdf
date-modified: 2010-11-25 00:54:55 -0700
date-added: 2010-08-28 17:38:14 -0600
