Title:

Bootstrapped Gradient Temporal-Difference Learning

Document type:
Conference contribution
Type of conference contribution:
Text contribution / paper
Author(s):
Meyer, Dominik; Knopp, Martin; Shen, Hao
Abstract:
In this work we aim to provide an overview of gradient-based temporal-difference learning methods in reinforcement learning. We look at three different cost functions: the mean squared Bellman error, the mean squared projected Bellman error, and the norm of the expected update. Finally, we derive two new on-line gradient algorithms for TD learning that are based on the idea of bootstrapping.
Keywords:
Reinforcement Learning (RL); Stochastic Gradient Descent; Bootstrapping; Gradient Temporal-Difference (GTD)
Conference / book title:
DGRTage 2013
Year:
2013
Quarter:
4th quarter
Year / month:
2014-10
Month:
Oct
Pages:
2
Reviewed:
yes
Language:
en