Gradient based algorithms with loss functions and kernels for improved on-policy control
Date
2012
Authors
Robards, Matthew
Sunehag, Peter
Publisher
Springer
Abstract
We introduce and empirically evaluate two novel online gradient-based reinforcement learning algorithms with function approximation: one model-based and the other model-free. These algorithms allow non-squared loss functions, which is novel in reinforcement learning and appears to bring empirical advantages. We further extend a previous gradient-based algorithm to the case of full control by using generalized policy iteration. Theoretical properties of these algorithms are studied in a companion paper.
Keywords
Full control; Function approximation; Gradient based; Gradient based algorithm; Loss functions; Model free; Model-based OPC; Policy iteration; Learning algorithms; Reinforcement learning
Source
Lecture Notes in Computer Science (LNCS)
Type
Journal article
Restricted until
2037-12-31