Gradient based algorithms with loss functions and kernels for improved on-policy control

Date

2012

Authors

Robards, Matthew
Sunehag, Peter

Publisher

Springer

Abstract

We introduce and empirically evaluate two novel online gradient-based reinforcement learning algorithms with function approximation: one model-based and the other model-free. These algorithms allow non-squared loss functions, which is novel in reinforcement learning and appears to bring empirical advantages. We further extend a previous gradient-based algorithm to the case of full control by using generalized policy iteration. Theoretical properties of these algorithms are studied in a companion paper.

Keywords

Full control; Function approximation; Gradient based; Gradient based algorithm; Loss functions; Model free; Model-based OPC; Policy iteration; Learning algorithms; Reinforcement learning

Source

Lecture Notes in Computer Science (LNCS)

Type

Journal article

Restricted until

2037-12-31