
Gradient based algorithms with loss functions and kernels for improved on-policy control

Robards, Matthew; Sunehag, Peter

Description

We introduce and empirically evaluate two novel online gradient-based reinforcement learning algorithms with function approximation: one model-based and the other model-free. These algorithms allow non-squared loss functions, which is novel in reinforcement learning and appears to bring empirical advantages. We further extend a previous gradient-based algorithm to the case of full control by using generalized policy iteration. Theoretical properties of these...
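As a rough illustration of what a non-squared loss can look like in an online gradient-based update, the sketch below shows one semi-gradient SARSA step with linear function approximation, where the temporal-difference error is passed through the derivative of a Huber loss. This is not the method from the paper, whose details are not given in this record. The Huber loss, the function names (huber_grad, q_value, sarsa_step) and the parameters (alpha, gamma, kappa) are all assumptions chosen for the example.

```python
# Illustrative sketch only: NOT the algorithms from the paper.
# Semi-gradient SARSA with linear features, where the TD error is passed
# through the derivative of a Huber loss instead of a squared loss.
# All names and default parameter values here are hypothetical.


def huber_grad(delta, kappa=1.0):
    """Derivative of the Huber loss with respect to the TD error delta."""
    if abs(delta) <= kappa:
        return delta  # quadratic region: identical to the squared-loss case
    return kappa if delta > 0 else -kappa  # linear region: magnitude capped at kappa


def q_value(weights, features):
    """Linear action-value estimate: dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, features))


def sarsa_step(weights, phi, reward, phi_next, alpha=0.1, gamma=0.9, kappa=1.0):
    """One on-policy update.

    phi and phi_next are the feature vectors of the current and the next
    state-action pair, both chosen by the policy being followed.
    """
    delta = reward + gamma * q_value(weights, phi_next) - q_value(weights, phi)
    step = alpha * huber_grad(delta, kappa)
    # The bootstrapped target is treated as a constant (semi-gradient),
    # so only the features of the current pair appear in the update.
    return [w + step * f for w, f in zip(weights, phi)]
```

With a squared loss the update would scale linearly with the TD error; with the Huber derivative, a single large error moves the weights by at most alpha * kappa per unit of feature value, which is one plausible reason a non-squared loss might behave differently in practice.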

Collections: ANU Research Publications
Date published: 2012
Type: Journal article
URI: http://hdl.handle.net/1885/68876
Source: Lecture Notes in Computer Science (LNCS)
DOI: 10.1007/978-3-642-29946-9_7

Download

File: 01_Robards_Gradient_based_algorithms_with_2012.pdf
Size: 872.92 kB
Format: Adobe PDF


Items in Open Research are protected by copyright, with all rights reserved, unless otherwise indicated.
