Gradient based algorithms with loss functions and kernels for improved on-policy control
Robards, Matthew; Sunehag, Peter
Description
We introduce and empirically evaluate two novel online gradient-based reinforcement learning algorithms with function approximation: one model-based and the other model-free. These algorithms admit non-squared loss functions, which is novel in reinforcement learning and appears to offer empirical advantages. We further extend a previous gradient-based algorithm to the case of full control by using generalized policy iteration. Theoretical properties of these...
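As a rough illustration of the general idea only (not the authors' algorithms, which this record does not spell out), the sketch below runs on-policy semi-gradient SARSA with linear function approximation, where the usual squared-error gradient is swapped for the gradient of a Huber loss. The chain environment, the one-hot feature map, and the names `huber_grad`, `features`, and `train` are all assumptions made for the example.

```python
import numpy as np

def huber_grad(delta, kappa=0.5):
    # Derivative of the Huber loss w.r.t. the TD error: equal to delta inside
    # [-kappa, kappa] and clipped to +/-kappa outside that interval.
    return float(np.clip(delta, -kappa, kappa))

def features(s, a, n_states, n_actions):
    # Hypothetical one-hot state-action features; any feature map could be used.
    phi = np.zeros(n_states * n_actions)
    phi[s * n_actions + a] = 1.0
    return phi

def train(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    # Assumed toy task: a chain of states; action 1 moves right, action 0 moves
    # left; reaching the last state gives reward 1 and ends the episode.
    rng = np.random.default_rng(seed)
    n_actions = 2
    goal = n_states - 1
    w = np.zeros(n_states * n_actions)

    def q(s, a):
        return float(w @ features(s, a, n_states, n_actions))

    def act(s):
        # Epsilon-greedy policy with random tie-breaking among greedy actions.
        if rng.random() < epsilon:
            return int(rng.integers(n_actions))
        values = np.array([q(s, b) for b in range(n_actions)])
        return int(rng.choice(np.flatnonzero(values == values.max())))

    for _ in range(episodes):
        s, a = 0, act(0)
        for _step in range(1000):  # safety cap on episode length
            s_next = min(s + 1, goal) if a == 1 else max(s - 1, 0)
            done = s_next == goal
            r = 1.0 if done else 0.0
            a_next = act(s_next)
            # On-policy (SARSA) target: bootstrap from the action actually chosen.
            target = r if done else r + gamma * q(s_next, a_next)
            delta = target - q(s, a)
            # Semi-gradient step on the Huber loss of the TD error.
            w += alpha * huber_grad(delta) * features(s, a, n_states, n_actions)
            if done:
                break
            s, a = s_next, a_next
    return w.reshape(n_states, n_actions)
```

Calling `train()` returns a table of learned action values with one row per state and one column per action. Because the policy being improved is also the one generating the data, this is a simple instance of generalized policy iteration; replacing `huber_grad` with the identity would recover the squared-loss update.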
Collections | ANU Research Publications |
---|---|
Date published: | 2012 |
Type: | Journal article |
URI: | http://hdl.handle.net/1885/68876 |
Source: | Lecture Notes in Computer Science (LNCS) |
DOI: | 10.1007/978-3-642-29946-9_7 |
Download
File | Description | Size | Format | Image |
---|---|---|---|---|
01_Robards_Gradient_based_algorithms_with_2012.pdf | | 872.92 kB | Adobe PDF | Request a copy |
Items in Open Research are protected by copyright, with all rights reserved, unless otherwise indicated.