Gradient based algorithms with loss functions and kernels for improved on-policy control
Robards, Matthew; Sunehag, Peter
Description
We introduce and empirically evaluate two novel online gradient-based reinforcement learning algorithms with function approximation - one model based, and the other model free. These algorithms come with the possibility of having non-squared loss functions which is novel in reinforcement learning, and seems to come with empirical advantages. We further extend a previous gradient based algorithm to the case of full control, by using generalized policy iteration. Theoretical properties of these algorithms are studied in a companion paper.
Field | Value | Language |
---|---|---|
dc.contributor.author | Robards, Matthew | |
dc.contributor.author | Sunehag, Peter | |
dc.date.accessioned | 2015-12-10T23:32:32Z | |
dc.identifier.issn | 0302-9743 | |
dc.identifier.uri | http://hdl.handle.net/1885/68876 | |
dc.description.abstract | We introduce and empirically evaluate two novel online gradient-based reinforcement learning algorithms with function approximation - one model based, and the other model free. These algorithms come with the possibility of having non-squared loss functions which is novel in reinforcement learning, and seems to come with empirical advantages. We further extend a previous gradient based algorithm to the case of full control, by using generalized policy iteration. Theoretical properties of these algorithms are studied in a companion paper. | |
dc.publisher | Springer | |
dc.source | Lecture Notes in Computer Science (LNCS) | |
dc.subject | Keywords: Full control; Function approximation; Gradient based; Gradient based algorithm; Loss functions; Model free; Model-based OPC; Policy iteration; Learning algorithms; Reinforcement learning | |
dc.title | Gradient based algorithms with loss functions and kernels for improved on-policy control | |
dc.type | Journal article | |
local.description.notes | Imported from ARIES | |
local.description.refereed | Yes | |
local.identifier.citationvolume | 7188 | |
dc.date.issued | 2012 | |
local.identifier.absfor | 080101 - Adaptive Agents and Intelligent Robotics | |
local.identifier.ariespublication | f5625xPUB1854 | |
local.type.status | Published Version | |
local.contributor.affiliation | Robards, Matthew, College of Engineering and Computer Science, ANU | |
local.contributor.affiliation | Sunehag, Peter, College of Engineering and Computer Science, ANU | |
local.description.embargo | 2037-12-31 | |
local.bibliographicCitation.startpage | 30 | |
local.bibliographicCitation.lastpage | 41 | |
local.identifier.doi | 10.1007/978-3-642-29946-9_7 | |
local.identifier.absseo | 970108 - Expanding Knowledge in the Information and Computing Sciences | |
dc.date.updated | 2016-02-24T08:51:23Z | |
local.identifier.scopusID | 2-s2.0-84861701646 | |
Collections | ANU Research Publications |
Download
File | Description | Size | Format | Image |
---|---|---|---|---|
01_Robards_Gradient_based_algorithms_with_2012.pdf | | 872.92 kB | Adobe PDF | Request a copy |