Gradient based algorithms with loss functions and kernels for improved on-policy control
Robards, Matthew; Sunehag, Peter
We introduce and empirically evaluate two novel online gradient-based reinforcement learning algorithms with function approximation: one model-based and the other model-free. These algorithms allow non-squared loss functions, which is novel in reinforcement learning and appears to bring empirical advantages. We further extend a previous gradient-based algorithm to the case of full control by using generalized policy iteration. Theoretical properties of these...
Collections: ANU Research Publications
Source: Lecture Notes in Computer Science (LNCS)
File: 01_Robards_Gradient_based_algorithms_with_2012.pdf (872.92 kB, Adobe PDF)