Nonparametric General Reinforcement Learning

Leike, Jan


Reinforcement learning problems are often phrased in terms of Markov decision processes (MDPs). In this thesis we go beyond MDPs and consider reinforcement learning in environments that are non-Markovian, non-ergodic and only partially observable. Our focus is not on practical algorithms, but rather on the fundamental underlying problems: How do we balance exploration and exploitation? How do we explore optimally? When is an agent optimal? We follow the...

Collections: Open Access Theses
Date published: 2016
Type: Thesis (PhD)
DOI: 10.25911/5d76346c2e2be


File: Leike Thesis 2016.pdf
Size: 1.42 MB
Format: Adobe PDF

Items in Open Research are protected by copyright, with all rights reserved, unless otherwise indicated.

Updated: 19 May 2020 / Responsible Officer: University Librarian / Page Contact: Library Systems & Web Coordinator