Feature Reinforcement Learning: Part I. Unstructured MDPs
General-purpose, intelligent, learning agents cycle through sequences of observations, actions, and rewards that are complex, uncertain, unknown, and non-Markovian. On the other hand, reinforcement learning is well developed for small finite-state Markov decision processes (MDPs). Up to now, extracting the right state representations out of bare observations, that is, reducing the general agent setup to the MDP framework, has been an art that involves significant effort by designers. The...
| Collections | ANU Research Publications |
| Source | Journal of Artificial General Intelligence |
| File | Hutter Feature Reinforcement Learning 2009.pdf (404.5 kB, Adobe PDF) |