Extreme State Aggregation beyond MDPs

Hutter, Marcus

Description

We consider a Reinforcement Learning setup without any (esp. MDP) assumptions on the environment. State aggregation and more generally feature reinforcement learning is concerned with mapping histories/raw-states to reduced/aggregated states. The idea behind both is that the resulting reduced process (approximately) forms a small stationary finite-state MDP, which can then be efficiently solved or learnt. We considerably generalize existing aggregation results by showing that even if the reduced process is not an MDP, the (q-)value functions and (optimal) policies of an associated MDP with same state-space size solve the original problem, as long as the solution can approximately be represented as a function of the reduced states. This implies an upper bound on the required state space size that holds uniformly for all RL problems. It may also explain why RL algorithms designed for MDPs sometimes perform well beyond MDPs.
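To make the aggregation idea concrete, the following is a minimal illustrative sketch, not the paper's construction: a feature map `phi` collapses histories to a small number of aggregated states, and the induced finite-state MDP is then solved by standard Q-value iteration. All names here (`phi`, `T`, `R`, the toy transition/reward numbers) are assumptions chosen for illustration.

```python
# Sketch of state aggregation: phi maps an arbitrary history to one of
# finitely many aggregated states; the induced small MDP is solved by
# Q-value iteration. The toy MDP below is hypothetical.

# Feature map: aggregate a history (sequence of observations) to its
# last observation, yielding an aggregated state in {0, 1}.
def phi(history):
    return history[-1]

S = [0, 1]        # aggregated states
A = [0, 1]        # actions
gamma = 0.9       # discount factor

# Induced (toy) MDP over aggregated states:
# T[s][a] = {next_state: probability}, R[s][a] = expected reward.
T = {0: {0: {0: 0.8, 1: 0.2}, 1: {0: 0.2, 1: 0.8}},
     1: {0: {0: 0.5, 1: 0.5}, 1: {0: 0.1, 1: 0.9}}}
R = {0: {0: 0.0, 1: 1.0}, 1: {0: 0.5, 1: 2.0}}

def q_iteration(n_iters=500):
    """Q-value iteration on the aggregated MDP."""
    Q = {s: {a: 0.0 for a in A} for s in S}
    for _ in range(n_iters):
        Q = {s: {a: R[s][a] + gamma * sum(p * max(Q[s2].values())
                                          for s2, p in T[s][a].items())
                 for a in A}
             for s in S}
    return Q

Q = q_iteration()
policy = {s: max(A, key=lambda a: Q[s][a]) for s in S}

# The original process acts via the aggregation: given a raw history,
# the agent plays policy[phi(history)].
print(policy)
```

The paper's point is stronger than this sketch suggests: even when the reduced process over `phi(history)` is *not* Markov, the Q-values and optimal policy of an associated MDP of the same state-space size still solve the original problem, provided the solution is approximately representable as a function of the aggregated states.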

dc.contributor.authorHutter, Marcus
dc.date.accessioned2015-08-12T05:59:24Z
dc.date.available2015-08-12T05:59:24Z
dc.identifier.isbn978-3-319-11661-7
dc.identifier.issn0302-9743
dc.identifier.urihttp://hdl.handle.net/1885/14699
dc.description.abstractWe consider a Reinforcement Learning setup without any (esp. MDP) assumptions on the environment. State aggregation and more generally feature reinforcement learning is concerned with mapping histories/raw-states to reduced/aggregated states. The idea behind both is that the resulting reduced process (approximately) forms a small stationary finite-state MDP, which can then be efficiently solved or learnt. We considerably generalize existing aggregation results by showing that even if the reduced process is not an MDP, the (q-)value functions and (optimal) policies of an associated MDP with same state-space size solve the original problem, as long as the solution can approximately be represented as a function of the reduced states. This implies an upper bound on the required state space size that holds uniformly for all RL problems. It may also explain why RL algorithms designed for MDPs sometimes perform well beyond MDPs.
dc.publisherSpringer Verlag
dc.relation.ispartofAlgorithmic Learning Theory: 25th International Conference, ALT 2014, Bled, Slovenia, October 8-10, 2014. Proceedings
dc.rights© 2014 Springer International Publishing Switzerland
dc.titleExtreme State Aggregation beyond MDPs
dc.typeConference paper
local.identifier.citationvolume8776
dc.date.issued2014-10
local.publisher.urlhttp://link.springer.com/
local.type.statusAccepted Version
local.contributor.affiliationHutter, M., Research School of Computer Science, The Australian National University
dc.relationhttp://purl.org/au-research/grants/arc/DP120100950
local.bibliographicCitation.startpage185
local.bibliographicCitation.lastpage199
local.identifier.doi10.1007/978-3-319-11662-4_14
dcterms.accessRightsOpen Access
dc.provenancehttp://www.sherpa.ac.uk/romeo/issn/0302-9743/..."Author's post-print on any open access repository after 12 months after publication" from SHERPA/RoMEO site (as at 12/08/15)
CollectionsANU Research Publications

Download

File: Hutter Extreme State Aggregation 2014.pdf (204.89 kB, Adobe PDF)


Items in Open Research are protected by copyright, with all rights reserved, unless otherwise indicated.

Updated:  19 May 2020/ Responsible Officer:  University Librarian/ Page Contact:  Library Systems & Web Coordinator